Google's Software Engineering: Project Management

· 3 min read

20% Time

Engineers are allowed to spend 20% of their work time on any project they want to contribute to, without needing approval from their managers or others. This is highly valuable because:

  1. Any good idea, no matter how bad it sounds at first, gets enough time to be developed to a demo-ready state.
  2. It gives managers visibility into work they might not otherwise notice; without it, engineers might run "skunkworks" projects in secret.
  3. It lets engineers work on projects they find interesting, which prevents burnout and keeps them motivated. The output gap between motivated engineers and burnt-out engineers far exceeds 20%.
  4. It encourages innovation; if others around you are working on 20% projects, you will be inspired to do the same.

OKRs

Individuals and teams must publicly document their objectives and how they measure them.

  • Objectives
    • Set quarterly and annual goals.
    • Individual and team goals should align with the larger group’s goals.
  • Key Results: measurable results that quantify progress toward an objective, each scored from 0 to 1.
  • Set OKRs high; generally, scoring around 0.65 is a good target. If your scores are often well below this, your goals may be set too high; if well above, they may be too low.
  • Benefits
    • Everyone knows what others are working on, fostering mutual motivation.
    • Provides purpose to execution, making it easier to achieve goals.
  • OKRs are not directly related to performance evaluations.
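
As a rough illustration of the scoring rule above, the sketch below averages key-result scores into an objective score and checks a history of scores against the 0.65 target. The function names, the plain averaging, and the 0.1 tolerance band are my own assumptions for illustration, not something Google prescribes.

```python
from statistics import mean


def objective_score(key_results: dict[str, float]) -> float:
    """Average the 0-to-1 scores of an objective's key results (assumed rule)."""
    for name, score in key_results.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name}: score {score} is outside [0, 1]")
    return mean(list(key_results.values()))


def calibration_hint(scores: list[float], target: float = 0.65,
                     tolerance: float = 0.1) -> str:
    """Compare the average score with the target; the tolerance is hypothetical."""
    average = mean(scores)
    if average < target - tolerance:
        return "goals may be too high"
    if average > target + tolerance:
        return "goals may be too low"
    return "goals look well calibrated"
```

For example, `calibration_hint([0.9, 1.0])` suggests the goals are too low, which matches the idea that consistently hitting 1.0 means the bar was not ambitious enough.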

Should the Project Continue or Be Terminated?

While the review process for major new releases is systematic, there is no definitive answer to whether a project should continue; some decisions are bottom-up, while others are top-down.

Reorganization

Splitting and merging teams is common, seemingly to optimize efficiency.

My Evaluation

The results of 20% time are positive, having incubated significant projects like Gmail and AdSense. In a competitive environment, encouraging talented engineers to spend time on new initiatives is highly beneficial. Promoting 20% time is also a unique strategy to attract talent when the company is small and needs to offer excellent benefits. I tend to view 20% time as a management style rather than a guaranteed path to success.

The distinction between OKRs and performance evaluations is crucial: it separates vision from execution, and goal management from performance management. "Did you reach the destination?" and "Is the car you drove a good one?" are two different questions. Similarly, whether a product sells poorly and whether the engineers built a good product are two separate issues.

For regular engineers in a large company, maintaining good relationships with other teams, including those unrelated to your specific work, is important because it increases demand for you in the labor market. This way, in the event of a reorganization or other adverse events, you will have more options.

Google Software Engineering: Software Development

· 6 min read

It is widely recognized that Google is a company with exceptional engineering capabilities. What are its best engineering practices? What insights can we gain from them? What aspects have drawn criticism? We will discuss these details gradually, with this article primarily focusing on development.

Codebase

  • As of 2015, there are 2 billion lines of code in a small number of monorepos, with the vast majority of the code visible to everyone. Google encourages engineers to make changes when they see issues; as long as all reviewers approve, the changes can be integrated.
  • Almost all development occurs at the head of the codebase, rather than on branches, to avoid issues during merging and to facilitate safe fixes.
  • Every change triggers tests, and any errors are reported to the author and reviewers within minutes.
  • Each subtree of the codebase has at least two owners; other developers can submit modifications, but approval from the owners is required for integration.
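
The ownership rule above can be sketched as a small approval check. This is only an illustration under my own assumptions: the `owners_by_dir` mapping, the function names, and the "nearest directory with owners wins" rule are hypothetical, and a real system might also accept approval from owners of parent directories.

```python
from pathlib import PurePosixPath


def owners_for(path: str, owners_by_dir: dict[str, set[str]]) -> set[str]:
    """Return the owners of the nearest enclosing directory that declares any."""
    for parent in PurePosixPath(path).parents:
        owners = owners_by_dir.get(str(parent))
        if owners:
            return owners
    return set()


def can_submit(changed_paths: list[str], approvers: set[str],
               owners_by_dir: dict[str, set[str]]) -> bool:
    """A change is submittable only if every touched file has an owner approval."""
    return all(owners_for(path, owners_by_dir) & approvers
               for path in changed_paths)


# Hypothetical ownership data: each subtree lists at least two owners.
OWNERS = {
    "search": {"alice", "bob"},
    "search/index": {"carol", "dave"},
}
```

With this data, a change to `search/ui/page.py` approved by `bob` can be submitted, while a change to `search/index/build.py` still needs `carol` or `dave`.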

Build System

  • The distributed build system Blaze (open-sourced as Bazel) makes compilation, linking, and testing easy and fast.
  • Builds are distributed across hundreds or thousands of machines.
  • Reliable. Deterministic input dependencies lead to predictable outputs, avoiding strange, flaky results.
  • Fast. Once a build result is cached, dependent builds will directly use the cache without needing to recompile. Only the changed parts are rebuilt.
  • Pre-submit checks. Some quick tests can be executed before submission.
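
To make the caching idea concrete, here is a toy content-addressed build cache. It is a sketch of the general technique, not how Blaze or Bazel is implemented; `BuildCache`, `compile_lib`, and `link_app` are names I made up for illustration.

```python
import hashlib


class BuildCache:
    """Toy cache: a target is rebuilt only when its inputs change."""

    def __init__(self):
        self._outputs = {}  # cache key -> build output
        self.rebuilds = 0   # number of times an action actually ran

    def build(self, target, sources, dep_outputs, action):
        # The key covers the target name, source contents, and dependency outputs.
        digest = hashlib.sha256(target.encode())
        for name in sorted(sources):
            digest.update(b"\0" + name.encode() + b"\0" + sources[name].encode())
        for output in dep_outputs:
            digest.update(b"\0" + output.encode())
        key = digest.hexdigest()
        if key not in self._outputs:
            self.rebuilds += 1
            self._outputs[key] = action(sources, dep_outputs)
        return self._outputs[key]


def compile_lib(sources, dep_outputs):
    # Hypothetical "compile" step: deterministic function of the sources.
    return "obj(" + "+".join(sources[name] for name in sorted(sources)) + ")"


def link_app(sources, dep_outputs):
    # Hypothetical "link" step: deterministic function of dependency outputs.
    return "bin(" + "+".join(dep_outputs) + ")"


cache = BuildCache()
for _ in range(2):  # the second pass is served entirely from the cache
    lib = cache.build("lib", {"lib.c": "v1"}, [], compile_lib)
    app = cache.build("app", {}, [lib], link_app)
print(cache.rebuilds)  # prints 2
```

Because the key includes dependency outputs, editing `lib.c` changes the library's output and therefore invalidates the app as well, while untouched targets keep hitting the cache. This only works if actions are deterministic, which is why deterministic inputs matter so much.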

Code Review

  • There are code review tools in place.
  • All changes must undergo review.
  • After discovering a bug, you can point out the issue on the review of the change that introduced it, and the relevant people will be notified via email.
  • Experimental code does not require mandatory review, but code in production must be reviewed.
  • Each change is encouraged to be as small as possible. Fewer than 100 lines is "small," fewer than 300 lines is "medium," fewer than 1000 lines is "large," and more than 1000 lines is "extremely large."
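
The size buckets above translate directly into a small helper, for example for a pre-review warning. The function name is hypothetical, and treating exactly 1000 lines as "extremely large" is my assumption, since the thresholds above leave that boundary open.

```python
def classify_change(lines_changed: int) -> str:
    """Map the number of changed lines to the size buckets described above."""
    if lines_changed < 0:
        raise ValueError("lines_changed must be non-negative")
    if lines_changed < 100:
        return "small"
    if lines_changed < 300:
        return "medium"
    if lines_changed < 1000:
        return "large"
    return "extremely large"  # assumption: exactly 1000 lines lands here too
```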

Testing

  • Unit tests
  • Integration tests, regression tests
  • Pre-submit checks
  • Automatically generated test coverage reports
  • Conduct stress tests before deployment, generating relevant key metrics, especially latency and error rates as load varies.
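
As a sketch of what "key metrics as load varies" could look like, the snippet below summarizes latency percentiles and error rate per load level. The input shape (requests per second mapped to `(latency_ms, succeeded)` samples) and the function names are assumptions made for this example.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list, with pct in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize_load_test(results):
    """results maps requests-per-second to a list of (latency_ms, succeeded)."""
    summary = {}
    for rps, samples in sorted(results.items()):
        latencies = [latency for latency, _ok in samples]
        failures = sum(1 for _latency, ok in samples if not ok)
        summary[rps] = {
            "p50_ms": percentile(latencies, 50),
            "p99_ms": percentile(latencies, 99),
            "error_rate": failures / len(samples),
        }
    return summary
```

Reading the summary from the lowest load level upward shows where latency or error rate starts to climb, which is usually the number you want to know before deployment.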

Bug Tracking Tools

Bugs, feature requests, customer issues, processes, etc., are all recorded and need to be regularly triaged to confirm priorities and then assigned to the appropriate engineers.

Programming Languages

  • There are five official languages (C++, Java, Python, Go, and JavaScript); limiting the set facilitates code reuse and collaborative development. Each language has a style guide.
  • Engineers undergo training in code readability.
  • Domain-specific languages (DSLs) are also unavoidable in certain contexts.
  • Data exchange between code written in these languages primarily occurs through Protocol Buffers.
  • A common workflow is essential, regardless of the language used.

Debugging and Analysis Tools

  • When a server crashes, the crash information is automatically recorded.
  • Memory leak reports are accompanied by a dump of the current heap objects.
  • There are numerous web tools to help you monitor RPC requests, change settings, inspect resource consumption, etc.

Release

  • Most release work is performed by regular engineers.
  • Timely releases are crucial: a fast release cadence greatly motivates engineers and lets them receive feedback more quickly.
  • A typical release process includes:
    1. Finding the latest stable build, creating a release branch, possibly cherry-picking some minor changes.
    2. Running tests, building, and packaging.
    3. Deploying to a staging server for internal testing, where you can shadow online traffic to check for issues.
    4. Releasing to a canary environment to handle a small amount of traffic for public testing.
    5. Gradually releasing to all users.
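
The staged release above can be sketched as a loop that only widens traffic while each stage stays healthy. The stage names beyond "staging" and "canary", the traffic fractions, and the 1% error threshold are all assumptions for illustration; `error_rate_at` stands in for whatever monitoring you have.

```python
from typing import Callable, Optional

# (stage name, fraction of user traffic); the fractions are hypothetical.
ROLLOUT_STAGES = [
    ("staging", 0.0),   # internal testing only, may shadow live traffic
    ("canary", 0.01),   # small share of real traffic
    ("partial", 0.25),
    ("full", 1.0),
]


def roll_out(error_rate_at: Callable[[str], float],
             max_error_rate: float = 0.01) -> tuple[Optional[str], float]:
    """Walk the stages in order and halt at the first unhealthy one.

    Returns (halted_stage, traffic_fraction), where halted_stage is None
    when every stage passed and traffic_fraction is the last healthy level.
    """
    reached = 0.0
    for name, fraction in ROLLOUT_STAGES:
        if error_rate_at(name) > max_error_rate:
            return name, reached
        reached = fraction
    return None, reached
```

For instance, if the error rate spikes in the "partial" stage, the function reports that stage and leaves traffic at the canary level instead of continuing to all users.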

Review of Releases

User-visible or significant releases must undergo reviews related to legal, privacy, security, reliability, and business requirements, ensuring relevant personnel are notified. There are dedicated tools to assist with this process.

Postmortem Reports

After a significant outage incident, the responsible parties must write a postmortem report, which includes:

  1. Incident title
  2. Summary
  3. Impact: duration, affected traffic, and profit loss
  4. Timeline: documenting the occurrence, diagnosis, and resolution
  5. Root causes
  6. What went well and what did not: what lessons can help others find and resolve issues more quickly and accurately next time?
  7. Next actionable items: what can be done to prevent similar incidents in the future?

Focus on the issue, not the person; the key here is to understand the problem itself and how to avoid similar issues in the future.
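
If you want to enforce this outline in tooling, one option is a simple record type that flags empty sections. This is a minimal sketch; the class and field names are my own and simply mirror the list above.

```python
from dataclasses import dataclass, fields


@dataclass
class Postmortem:
    """One field per section of the postmortem outline above."""
    title: str
    summary: str
    impact: str                  # duration, affected traffic, profit loss
    timeline: list[str]          # occurrence, diagnosis, resolution
    root_causes: list[str]
    what_went_well: list[str]
    what_went_poorly: list[str]
    action_items: list[str]      # how to prevent similar incidents

    def missing_sections(self) -> list[str]:
        """Names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

Note that there is deliberately no "who is to blame" field, in keeping with focusing on the issue rather than the person.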

Code Rewrite

Large software systems are often rewritten every few years. The downside is the high cost, but the benefits include:

  1. Maintaining agility. Markets change, software requirements evolve, and code must adapt accordingly.
  2. Reducing complexity.
  3. Transferring knowledge to newcomers, giving them a sense of ownership.
  4. Enhancing engineer mobility and promoting cross-domain innovation.
  5. Adopting the latest technology stacks and methodologies.

My Comments

Google's single codebase and powerful build system are not easily replicable by small companies, as they lack the resources and capabilities to make their build systems as fast and agile. Staying small, simple, and fast allows small companies to operate more smoothly and focus more on core business logic.

Build systems are often customized, and your knowledge may not transfer or scale. A powerful build system can even be detrimental to newcomers, as it raises the cost of gaining a holistic view.

The inability to transfer and scale knowledge is also an issue with well-developed in-house tools. Throughout my career, I have tried to avoid non-open-source internal tools such as Uber's Schemaless, which are tailored to specific scenarios and not released for broader use. In contrast, Kafka, which LinkedIn open-sourced, is a good product whose knowledge is open and transferable.

In the open market, there are excellent tools available for the entire development, testing, integration, and release process. For example, in the JS community:

| Process | Tools |
| --- | --- |
| Codebase | GitHub, GitLab, Bitbucket, gitolite |
| Code Review | GitHub Pull Requests, Phabricator |
| Pre-submit checks, testing, and linting | husky, ava, istanbul, eslint, prettier |
| Bug Tracking | GitHub Issues, Phabricator |
| Testing and Continuous Integration | CircleCI, Travis CI, TeamCity |
| Deployment | Online service deployment with Heroku or Netlify, mobile app deployment with Fastlane, library publication with npm |

Finally, one insight I would offer: companies that do not invest in automating these engineering processes will lose a significant competitive advantage. I even built OneFx, a full-stack JS development framework, to encourage good engineering practices. The gap between fast and slow, or between high quality and low quality, is often exponential, because speed compounds into doing more, sooner, while poor quality compounds into doing less, and worse.

Charles Handy: The Second Curve

· 2 min read

When you know where you should go, it is too late to go there; if you always keep your original path, you will miss the road to the future.

Charles Handy uses the road to Davy's Bar as an analogy: you should turn right and go up the hill when you are still half a mile from Davy's Bar. However, by the time he realized he was on the wrong road, he had already arrived at Davy's Bar.

Growth curves are usually S-shaped; we call this shape the S-curve or sigmoid curve. To keep the overall growth rate high, you have to start developing your second S-curve while you still have the time and resources to invest.
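
To see why timing matters, here is a small numerical sketch that models an S-curve with the logistic function. Using the logistic form and these parameter names is my assumption; Handy describes the curve qualitatively rather than with a formula.

```python
import math


def s_curve(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Logistic S-curve: slow start, fast middle, plateau near the ceiling."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def growth_rate(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Slope of the logistic curve; it peaks at the midpoint, then decays."""
    y = s_curve(t, ceiling, rate, midpoint)
    return rate * y * (1.0 - y / ceiling)


def combined_growth(t, second_midpoint):
    """Growth when a second curve, centered later, is stacked on the first."""
    return growth_rate(t) + growth_rate(t, midpoint=second_midpoint)
```

Growth on a single curve is already slowing once you pass the midpoint, which is typically when things still feel great. A second curve started early enough keeps `combined_growth` high while the first one flattens.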

Intel's CPUs, Netflix's video streaming, Nintendo's gaming, and Microsoft's cloud are all excellent examples of second-curve-driven businesses.

Finding and catching the second curve takes vision and execution. You have to take in more information and continuously sort through it to identify the best opportunities. Then, once an opportunity is identified, you need a reliable team to fight the battle and find out whether it really works.

What makes you succeed may not make you succeed again. There is always a limit to growth. The second curve theory helps us reflect on why and how to embrace the change and live a more thriving life.

Charles Handy: The Second Curve

· 2 min read

When you know where to go, it is often too late; if you always stick to the original path, you will miss the road to the future.

Charles Handy illustrates this with the analogy of "Davy's Bar": on the way to "Davy's Bar," you should turn right up the hill when you are half a mile away. However, by the time he realized he was going the wrong way, he had already arrived at "Davy's Bar."

Growth curves are typically S-shaped, which we refer to as the S curve. To keep the growth rate consistently high, you must invest time and resources to develop a second S curve while there is still time.

Intel's CPUs, Netflix's video streaming, Nintendo's games, and Microsoft's cloud services are all excellent examples of businesses driven by this second curve.

How can you discover and seize the second curve? You need to input more information, discern good from bad, and identify opportunities. Then, once the opportunity arises, having a strong team to tackle the hard work is essential to determine whether you have truly found the second curve.

The reasons that made you successful in the past may not lead to future success; growth always has its limits. The second curve theory helps us reflect on why and how to embrace change for a better life.

How to Motivate Employees?

· 2 min read

Motivation and incentives are at the core of performance management. Without motivation, employees lack the drive to perform well, making all feedback and training efforts futile.

Respect from Leaders Is Correlated with Employee Motivation

Offensive behavior can directly undermine employee motivation and performance, so managers need to curb such behavior by:

  1. Leading by example.
  2. Upholding employees' dignity. Public praise, private criticism.
  3. Hiring respectful employees and not tolerating bad behavior. Give feedback on problems promptly.

Incentives Primarily Come from Two Aspects: Extrinsic and Intrinsic

  1. Extrinsic rewards—money (promotions, raises, bonuses)

    1. These rewards do not necessarily enhance employee performance.
    2. Their effects are usually short-lived.
    3. It is often difficult to distinguish individual contributions within a team, and what constitutes an appropriate reward varies for everyone. In fact, most employees' primary concern is fairness; when providing monetary rewards, it is crucial to ensure fairness and consistency.
  2. Intrinsic rewards—satisfaction (a sense of achievement, control, appreciation, intellectual growth, skill enhancement, autonomy, and overcoming challenges)

    1. It is essential to note that these rewards should be tailored to the individual.

How to Provide Intrinsic Rewards?

  1. Recognize their work. "The key to recognition is making people feel unique." If everyone receives the same recognition, no one will feel special.

    1. Different individuals value recognition sources differently. From colleagues? Publicly praise them in front of peers. From clients? Share a thank-you note from a client. From the profession? Award professional accolades. From the boss? Describe their importance to the team vividly during one-on-ones.
    2. Tailor recognition to personality. Introverted or extroverted? Public or private? If unsure, ask them directly.
    3. Recognition frequency should be high, at least once every two weeks.
    4. Handwritten notes are low-cost but highly effective rewards.
  2. Provide decision-making authority.

    1. People enjoy having a sense of ownership and control.
  3. Offer challenges.

    1. The greater the challenge, the higher the sense of achievement upon completion.
    2. Provide opportunities to undertake tasks they haven't done before, helping them develop new skills. Note that they should have relevant talents and skills, rather than starting from scratch.

Task-Related Maturity

· One min read

Andy Grove emphasizes: ==The most important responsibility of a manager is to inspire their subordinates to perform at their best==.

Unfortunately, there is no single management style that fits everyone in all situations. The fundamental variable in finding the best management style is the task-related maturity (TRM) of the subordinates.

| Subordinate's Work Maturity | Effective Leadership Style |
| --- | --- |
| Low | Organized; task-oriented; detail-focused; spells out exactly "when - what - how" |
| Medium | People-oriented; provides support; "two-way communication" model |
| High | Goal-oriented; "monitoring" model |

A person's task-related maturity depends on the specific work project, and its improvement takes time. When task-related maturity reaches its highest level, the individual's knowledge and motivation will also reach a certain height, allowing their manager to successfully delegate work to them.

The key takeaway is: ==There is no good or bad management style; there is only effective and ineffective==.

Task-Relevant Maturity

· One min read

Andy Grove emphasizes that ==a manager's most important responsibility is to elicit top performance from his subordinates==.

Unfortunately, no single management style fits all people in all scenarios. A fundamental variable in finding the best management style is the task-relevant maturity (TRM) of the subordinates.

| TRM | Effective Management Style |
| --- | --- |
| Low | Structured; task-oriented; detail-oriented; instructs exactly "what/when/how" |
| Medium | Individual-oriented; supportive; "mutual reasoning" mode |
| High | Goal-oriented; monitoring mode |

A person's TRM depends on the specific work item, and it takes time to improve. When TRM reaches the highest level, both the person's knowledge and motivation are ready for the manager to delegate work.

The key here is to regard any management mode not as either good or bad but rather as effective or not effective.

Time Management for System Administrators: Fundamental Principles

· 2 min read

Learning time management from system administrators (SAs) is an inspiring experience, as we all face the same challenges—endless interruptions, concurrent projects, and sudden demands.

Moreover, system administrators must deal with these issues even more frequently, as Thomas Limoncelli puts it:

For system administrators, your boss evaluates you based on whether you complete projects, while your clients only care about whether you can meet their demands on time.

Here are the time management rules for SAs.

  • Interruptions are the biggest enemy of productivity.

    • Set up an "interruption shield" rotation with colleagues so that only one person handles interruptions at a time.
    • Set aside large blocks of dedicated time for projects.
    • Close the office door (of course, if you are a manager, don’t do this).
    • Have junior engineers sit outside your office to filter out 80% of the distractions.
  • Consolidate all time management information in one place.

  • Save mental energy for important tasks.

  • Don’t constantly think about how to manage time; instead, develop routines, habits, and mantras.

    • Routines are a series of predefined steps that occur within a specific timeframe.
    • Habits are actions that people can perform without thinking.
    • Mantras are simple rules of thumb.
  • Maintain focus during projects; this requires good self-discipline.

    • Self-discipline enhances self-esteem. Self-esteem is like poker chips. When we have higher self-esteem, we tend to place higher bets to win bigger rewards.
  • Use the same tools to manage your social life.

From Good to Great

· One min read

Leading a company from good to great is equivalent to driving a massive flywheel to achieve ==breakthroughs==

  1. Disciplined and well-trained people
    1. Level 5 Leadership: Great leaders > Effective leaders > Competent managers > Contributing team members > Capable individuals
    2. First Who, Then What
  2. Disciplined and well-trained thoughts
    1. Confront the brutal facts
    2. Be a hedgehog first, then a fox
  3. Disciplined and well-trained actions
    1. A culture of discipline
    2. Technology accelerates the engine of growth

Good to great

· One min read

Leading a company to leap from good to great = pushing a giant flywheel to a ==breakthrough== with

  1. Disciplined People
    1. Level 5 leadership: executive > effective leader > competent manager > contributing team member > highly capable individual
    2. First who then what
  2. Disciplined Thought
    1. Confront the brutal facts
    2. Be a fox after being a hedgehog
  3. Disciplined Action
    1. Culture of discipline
    2. Technology accelerators