33 posts tagged with "management"

Google Software Engineering: Software Development

· 6 min read

It is widely recognized that Google is a company with exceptional engineering capabilities. What are its best engineering practices? What insights can we gain from them? What aspects have drawn criticism? We will discuss these details gradually, with this article primarily focusing on development.

Codebase

  • As of 2015, Google had about 2 billion lines of code in a small number of codebases, chiefly a single monorepo, with the vast majority of the code visible to everyone. Engineers are encouraged to fix issues they notice; once all reviewers approve, the change can be integrated.
  • Almost all development occurs at the head of the codebase, rather than on branches, to avoid issues during merging and to facilitate safe fixes.
  • Every change triggers tests, and any errors are reported to the author and reviewers within minutes.
  • Each subtree of the codebase has at least two owners; other developers can submit modifications, but approval from the owners is required for integration.

Build System

  • The distributed build system (Blaze internally, open-sourced as Bazel) makes compilation, linking, and testing easy and fast.
  • Builds are spread across hundreds or thousands of machines.
  • Reliable. Deterministic input dependencies produce predictable outputs, avoiding strange, hard-to-reproduce fluctuations.
  • Fast. Once a build result is cached, dependent builds will directly use the cache without needing to recompile. Only the changed parts are rebuilt.
  • Pre-submit checks. Some quick tests can be executed before submission.
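The caching behaviour described above can be illustrated with a small sketch. This is not how Google's build system is implemented; it is a minimal Python model, assuming a build step is a pure function of its command and input files. The names `action_key`, `BuildCache`, and `fake_compile` are hypothetical.

```python
import hashlib


def action_key(command, input_digests):
    """Derive a cache key from the command and the digests of its inputs only."""
    h = hashlib.sha256()
    h.update(command.encode() + b"\0")
    for name in sorted(input_digests):  # sorted, so the key is order-independent
        h.update(name.encode() + b"\0" + input_digests[name].encode() + b"\0")
    return h.hexdigest()


class BuildCache:
    """Toy cache: a step reruns only when its command or inputs change."""

    def __init__(self):
        self.store = {}
        self.executions = 0  # how many times a step actually ran

    def build(self, command, inputs, run):
        digests = {n: hashlib.sha256(data).hexdigest() for n, data in inputs.items()}
        key = action_key(command, digests)
        if key not in self.store:  # cache miss: execute and remember the result
            self.executions += 1
            self.store[key] = run(inputs)
        return self.store[key]


def fake_compile(inputs):
    """Stand-in for a compiler: deterministic output from the inputs."""
    return b"".join(inputs[n] for n in sorted(inputs)).upper()


cache = BuildCache()
out1 = cache.build("cc", {"a.c": b"int a;"}, fake_compile)
out2 = cache.build("cc", {"a.c": b"int a;"}, fake_compile)  # unchanged: cache hit
out3 = cache.build("cc", {"a.c": b"int b;"}, fake_compile)  # changed: rebuilt
```

The key depends only on declared inputs, which is why determinism matters: if a step read undeclared state, a cache hit could return a stale result.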

Code Review

  • There are code review tools in place.
  • All changes must undergo review.
  • After discovering a bug, you can point out the issue in the previous review, and relevant personnel will be notified via email.
  • Experimental code does not require mandatory review, but code in production must be reviewed.
  • Each change is encouraged to be as small as possible. Fewer than 100 lines is "small," fewer than 300 lines is "medium," fewer than 1000 lines is "large," and more than 1000 lines is "extremely large."
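The size buckets above are easy to encode, for instance in a pre-submit hook that warns on oversized changes. A minimal sketch in Python; the function name is hypothetical, and treating exactly 1000 lines as "extremely large" is an assumption, since the text leaves that boundary open.

```python
def classify_change_size(lines_changed: int) -> str:
    """Map a changed-line count to the size buckets listed above."""
    if lines_changed < 0:
        raise ValueError("lines_changed must be non-negative")
    if lines_changed < 100:
        return "small"
    if lines_changed < 300:
        return "medium"
    if lines_changed < 1000:
        return "large"
    return "extremely large"  # assumption: exactly 1000 lines lands here
```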

Testing

  • Unit tests
  • Integration tests, regression tests
  • Pre-submit checks
  • Automatically generated test coverage reports
  • Conduct stress tests before deployment, generating relevant key metrics, especially latency and error rates as load varies.

Bug Tracking Tools

Bugs, feature requests, customer issues, processes, etc., are all recorded and need to be regularly triaged to confirm priorities and then assigned to the appropriate engineers.

Programming Languages

  • There are five official languages: C++, Java, Python, Go, JavaScript, to facilitate code reuse and collaborative development. Each language has a style guide.
  • Engineers undergo training in code readability.
  • Domain-specific languages (DSLs) are also unavoidable in certain contexts.
  • Data interaction between these languages primarily occurs through protocol buffers.
  • A common workflow is essential, regardless of the language used.

Debugging and Analysis Tools

  • When a server crashes, the crash information is automatically recorded.
  • Memory-leak reports come with a dump of the current heap objects.
  • There are numerous web tools to help you monitor RPC requests, change settings, inspect resource consumption, etc.

Release

  • Most release work is performed by regular engineers.
  • Timely releases are crucial: a fast release cadence motivates engineers and gets them feedback sooner.
  • A typical release process includes:
    1. Finding the latest stable build, creating a release branch, possibly cherry-picking some minor changes.
    2. Running tests, building, and packaging.
    3. Deploying to a staging server for internal testing, where you can shadow online traffic to check for issues.
    4. Releasing to a canary environment to handle a small amount of traffic for public testing.
    5. Gradually releasing to all users.
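Steps 4 and 5 above (canary, then gradual release) are commonly implemented as a percentage-based rollout. The following Python sketch shows one way to do that, assuming users are assigned to stable buckets by hashing their ID. It is not a description of Google's release tooling, and `bucket` and `sees_new_release` are hypothetical names.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) by hashing the ID."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def sees_new_release(user_id: str, rollout_percent: int) -> bool:
    """True if this user falls inside the current rollout percentage."""
    return bucket(user_id) < rollout_percent


# Hypothetical schedule: canary at 1%, then wider stages up to everyone.
users = [f"user-{i}" for i in range(1000)]
canary = [u for u in users if sees_new_release(u, 1)]
quarter = [u for u in users if sees_new_release(u, 25)]
```

Because the bucket is a pure function of the user ID, raising the percentage only adds users: anyone in the canary stays on the new release in every later stage.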

Review of Releases

User-visible or significant releases must undergo reviews related to legal, privacy, security, reliability, and business requirements, ensuring relevant personnel are notified. There are dedicated tools to assist with this process.

Postmortem Reports

After a significant outage incident, the responsible parties must write a postmortem report, which includes:

  1. Incident title
  2. Summary
  3. Impact: duration, affected traffic, and profit loss
  4. Timeline: documenting the occurrence, diagnosis, and resolution
  5. Root causes
  6. What went well and what did not: what lessons can help others find and resolve issues more quickly and accurately next time?
  7. Next actionable items: what can be done to prevent similar incidents in the future?

Focus on the issue, not the person; the key here is to understand the problem itself and how to avoid similar issues in the future.

Code Rewrite

Large software systems are often rewritten every few years. The downside is the high cost, but the benefits include:

  1. Maintaining agility. Markets change, software requirements evolve, and code must adapt accordingly.
  2. Reducing complexity.
  3. Transferring knowledge to newcomers, giving them a sense of ownership.
  4. Enhancing engineer mobility and promoting cross-domain innovation.
  5. Adopting the latest technology stacks and methodologies.

My Comments

Google's single codebase and powerful build system are not easily replicable by small companies, as they lack the resources and capabilities to make their build systems as fast and agile. Staying small, simple, and fast allows small companies to operate more smoothly and focus more on core business logic.

Build systems are often customized, and your knowledge may not transfer or scale. A powerful build system can even be detrimental to newcomers, as it raises the cost of gaining a holistic view.

The inability to transfer and scale knowledge is also an issue with well-developed in-house tools. Throughout my career, I have tried to avoid non-open-source internal tools, such as Uber's Schemaless, which are tailored to specific scenarios and not available for broader use. In contrast, Kafka, which originated at LinkedIn and is open source, is a good product, and knowledge of it transfers and scales.

In the open market, there are excellent tools available for the entire development, testing, integration, and release process. For example, in the JS community:

| Process | Tools |
| --- | --- |
| Codebase | GitHub, GitLab, Bitbucket, gitolite |
| Code Review | GitHub Pull Requests, Phabricator |
| Pre-submit checks, testing, and linting | husky, ava, istanbul, eslint, prettier |
| Bug Tracking | GitHub Issues, Phabricator |
| Testing and Continuous Integration | CircleCI, TravisCI, TeamCity |
| Deployment | Online service deployment with Heroku or Netlify; mobile app deployment with Fastlane; library publication with npm |

One closing insight: companies that neglect the automation of these engineering processes will lose significant competitive advantage. I even built OneFx, a full-stack JS development framework, to encourage good engineering practices. The gap between fast and slow, or between high and low quality, tends to compound: speed lets you do more and get faster still, while poor quality leads to less output and worse outcomes.

Charles Handy: The Second Curve

· 2 min read

By the time you know where you should go, it is too late to go there; if you always keep to your original path, you will miss the road to the future.

Charles Handy illustrates this with his road to Davy's Bar. The directions were to turn right and go up the hill half a mile before Davy's Bar. But by the time he realized he had missed the turn, he had already arrived at Davy's Bar.

A growth curve is usually S-shaped, which is why it is called an S-curve or sigmoid curve. To keep the overall growth rate high, you have to start developing your second S-curve while you still have the time and resources to invest.

Intel's CPUs, Netflix's video streaming, Nintendo's gaming, and Microsoft's cloud are all excellent examples of businesses driven by a second curve.

Finding and catching the second curve takes vision and execution. You have to take in more information and continuously sort through it to identify the best opportunities. Then, once an opportunity is identified, you need a reliable team to fight the battle and figure out whether it really works.

What makes you succeed may not make you succeed again. There is always a limit to growth. The second curve theory helps us reflect on why and how to embrace the change and live a more thriving life.

Charles Handy: The Second Curve

· 2 min read

When you know where to go, it is often too late; if you always stick to the original path, you will miss the road to the future.

Charles Handy illustrates this with the analogy of "Davy's Bar": on the way to "Davy's Bar," you should turn right up the hill when you are half a mile away. However, by the time he realized he was going the wrong way, he had already arrived at "Davy's Bar."

Growth curves are typically S-shaped, which we refer to as the S curve. To keep the growth rate consistently high, you must invest time and resources to develop a second S curve while there is still time.

Intel's CPUs, Netflix's video streaming, Nintendo's games, and Microsoft's cloud services are all excellent examples of businesses driven by this second curve.

How can you discover and seize the second curve? You need to input more information, discern good from bad, and identify opportunities. Then, once the opportunity arises, having a strong team to tackle the hard work is essential to determine whether you have truly found the second curve.

The reasons that made you successful in the past may not lead to future success; growth always has its limits. The second curve theory helps us reflect on why and how to embrace change for a better life.

How to Motivate Employees?

· 2 min read

Motivation and incentives are at the core of performance management. Without motivation, employees lack the drive to perform well, making all feedback and training efforts futile.

Respect from Leaders Correlates with Employee Motivation

Offensive behavior can directly undermine employee motivation and performance, so managers need to curb such behavior by:

  1. Leading by example.
  2. Upholding employees' dignity. Public praise, private criticism.
  3. Hiring respectful employees and not tolerating bad behavior. Address feedback issues promptly.

Incentives Primarily Come from Two Aspects: Extrinsic and Intrinsic

  1. Extrinsic rewards—money (promotions, raises, bonuses)

    1. These rewards do not necessarily enhance employee performance.
    2. Their effects are usually short-lived.
    3. It is often difficult to distinguish individual contributions within a team, and what constitutes an appropriate reward varies for everyone. In fact, most employees' primary concern is fairness; when providing monetary rewards, it is crucial to ensure fairness and consistency.
  2. Intrinsic rewards—satisfaction (a sense of achievement, control, appreciation, intellectual growth, skill enhancement, autonomy, and overcoming challenges)

    1. It is essential to note that these rewards should be tailored to the individual.

How to Provide Intrinsic Rewards?

  1. Recognize their work. "The key to recognition is making people feel unique." If everyone receives the same recognition, no one will feel special.

    1. Different individuals value recognition sources differently. From colleagues? Publicly praise them in front of peers. From clients? Share a thank-you note from a client. From the profession? Award professional accolades. From the boss? Describe their importance to the team vividly during one-on-ones.
    2. Tailor recognition to personality. Introverted or extroverted? Public or private? If unsure, ask them directly.
    3. Recognition frequency should be high, at least once every two weeks.
    4. Handwritten notes are low-cost but highly effective rewards.
  2. Provide decision-making authority.

    1. People enjoy having a sense of ownership and control.
  3. Offer challenges.

    1. The greater the challenge, the higher the sense of achievement upon completion.
    2. Provide opportunities to undertake tasks they haven't done before, helping them develop new skills. Note that they should have relevant talents and skills, rather than starting from scratch.

Task-Related Maturity

· One min read

Andy Grove emphasizes: ==The most important responsibility of a manager is to inspire their subordinates to perform at their best==.

Unfortunately, there is no single management style that fits everyone in all situations. The fundamental variable in finding the best management style is the task-related maturity (TRM) of the subordinates.

| Subordinate's Work Maturity | Effective Leadership Style |
| --- | --- |
| Low | Organized; task-oriented; detail-focused; accurately points out the details of "when - what - how" |
| Medium | People-oriented; provides support; "two-way communication" model |
| High | Goal-oriented; "monitoring" model |

A person's task-related maturity depends on the specific work project, and its improvement takes time. When task-related maturity reaches its highest level, the individual's knowledge and motivation will also reach a certain height, allowing their manager to successfully delegate work to them.

The key takeaway is: ==There is no good or bad management style; there is only effective and ineffective==.

Task-Relevant Maturity

· One min read

Andy Grove emphasizes that ==a manager’s most important responsibility is to elicit top performance from his subordinates==.

Unfortunately, no single management style fits all people in all scenarios. A fundamental variable in finding the best management style is the task-relevant maturity (TRM) of the subordinates.

| TRM | Effective Management Style |
| --- | --- |
| Low | Structured; task-oriented; detail-oriented; instructs exactly "what/when/how" |
| Medium | Individual-oriented; supportive; "mutual-reasoning" mode |
| High | Goal-oriented; monitoring mode |

A person's TRM depends on the specific work item, and it takes time to improve. When TRM reaches the highest level, both the person's knowledge and motivation are at the point where the manager can delegate work to them.

The key here is to regard any management mode not as either good or bad but rather as effective or not effective.

Time Management for System Administrators: Fundamental Principles

· 2 min read

Learning time management from system administrators (SAs) is an inspiring experience, as we all face the same challenges—endless interruptions, concurrent projects, and sudden demands.

Moreover, system administrators must deal with these issues even more frequently, as Thomas Limoncelli puts it:

For system administrators, your boss evaluates you based on whether you complete projects, while your clients only care about whether you can meet their demands on time.

Here are the time management rules for SAs.

  • Interruptions are the biggest enemy of productivity.

    • Set up an "interruption shield" rotation with colleagues so that only one person at a time handles interruptions.
    • Set aside large blocks of dedicated time for projects.
    • Close the office door (of course, if you are a manager, don’t do this).
    • Have junior engineers sit outside your office to filter out 80% of the distractions.
  • Consolidate all time management information in one place.

  • Save mental energy for important tasks.

  • Don’t constantly think about how to manage time; instead, develop routines, habits, and mantras.

    • Routines are a series of predefined steps that occur within a specific timeframe.
    • Habits are actions that people can perform without thinking.
    • Mantras are simple rules of thumb.
  • Maintain focus during projects, but this requires good self-discipline.

    • Self-discipline enhances self-esteem. Self-esteem is like poker chips. When we have higher self-esteem, we tend to place higher bets to win bigger rewards.
  • Use the same tools to manage your social life.

From Good to Great

· One min read

Leading a company from good to great is like pushing a massive flywheel until it achieves a ==breakthrough==, through:

  1. Disciplined and well-trained people
    1. Level 5 Leadership: Great leaders > Effective leaders > Competent managers > Contributing team members > Capable individuals
    2. First Who, Then What
  2. Disciplined and well-trained thoughts
    1. Confront the brutal facts
    2. Be a hedgehog first, then a fox
  3. Disciplined and well-trained actions
    1. A culture of discipline
    2. Technology accelerates the engine of growth

Good to great

· One min read

Leading a company to leap from good to great = pushing a giant flywheel to ==breakthrough== with

  1. Disciplined People
    1. Level 5 leadership: executive > effective leader > competent manager > contributing team member > highly capable individual
    2. First who then what
  2. Disciplined Thought
    1. Confront the brutal facts
    2. Be a fox after being a hedgehog
  3. Disciplined Action
    1. Culture of discipline
    2. Technology accelerators

Managerial Leverage

· 5 min read

Why introduce leverage to management?

To maximize the organization’s output.

A manager’s output = The output of his organization + The output of the neighboring organizations under his influence

This means that if a manager is not just a hierarchical supervisor but also a know-how manager (knowledge supplier), he will have a larger impact on both his own organization and neighboring organizations.

The definition of “manager” should be broadened: individual contributors who gather and disseminate know-how and information should also be seen as middle managers, because they exert great power within the organization.

What is leveraged? Managerial activities

  1. ==Information gathering== - the basis of all other managerial work

    1. Verbal sources are the most valuable, because the more timely the information, the more valuable it usually is. But what they provide is also sketchy, incomplete, and sometimes inaccurate, like a newspaper headline that gives you only the general idea of a story.
    2. Reports are more a medium of self-discipline than a way to communicate information. Writing the report is important; reading it often is not.
    3. Visiting a particular place in the company and observing what’s going on there. For example, we ask our managers to participate in “Mr. Clean” inspections, in which they go to a part of the company that they normally wouldn’t visit. The managers examine the housekeeping, the arrangement of things, the labs, and the safety equipment, and in so doing spend an hour or so browsing around and getting acquainted with things firsthand.
  2. ==Information-giving==

  3. ==Decision-making==, of two kinds:

    1. Forward-looking
    2. Responding to a developing problem or a crisis
  4. ==Nudging==: advocating a preferred course of action without issuing a firm and detailed instruction. It should be carefully distinguished from decision-making that results in firm, clear directives.

  5. ==Being a role model==. Nothing leads as well as example. Values and behavioral norms are simply not transmitted easily by talk or memo, but are conveyed very effectively by doing and doing visibly.

By and large, none of the above can happen without a meeting. However, a meeting is not an activity; it is an occasion or medium in which activities happen.

What are managerial leverages, exactly?

Managerial Output = Output of organization = L1 × A1 + L2 × A2 +…

To maximize the output...

  1. Speeding up
    1. Managerial output / time = L × (activity performed) / time
  2. Increasing leverage
  3. Shifting activities to those with higher leverage
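The formula and the three levers above can be made concrete with a small calculation. This is only an illustrative sketch in Python: the function name and every number below are hypothetical, chosen to show how shifting time toward higher-leverage activities changes the sum of L × A.

```python
def managerial_output(activities):
    """Sum leverage * activity over (leverage, amount_of_activity) pairs."""
    return sum(leverage * amount for leverage, amount in activities)


# Hypothetical week: (leverage, hours). Negative leverage models harmful activity.
before = [
    (1.0, 10),  # routine individual work
    (5.0, 2),   # teaching a group a reusable skill
    (-2.0, 1),  # arriving unprepared at a key meeting
]

# Same total hours, with two hours shifted from routine work to teaching.
after = [
    (1.0, 8),
    (5.0, 4),
    (-2.0, 1),
]

output_before = managerial_output(before)  # 10 + 10 - 2 = 18.0
output_after = managerial_output(after)    # 8 + 20 - 2 = 26.0
```

With the hours held constant, the only thing that changed is where they were spent, which is the third lever in the list above.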

Leverage can be increased:

  1. When many people are affected
  2. When a person’s behavior is impacted in the long run
  3. When a large group’s work is affected by unique information that one person supplies

For example,

  1. Positive leverage

    1. Planning in advance, so that a large group of people knows what to do.
    2. Reacting in time to a subordinate’s intention to quit.
    3. Imparting knowledge, skills, or values to a group.
    4. Activities that take a short time but affect another person’s performance over a long time, such as a performance review.
    5. ==Creating a tickler file==
    6. Providing unique skills and knowledge.
  2. Negative leverage

    1. Arriving unprepared at a meeting where you are a key participant.
    2. ==Spreading depression==
    3. ==Waffling==
    4. ==Managerial meddling.== If a senior manager sees an indicator showing an undesirable trend and dictates to the person responsible a detailed set of actions to be taken, that is managerial meddling.

Shifting activities to those with higher leverage by DELEGATION

  1. The “delegator” and “delegatee” must share a common information base and a common set of operational ideas or notions on how to go about solving problems

  2. Being conscious

  3. ==Delegation without follow-through is abdication.== How to monitor? Apply quality-assurance principles:

    1. When? ==At the lowest-added-value stage of the process==
    2. How often? According to the worker’s experience level and the sampling results
    3. Coverage? Not 100%, but random sampling
  4. How to monitor the delegation of decision-making?

    1. Inspect the decision-making process by a review meeting and ask specific questions.
  5. How Many Subordinates Should You Have?

    1. As a rule of thumb, a manager whose work is largely supervisory should have six to eight subordinates, because
      1. A manager should allocate about half a day per week to each of his subordinates.
      2. Even if he works without a single subordinate, servicing a number of varied “customers” as an internal consultant can in itself be a full-time job.
    2. Hence a second-line manager who does not have enough subordinate managers should also have individual contributors (ICs) reporting to him.

Speeding up

  1. Time management

    1. For example
      1. Handle a piece of paper only once,

      2. Hold only stand-up meetings (which will presumably be short), and

      3. Turn your desk so that your back is to the door.

    2. Principles
      1. Identify your limiting step and schedule your other work around it.
      2. Batch similar tasks.
      3. Use the calendar as a medium for forecasting and planning.
        1. Actively fill the holes between the time-critical events with non-time-critical though necessary activities.
        2. ==Say “no” at the outset to work beyond your capacity to handle. Remember too that your time is your one finite resource, and when you say “yes” to one thing you are inevitably saying “no” to another.==
      4. Slack: a bit of looseness in your scheduling.
      5. Keep an inventory of projects.
      6. Standardize procedures.
  2. Reducing interruptions (the plague of managerial work). How to solve it? Regularity and smoothing out the workload.

    1. Controlled way
      • Batching: use the same block of time for like activities, for example by scheduling all meetings in one block
    2. Uncontrolled way
      • Hiding physically is not practical, because legitimate problems will pile up
      • Prepare FAQs and documentation
      • Batching
      • Providing alternatives:
        • “I am doing individual work. Please don’t interrupt me unless it really can’t wait until 2:00.”
        • Office hours