Skip to main content

ACID vs BASE

· One min read

ACID (Consistency over Availability)

  • Atomicity ensures transaction succeeds completely or fails completely.
  • Consistency: In ACID, the C means that a transaction pre-serves all the database rules, such as unique keys, triggers, cascades. In contrast, the C in CAP refers only to single copy consistency, a strict subset of ACID consistency.
  • Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially.
  • Durability ensures that once a transaction has been committed, it will remain committed even in the case of a system failure (e.g. power outage or crash).

BASE (Availability over Consistency)

  • Basically available indicates that the system is guaranteed to be available
  • Soft state indicates that the state of the system may change over time, even without input. This is mainly due to the eventually consistent model.
  • Eventual consistency indicates that the system will become consistent over time, given that the system doesn't receive input during that time.

Although most NoSQL takes BASE priciples, they are learning from or moving toward ACID. e.g. Google Spanner provides strong consistency. MongoDB 4.0 adds support for multi-document ACID transactions.

Will Larson's Lessons from Digg v4 catastrophic launch

· One min read

Digg v3.5 to v4 catastrophic launch was actually pretty ambitious. Digg had been devastated by Google's Panda algorithm update. And Launching v4 was their chance to return to their rightful place among the giants of the Internet.

So there is no rollback plan and unnexpected scale problems occurred.

  1. there was Cassandra bottleneck -> implemented write-through-cache memcache
  2. however, MyNews page was still broken every four hours
  3. rewrote MyNews in Redis and keep deleting the excess data in secret to keep the site running
  4. it took a month to track down the bug in the Python tornado backend service, and there is some kind of API that uses mutable default value as an argument like
def get_user_by_names_or_ids(names=[], ids=[])

This causes memory leak - If you mutate those params, the mutations span across invocations. Accumulated data even blows up the memcache clusters.

Replica, Consistency, and CAP theorem

· 2 min read

Why replica and consistency?

large dataset ⟶ scale out ⟶ data shard / partition ⟶ 1) routing for data access 2) replica for availability ⟶ consistency challenge

Consistency trade-offs for CAP theorem

CAP Theorem

  • Consistency: all nodes see the same data at the same time
  • Availability: a guarantee that every request receives a response about whether it succeeded or failed
  • Partition tolerance: system continues to operate despite arbitrary message loss or failure of part of the system

Any networked shared-data system can have only two of three desirable properties.

  • rDBMS prefer CP ⟶ ACID
  • NoSQL prefer AP ⟶ BASE

"2 of 3" is mis-leading

12 years later, the author Eric Brewer said "2 of 3" is mis-leading, because

  1. partitions are rare, there is little reason to forfeit C or A when the system is not partitioned.
  2. choices can be actually applied to multiple places within the same system at very fine granularity.
  3. choices are not binary but with certain degrees.

Consequently, when there is no partition (nodes are connected correctly), which is often the case, we should have both AC. When there are partitions, deal them with 3 steps:

  1. detect the start of a partition,
  2. enter an explicit partition mode that may limit some operations, and
  3. initiate partition recovery (compensate for mistakes) when communication is restored.

Load Balancer Types

· One min read

Generally speaking, load balancers fall into three categories:

  • DNS Round Robin (rarely used): clients get a randomly-ordered list of IP addresses.
    • pros: easy to implement and free
    • cons: hard to control and not responsive, since DNS cache needs time to expire
  • Network (L3/L4) Load Balancer: traffic is routed by IP address and ports.L3 is network layer (IP). L4 is session layer (TCP).
    • pros: better granularity, simple, responsive
  • Application (L7) Load Balancer: traffic is routed by what is inside the HTTP protocol. L7 is application layer (HTTP).

B tree vs. B+ tree

· One min read

B tree vs. B+ tree

Pros of B tree

  • Data associated with each key ⟶ frequently accessed nodes can lie closer to the root, and therefore can be accessed more quickly.

Pros of B+ tree

  • No data associated in the interior nodes ⟶ more keys in memory ⟶ fewer cache misses
  • Leaf nodes of B+ trees are linked ⟶ easier to traverse ⟶ fewer cache misses

Experience Deep Dive

· 4 min read

Intended Audience

Moderate experience or less, or anyone who was not in a leadership or design position (either formal or informal) in their previous position

Question Description

Describe one of your previous projects that was particularly interesting or memorable to you. Followup questions:

What about it made it interesting? What was the most challenging part of the project, and how did you address those challenges? What did you learn from the project, and what do you wish you had known before you started? What other designs/implementation methods did you consider? Why did you choose the one that you did? If you were to do the same project over again, what would you do differently?

Interviewer tips

Since the goal here is to assess the technical communication skill and interest level of someone who has not necessarily ever been in a role that they could conduct a crash course in, you should be prepared to keep asking them questions (either for more details, or about other aspects of the project). If they are a recent grad and did a thesis, that's often a good choice to talk about. While this question is in many ways similar to the Resume Questions question from phone screen one, this question is intended to be approximately four times as long, and should get into proportionally more detail about what it is that they have done. As such, the scoring criteria are similar, but should be evaluated with both higher expectations and more data.

Scoring

A great candidate will

  • Be able to talk for the full time about the project, with interaction from the interviewer being conversational rather than directing

  • Be knowledgeable about the project as a whole, rather than only their area of focus, and be able to articulate the intent and design of the project

  • Be passionate about whatever the project was, and able to describe the elements of the project that inspired that passion clearly

  • Be able to clearly explain what alternatives were considered, and why they chose the implementation strategy that they did.

  • Have reflected on and learned from their experiences

A good candidate will

  • May have some trouble talking for the full time, but will be able to with some help and questions from the interviewer

  • May lack some knowledge about the larger scope of the project, but still have strong knowledge of their particular area and pieces that directly interacted with them

  • May seem passionate, but be unable to clearly explain what inspired that passion

  • May be able to discuss alternatives to what they did, but not have considered them in depth

  • Have reflected on and learned from their experiences

A bad candidate will

  • Have difficulty talking for the full time. The interviewer may feel as if they are interrogating rather than conversing with the candidate

  • May lack detailed knowledge of the project, even within the area that they were working. They may not understand why their piece was designed the way it was, or may not understand how it interacted with other systems

  • Does not seem very interested in the project - remember that you are asking them about the most interesting project that they have done, they should be very - interested in whatever it was

  • May not be familiar with potential alternatives to their implementation method

  • Does not seem to have learned from or reflected on their experiences with the project. A key sign of this is that the answer to 'what did you learn' and 'what would you do differently' are short and/or nearly identical.

Data Partition and Routing

· 2 min read

Why data partition and routing?

large dataset ⟶ scale out ⟶ data shard / partition ⟶ 1) routing for data access 2) replica for availability

  • Pros
    • availability
    • read(parallelization, single read efficiency)
  • Cons
    • consistency

How to do data partition and routing?

The routing abstract model is essentially just two maps: 1) key-partition map 2) partition-machine map

Hash partition

  1. hash and mod

    • (+) simple
    • (-) flexibility (tight coupling two maps: adding and removing nodes (partition-machine map) disrupt existing key-partition map)
  2. Virtual buckets: key--(hash)-->vBucket, vBucket--(table lookup)-->servers

    • Usercase: Membase a.k.a Couchbase, Riak
    • (+) flexibility, decoupling two maps
    • (-) centralized lookup table
  3. Consistent hashing and DHT

    • [Chord] implementation
    • virtual nodes: for load balance in heterogeneous data center
    • Usercase: Dynamo, Cassandra
    • (+) flexibility, hashing space decouples two maps. two maps use the same hash, but adding and removing nodes ==only impact succeeding nodes==.
    • (-) network complexity, hard to maintain

Range partition

sort by primary key, shard by range of primary key

range-server lookup table (e.g. HBase .META. table) + local tree-based index (e.g. LSM, B+)

(+) search for a range (-) log(n)

Usercase: Yahoo PNUTS, Azure, Bigtable

Where Does the Energy of a Good Strategy Come From?

· One min read

The key to an effective strategy lies in channeling limited energy into the points that can generate the most impact. Good resources should be used where they matter most.

So we must ask, where does this energy come from? Here are some common sources:

  • leverage
  • proximate objectives
  • chain-link systems. Systems without weaknesses are extremely difficult to replicate, such as IKEA.
  • design
  • focus strategy, providing specific services for particular groups
  • growth. The increase in team size should not be deliberate but rather a natural result of the company's product growth.
  • advantage. Comparative advantage is domain-specific; a champion runner may not excel in high jumping.
  • dynamics. Exploit a wave of exogenous change.
  • inertia and entropy.

The Core of a Good Strategy: Coherent Actions

· One min read
  • "Coherent and complementary actions" mean that these actions directly support each other to create synergy. For example, as a manager, the principle behind introducing processes imposed on others is, ==“I will never ask you to do anything that does not help you in your core job.”==
  • Strategic collaboration is not arranged on the spot; it is deliberately designed and centrally imposed on the system.
  • Centralization is a bad thing, but a completely fragmented approach is also ineffective, as the interests of various sub-organizations differ. For instance, in manufacturing, sales may prefer to please customers with expedited large orders, while the production department prefers to output steadily and undisturbed over the long term. It is impossible to handle both urgent large orders and maintain steady production simultaneously.
  • Organizational collaboration is time-consuming and labor-intensive; if there are not enough benefits, it should not be pursued. Smart organizations do not aim for 100% communication among everyone but rather strive for just the right amount of coordination.

The Core of a Good Strategy

· One min read

Three Fundamental Elements

  1. Diagnosis: Simplifying the problem and identifying challenges.
  2. Guiding Policies: How to respond to challenges?
  3. Coherent Actions: A series of actions that mutually reinforce each other under the guidance of principles.

Examples

In business, the challenge is usually dealing with change and competition.

  1. Diagnosing the specific structure of the challenge rather than simply naming performance goals.
  2. Choosing an overall guiding policy for dealing with the situation that builds on or creates some type of leverage or advantage.
  3. Designing a configuration of actions and resource allocations that implement the chosen guiding policy.

In many large organizations, the challenge is often diagnosed as internal.

  1. The organization’s competitive problems may be much lighter than the obstacles imposed by its own outdated routines, bureaucracy, pools of entrenched interests, lack of cooperation across units, and plain-old bad management.
  2. Reorganization and renewal.
  3. Changes in people, power, and procedures.

Amazon

The Amazon business model as drawn by Jeff Bezos on a napkin