Skip to main content

How Does Facebook Store a Large-Scale Social Graph? TAO

· 2 min read

What Are the Challenges?

Before TAO, using the cache-aside pattern

Before TAO

The social graph is stored in MySQL and cached in Memcached.

Three existing problems:

  1. The efficiency of updating the edge list of the social graph in Memcached is too low. Instead of adding an edge to the end of the list, the entire list needs to be updated.
  2. The logic for managing the cache on the client side is very complex.
  3. It is difficult to maintain ==consistency in database reads after writes==.

To solve these problems, we have three goals:

  • Efficient graph storage even with large-scale data.
  • Optimize read operations (read-write ratio is 500:1).
    • Reduce the duration of read operations.
    • Improve the availability of read operations (eventual consistency).
  • Complete write operations in a timely manner (write first, read later).

Data Model

  • Objects with unique IDs (e.g., users, addresses, comments).
  • Associations between two IDs (e.g., tagged, liked, posted).
  • Both of the above data models have key-value data and time-related data.

Solution: TAO

  1. Accelerate read operations and efficiently handle large-scale reads.

    • Cache specifically for graphs.
    • Add a layer of cache between the stateless server layer and the database layer (see Business Splitting).
    • Split data centers (see Data Partitioning).
  2. Complete write operations in a timely manner.

    • Write-through cache.
    • Use follower/leader caching to solve the ==thundering herd problem==.
    • Asynchronous replication.
  3. Improve the availability of read operations.

    • If a read fails, read from other available sources.

Architecture of TAO

  • MySQL Database → Durability.
  • Leader Cache → Coordinates write operations for each object.
  • Follower Cache → Used for reading rather than writing. Shift all write operations to the leader cache.

Architecture of Facebook TAO

Fault tolerance for read operations.

Fault Tolerance for Read Operations in Facebook TAO

Why Startups Have to Innovate?

· One min read

Why do startups have to innovate? Why do growth techniques work only once for a given product or service in a given market? Why is there no skill called “business”?

The answer is ==Anna Karenina principle==. Tolstoy opens Anna Karenina by observing: “All happy families are alike; each unhappy family is unhappy in its own way.” Business is the opposite. ==All happy companies are different: each one earns a monopoly by solving a unique problem. All failed companies are the same: they failed to escape competition.==

If a startup does not innovate but copy a product or service from the market leader, and the startup is targeting the same market, then people will not buy it because people are probably customers of the market leader already. Why do people buy the same thing for twice if their needs are fulfilled already?

Why Startups Need to Innovate?

· One min read

Why do startups need to innovate? Why do growth techniques only work for specific products or services in a particular market? Why is there no such thing as a "business technology"?

The answer lies in the ==Anna Karenina principle==. Tolstoy observed that "happy families are all alike; every unhappy family is unhappy in its own way," leading to this theory. However, business is the opposite. ==All successful companies are different: each successful company gains a monopoly in a field by solving a specific problem. All failed companies are the same: they did not escape market competition.==

If a startup simply copies the products or services of current industry leaders, lacks innovation, and targets the same market, people will not buy into it, as they may already be customers of the industry leader. If people's needs are already met by existing products, why would they buy the same thing twice?

What is a Market?

· One min read

For high tech, we can define a market as

  1. a set of actual or potential customers
  2. for a given set of products or services
  3. who have a common set of needs or wants, and
  4. ==who reference each other when making a buying decision.==

Point 4 is the insight here - referencing each other is key to the marketing success. If two people buy the same product for the same reason but have no way they could reference each other, they are not part of the same market. They are in different ==market segments==.

What is the Market?

· One min read

The definition of a high-tech market is

  1. Existing and potential users
  2. Who have a demand for a certain type of product or service
  3. And
  4. These individuals reference each other when deciding on the products to purchase

The understanding of the fourth point is - mutual referencing is the key to market success. If two people buy the same product for the same reason, but they have no way to reference each other, they are not in the same market. They are in different market segments.

What is the chasm in the technology adoption lifecycle?

· 3 min read

Is the innovation disruptive?

Dsruptive innovation vs. Continuous innovation

  • Whether it ==changes our current mode of behavior== or to ==modify other products and services we rely on==.
  • Between continuous and discontinuous lies a spectrum of demands for behavioral change.

High-tech industries introduce disruptive innovation routinely, during which people are converted into customers by following a pattern of normal distribution. The product's user growth follows an S-curve.

When will people buy a high-tech product?

Technology adoption lifecyle

Disruptive innovation's customers are converted at different stages in ==the technology adoption life cycle==. They are...

  1. Innovators
  2. Early adopters
  3. Early majority (pragmatists)
  4. Late majority (conservatives)
  5. Laggards
SegmentWhat They Want
Innovatorsnovel, cool and experimental things
Early Adoptersgaining advantages or getting products before others
Early Majorityproven ROI, instant access, low transition costs, support available
Late Majorityadopting as minimal as possible or only when everyone else has adopted
Laggardsavoidance to adopt new things

What is the high tech marketing model?

This cycle provides guidance of the ==high tech marketing model: the way to develop a high-tech market is to work the curve left to right, focusing on each group one by one,== because groups on the left promote products for the right ones in a momentum.

Momentum is vital because it can

  1. save costs
  2. make it fast so you won’t miss the window of opportunity before next disruption or competitor

Where is the CHASM?

Inspecting into the technology adoption lifecycle, we can see Crossing the Chasm

  • two cracks

    1. Beneficial usage crack between innovators and early adopters. E.g., Esperanto, VRML, second life, 3D printing. To cross this, we need a flagship application.
    2. Competent majority crack, between early and late majorities. E.g., home automation.scanning and project management software. To cross this, we need to make it easier to adopt.
  • and one CHASM

    1. Early adopter-to-majority chasm. Because their needs are different

      1. Early adopter is buying a change agent - they expect to get a jump on the competition. Having bugs is fine.
      2. The pragmatic early majority is buying a productivity improvement. They want technology to ==enhance, not overthrow, the established ways of doing business==.
    2. The compatibility above leads to two key points

      1. early adopters do not make good references for the early majority
      2. And because of the early majority’s concern not to disrupt their organizations, good references are critical to their buying decisions.
    3. Who did fall into the early adopter-to-majority chasm in 2014? E.g., holograms, pen-based tablets, fuel cells, QR codes (in the US), Massive Open Online Courses, Segways, Motorola iridium.

What is the Chasm in the Technology Adoption Lifecycle?

· 3 min read

Is Innovation Disruptive?

Disruptive Innovation vs. Continuous Innovation

  • Does it ==change our current behavior patterns== or ==alter the other products and services we rely on==?
  • There exists a range of demands for behavioral change between continuous and discontinuous innovation.

High-tech companies frequently introduce disruptive innovations. During these disruptive innovations, people become users of these products. This transition typically occurs in a normal distribution, resulting in an S-curve for user growth.

When Do People Buy High-Tech Products?

Technology Adoption Lifecycle

In the ==technology adoption lifecycle==, people become users of disruptive innovation products at different stages. They are:

  1. Innovators
  2. Early Adopters
  3. Early Majority (Pragmatists)
  4. Late Majority (Conservatives)
  5. Laggards

What is the High-Tech Market Model?

This cycle provides guidance for the ==high-tech market model==, ==showing how to develop a high-tech market by smoothly transitioning this cycle from left to right, breaking through one user group at a time.== Leveraging the momentum of the left-side user group makes it easier to market products to the right-side user group.

User momentum is crucial because it can:

  1. Save costs
  2. Accelerate the cycle if you don't want to miss opportunities in the next disruptive innovation or against competitors.

Where is the Chasm?

Observing the technology adoption lifecycle, we can see Crossing the Chasm

  • Two chasms

    1. There is a chasm between Innovators and Early Adopters, which is the appropriate application scenario. For example, Esperanto, VRML, Second Life, 3D printing. To cross this chasm, we need a flagship product.
    2. There is another chasm between the Early Majority and the Late Majority, which is competitive products. For example, home automation, scanning, and project management software. To cross this chasm, we need to make the product more acceptable.
  • And a gap

    1. The gap from Early Adopters to the Early Majority. Their needs are different:

      1. Early Adopters want a change -- they expect to achieve a leap in a competitive market. Minor issues with the product are acceptable.
      2. The majority of Early Majority users, who are pragmatists, want increased productivity. They want technology to ==enhance, rather than disrupt their fundamental ways of working==.
    2. Incompatibility between the first two stages and the last two stages

      1. The behavior of Early Adopters is not a good reference for the majority of Early Majority users.
      2. Furthermore, the majority of Early Majority users do not want to disrupt their existing organizational structures. A good reference is crucial for their decision on whether to use this product.
    3. Who really encountered this gap in 2014? For example, holograms, pen-based tablets, fuel cells, QR codes (in the U.S.), most online courses, Segways, Motorola Iridium.

How Netflix Serves Viewing Data?

· 2 min read

Motivation

How to keep users' viewing data in scale (billions of events per day)?

Here, viewing data means...

  1. viewing history. What titles have I watched?
  2. viewing progress. Where did I leave off in a given title?
  3. on-going viewers. What else is being watched on my account right now?

Architecture

Netflix Viewing Data Architecture

The viewing service has two tiers:

  1. stateful tier = active views stored in memory

    • Why? to support the highest volume read/write
    • How to scale out?
      • partitioned into N stateful nodes by account_id mod N
        • One problem is that load is not evenly distributed and hence the system is subject to hot spots
      • CP over AP in CAP theorem, and there is no replica of active states.
        • One failed node will impact 1/nth of the members. So they use stale data to degrade gracefully.
  2. stateless tier = data persistence = Cassandra + Memcached

    • Use Cassandra for very high volume, low latency writes.
      • Data is evenly distributed. No hot spots because of consistent hashing with virtual nodes to partition the data.
    • Use Memcached for very high volume, low latency reads.
      • How to update the cache?
        • after writing to Cassandra, write the updated data back to Memcached
        • eventually consistent to handling multiple writers with a short cache entry TTL and a periodic cache refresh.
      • in the future, prefer Redis' appending operation to a time-ordered list over "read-modify-writes" in Memcached.

How to design robust and predictable APIs with idempotency?

· 2 min read

How could APIs be un-robust and un-predictable?

  1. Networks are unreliable.
  2. Servers are more reliable but may still fail.

How to solve the problem? 3 Principles:

  1. Client retries to ensure consistency.

  2. Retry with idempotency and idempotency keys to allow clients to pass a unique value.

    1. In RESTful APIs, the PUT and DELETE verbs are idempotent.
    2. However, POST may cause ==“double-charge” problem in payment==. So we use a ==idempotency key== to identify the request.
      1. If the failure happens before the server, then there is a retry, and the server will see it for the first time, and process it normally.
      2. If the failure happens in the server, then ACID database will guarantee the transaction by the idempotency key.
      3. If the failure happens after the server’s reply, then client retries, and the server simply replies with a cached result of the successful operation.
  3. Retry with ==exponential backoff and random jitter==. Be considerate of the ==thundering herd problem== that servers that may be stuck in a degraded state and a burst of retries may further hurt the system.

For example, Stripe’s client retry calculates the delay like this...

def self.sleep_time(retry_count)
# Apply exponential backoff with initial_network_retry_delay on the
# number of attempts so far as inputs. Do not allow the number to exceed
# max_network_retry_delay.
sleep_seconds = [Stripe.initial_network_retry_delay * (2 ** (retry_count - 1)), Stripe.max_network_retry_delay].min

# Apply some jitter by randomizing the value in the range of (sleep_seconds
# / 2) to (sleep_seconds).
sleep_seconds = sleep_seconds * (0.5 * (1 + rand()))

# But never sleep less than the base sleep seconds.
sleep_seconds = [Stripe.initial_network_retry_delay, sleep_seconds].max

sleep_seconds
end

How to Design Robust and Predictable APIs with Idempotency?

· 2 min read

Why are APIs unreliable?

  1. Networks can fail.
  2. Servers can fail.

How can we solve this problem? Three principles:

  1. The client uses "retry" to ensure state consistency.

  2. The retry requests must include an ==idempotent unique ID==.

    1. In RESTful API design, the semantics of PUT and DELETE are inherently idempotent.
    2. However, POST in online payment scenarios may lead to the ==“duplicate payment” issue==, so we use an "idempotent unique ID" to identify whether a request has been sent multiple times.
      1. If the error occurs before reaching the server, after retrying, the server sees it for the first time and processes it normally.
      2. If the error occurs on the server, based on this "unique ID," an ACID-compliant database ensures that this transaction occurs only once.
      3. If the error occurs after the server returns a result, after retrying, the server only needs to return the cached successful result.
  3. Retries must be responsible, such as following the ==exponential backoff algorithm==, because we do not want a large number of clients to retry simultaneously.

For example, Stripe's client calculates the wait time for retries like this:

def self.sleep_time(retry_count)
# Apply exponential backoff with initial_network_retry_delay on the
# number of attempts so far as inputs. Do not allow the number to exceed
# max_network_retry_delay.
sleep_seconds = [Stripe.initial_network_retry_delay * (2 ** (retry_count - 1)), Stripe.max_network_retry_delay].min

# Apply some jitter by randomizing the value in the range of (sleep_seconds
# / 2) to (sleep_seconds).
sleep_seconds = sleep_seconds * (0.5 * (1 + rand()))

# But never sleep less than the base sleep seconds.
sleep_seconds = [Stripe.initial_network_retry_delay, sleep_seconds].max

sleep_seconds
end