
How Does Facebook Scale Its Social Graph Store? TAO

What are the challenges?

Before TAO, Facebook used the cache-aside pattern

Before TAO

Social graph data is stored in MySQL and cached in Memcached

3 problems:

  1. List updates in Memcached are inefficient: a cached list cannot be appended to, so the whole list has to be rewritten.
  2. Clients have to manage the cache themselves.
  3. It is hard to offer read-after-write consistency.
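
To make the first two problems concrete, here is a minimal sketch of the cache-aside pattern. Plain dictionaries stand in for MySQL and Memcached, and the names `database`, `cache`, `read_friends`, and `add_friend` are hypothetical, chosen only for illustration.

```python
# Hypothetical in-memory stand-ins for MySQL and Memcached (assumptions).
database = {}  # key -> list of friend ids (source of truth)
cache = {}     # key -> cached copy of the whole list


def read_friends(user_id):
    """Cache-aside read: the client checks the cache, then falls back to the database."""
    key = f"friends:{user_id}"
    if key in cache:
        return cache[key]
    friends = list(database.get(key, []))
    cache[key] = friends  # the client is responsible for filling the cache
    return friends


def add_friend(user_id, friend_id):
    """Cache-aside write: update the database, then rewrite the whole cached list."""
    key = f"friends:{user_id}"
    database.setdefault(key, []).append(friend_id)
    # The cached list is not appended to; the full list is copied and set again.
    cache[key] = list(database[key])


add_friend(1, 2)
add_friend(1, 3)
print(read_friends(1))
```

Every caller has to repeat this read and write logic correctly, which is what "clients have to manage the cache" means in practice.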

To solve those problems, we have 3 goals:

  • an online graph data service that is efficient at scale
  • optimized for reads (the read-to-write ratio is 500:1)
    • low read latency
    • high read availability (accepting eventual consistency)
  • timeliness of writes (read-after-write)

Data Model

  • Objects (e.g. user, location, comment) with unique IDs
  • Associations (e.g. tagged, like, author) between two IDs
  • Both have key-value data as well as a time field
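
The data model above can be sketched with two small record types. This is an illustration only, not TAO's actual schema: the class names `TaoObject` and `Association` and the field names `otype`, `atype`, `id1`, and `id2` are assumptions.

```python
from dataclasses import dataclass, field
from time import time as now


@dataclass
class TaoObject:
    """A node in the graph, e.g. a user, location, or comment."""
    id: int                                     # unique ID
    otype: str                                  # object type (assumed field name)
    data: dict = field(default_factory=dict)    # key-value data
    time: float = field(default_factory=now)    # time field


@dataclass
class Association:
    """A typed edge between two object IDs, e.g. tagged, like, author."""
    id1: int                                    # source object ID
    atype: str                                  # association type (assumed field name)
    id2: int                                    # destination object ID
    data: dict = field(default_factory=dict)    # key-value data
    time: float = field(default_factory=now)    # time field


alice = TaoObject(id=1, otype="user", data={"name": "Alice"})
comment = TaoObject(id=2, otype="comment", data={"text": "Nice photo!"})
authored = Association(id1=comment.id, atype="author", id2=alice.id)
print(authored.atype, authored.id1, authored.id2)
```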

Solutions: TAO

  1. Efficiency at scale and reduce read latency

  2. Write timeliness

    • write-through cache
    • follower/leader caches to solve the thundering herd problem
    • async replication
  3. Read availability

    • read failover to alternate data sources

TAO's Architecture

  • MySQL databases → durability
  • Leader cache → coordinates writes to each object
  • Follower caches → serve reads but not writes; they forward all writes to the leader.
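
The three layers can be sketched as a toy write-through cache. The classes `Database`, `Leader`, and `Follower` are hypothetical, and the invalidation of other followers is done synchronously here for simplicity, which is an assumption of this sketch and not a claim about TAO's implementation.

```python
class Database:
    """Stand-in for MySQL: the durable source of truth."""
    def __init__(self):
        self.rows = {}

    def read(self, key):
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class Leader:
    """Coordinates all writes for its keys and writes through to the database."""
    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.followers = []

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.db.read(key)
        return self.cache[key]

    def write(self, key, value):
        self.db.write(key, value)   # write-through: database first
        self.cache[key] = value     # then the leader's own cache
        for follower in self.followers:
            follower.invalidate(key)  # synchronous here (assumption)


class Follower:
    """Serves reads from its cache and forwards every write to the leader."""
    def __init__(self, leader):
        self.leader = leader
        self.cache = {}
        leader.followers.append(self)

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.leader.read(key)  # a miss goes to the leader
        return self.cache[key]

    def write(self, key, value):
        self.leader.write(key, value)
        self.cache[key] = value  # the writer can read its own write right away

    def invalidate(self, key):
        self.cache.pop(key, None)


db = Database()
leader = Leader(db)
f1, f2 = Follower(leader), Follower(leader)
f1.write("user:1:name", "Alice")
print(f1.read("user:1:name"), f2.read("user:1:name"), db.read("user:1:name"))
```

Because follower misses go to the leader and not straight to the database, many followers missing on the same key produce at most one database read per leader, which is the intuition behind using this layout against a thundering herd.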

Facebook TAO Architecture

Read failover

Facebook TAO Read Failover
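
Read failover can be sketched as trying a list of data sources in order until one answers. The notes only say that reads fail over to alternate data sources, so the specific sources and their order below are assumptions, as are the names `Unavailable` and `read_with_failover`.

```python
class Unavailable(Exception):
    """Raised by a data source that cannot serve the request."""


def read_with_failover(key, sources):
    """Try each source in order and return the first successful read."""
    last_error = None
    for source in sources:
        try:
            return source(key)
        except Unavailable as err:
            last_error = err  # remember the failure and try the next source
    raise Unavailable(f"no data source could serve {key!r}") from last_error


# Hypothetical sources: a failed local follower and a healthy alternate.
def local_follower(key):
    raise Unavailable("local follower is down")


def alternate_source(key):
    return f"value-of-{key}"


print(read_with_failover("user:1", [local_follower, alternate_source]))
```

A read served by an alternate source may be slightly stale, which fits the goal of high read availability with eventual consistency.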

References: