
Designing Memcached or an in-memory KV store

· 2 min read

Requirements

  1. High-performance, distributed key-value store
  • Why distributed?
    • Answer: to hold more data than fits in the memory of a single machine
  2. For in-memory storage of small data objects
  3. Simple server (pushing complexity to the client), and hence reliable and easy to deploy

Architecture

Big Picture: Client-server

  • client
  • given a list of Memcached servers
  • chooses a server based on the key
  • server
  • store KVs into the internal hash table
  • LRU eviction
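
As a rough illustration of "chooses a server based on the key", here is a minimal sketch of client-side routing. It assumes plain modulo hashing over a fixed server list; the function name `pick_server` and the addresses are hypothetical, and a real client may well use consistent hashing instead so that adding a server does not remap most keys.

```python
import hashlib


def pick_server(key, servers):
    """Map a key to one server by hashing the key (simple modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap around the list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical server list; the addresses are placeholders.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
chosen = pick_server("user:42", SERVERS)
```

The same key always maps to the same server as long as the list does not change, which is what lets the servers stay simple and unaware of each other.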

The Key-value server consists of a fixed-size hash table + single-threaded handler + coarse locking

hash table

How to handle collisions? There are three common approaches:

  1. Separate chaining: the collided bucket chains a list of entries with the same index, and you can always append the newly collided key-value pair to the list.
  2. Open addressing: if there is a collision, go to the next index until finding an available bucket.
  3. Dynamic resizing: resize the hash table and allocate more space; hence, collisions will happen less frequently.
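
To make the first option concrete, here is a small sketch of separate chaining. It is only an illustration under simple assumptions (a fixed bucket count, Python's built-in `hash`, no eviction and no locking); the class name `ChainedHashTable` is made up for this example and is not Memcached's actual implementation.

```python
class ChainedHashTable:
    """Fixed-size hash table that resolves collisions by chaining entries per bucket."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key in place
                return
        bucket.append((key, value))  # new key: append to the chain

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


# With only two buckets, ten keys are forced to collide and share chains.
table = ChainedHashTable(num_buckets=2)
for i in range(10):
    table.put(f"k{i}", i)
```

Lookups stay correct however many keys collide; they just get slower as chains grow, which is where resizing comes in.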

How does the client determine which server to query?

See Data Partition and Routing

How to use cache?

See Key value cache

How to further optimize?

See How Facebook Scale its Social Graph Store? TAO

Lyft's Marketing Automation Platform Symphony

· 3 min read

Customer Acquisition Efficiency Issue: How can advertising campaigns achieve higher returns with less money and fewer people?

Specifically, Lyft's advertising campaigns need to address the following characteristics:

  1. Manage location-based campaigns
  2. Data-driven growth: growth must be scalable, measurable, and predictable
  3. Support Lyft's unique growth model, as shown below:

lyft growth model

The main challenge is the difficulty of scaling management across various aspects of regional marketing, including ad bidding, budgeting, creative assets, incentives, audience selection, testing, and more. The following image depicts a day in the life of a marketer:

A Day in the Life of a Marketer

We can see that "execution" takes up most of the time, while less time is spent on the more important tasks of "analysis and decision-making." Scaling means reducing complex operations and allowing marketers to focus on analysis and decision-making.

Solution: Automation

To reduce costs and improve the efficiency of experimentation, it is necessary to:

  1. Predict whether new users are interested in the product
  2. Optimize across multiple channels and effectively evaluate and allocate budgets
  3. Conveniently manage thousands of campaigns

Data is enhanced through Lyft's Amundsen system using reinforcement learning.

The automation components include:

  1. Updating bid keywords
  2. Disabling underperforming creative assets
  3. Adjusting referral values based on market changes
  4. Identifying high-value user segments
  5. Sharing strategies across multiple campaigns

Architecture

Lyft Symphony Architecture

Technology stack: Apache Hive, Presto, ML platform, Airflow, 3rd-party APIs, UI.

Specific Component Modules

LTV Prediction Module

The lifetime value (LTV) of users is an important metric for evaluating channels, and the budget is determined by both LTV and the price we are willing to pay for customer acquisition in that region.

Our understanding of new users is limited, but as interactions increase, the historical data provided will more accurately predict outcomes.

Initial feature values:

Feature Values

As historical interaction records accumulate, the predictions become more accurate:

Predicting LTV Based on Historical Records

Budget Allocation Module

Once LTV is established, the next step is to set the budget based on pricing. A curve of the form LTV = a * (spend)^b is fitted, along with similar parameter curves in the surrounding range. Achieving a global optimum requires some randomness.
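
One way to fit a curve of the form LTV = a * (spend)^b is ordinary least squares in log space, since log(LTV) = log(a) + b * log(spend) is a straight line. The sketch below assumes that approach and uses synthetic data; the source does not say how Lyft actually fits the curve, and `fit_power_law` is a hypothetical helper.

```python
import math


def fit_power_law(spend, ltv):
    """Fit ltv = a * spend**b by linear regression on the logarithms."""
    xs = [math.log(s) for s in spend]
    ys = [math.log(v) for v in ltv]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log space is the exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log space is log(a)
    return a, b


# Synthetic points generated from a = 3, b = 0.5 (illustrative values only).
spend = [100.0, 200.0, 400.0, 800.0]
ltv = [3.0 * s ** 0.5 for s in spend]
a, b = fit_power_law(spend, ltv)
```

An exponent b below 1 means diminishing returns: each extra dollar of spend buys less LTV than the previous one, which is what makes budget allocation across channels a real optimization problem.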

Budget Calculation

Delivery Module

This module is divided into two parts: the parameter tuner and the executor. The tuner sets specific parameters based on pricing for each channel, while the executor applies these parameters to the respective channels.

There are many popular delivery strategies that are common across various channels:

Delivery Strategies

Conclusion

It is essential to recognize the importance of human experience within the system; otherwise, it results in garbage in, garbage out. When people are liberated from tedious delivery tasks and can focus on understanding users, channels, and the messages they need to convey to their audience, they can achieve better campaign results—spending less time to achieve higher ROI.

Why Side Projects Should Be Stupid

· One min read

The design lead at Spotify started various side projects after dropping out of school at 15. With decades of experience, he believes that successful side projects must be "stupid," allowing for bolder and more enjoyable exploration without burdens. All major endeavors start as small ones; 50% of Google's new projects originate from their famous 20% side projects. Complex issues like scaling, funding, and competitive analysis can hinder your progress. Taking action is more important than learning because others won't be smarter than you, and it's hard to empathize with others' experiences. Many successful startups are not planned. Projects attract like-minded individuals.

Another psychological benefit of side projects is that they are "fun." You are your own boss, and you can decide what to do without anyone stopping you. :)

How to write solid code?

· One min read

he likes it

  1. empathy / perspective-taking is the most important.

    1. realize that code is written for humans to read first and then for machines to execute.
    2. software is so "soft" and there are many ways to achieve one thing. It's all about making the proper trade-offs to fulfill the requirements.
    3. Invent and Simplify: Apple Pay RFID vs. WeChat Scan QR Code.
  2. choose a sustainable architecture to reduce human resources costs per feature.

  1. adopt patterns and best practices.

  2. avoid anti-patterns

    • missing error-handling
    • callback hell = spaghetti code + unpredictable error handling
    • over-long inheritance chain
    • circular dependency
    • over-complicated code
      • nested ternary operations
      • commented-out unused code
    • missing i18n, especially RTL issues
    • repeating yourself (violating "don't repeat yourself")
      • simple copy-and-paste
      • unreasonable comments
  3. effective refactoring

    • semantic versioning
    • never introduce breaking changes in non-major versions
      • two-legged change

Metrics for Measuring a Product

· 2 min read

Metrics Turn Vague Ideas into Great Ideas

  • A rough estimate is sufficient
  • Focus on how to evaluate a great product
  • Note, do not worry about the overall market size
    • Simply asking "Will it ultimately be large?" is enough

What Should We Measure About the Product?

  • Are users receiving value?
    • Active users
    • Profit
    • Transaction volume
    • Retention

Retention is Important

The chart below shows the retention rate over time for each annual cohort.

If the cohort retention rate keeps declining without a lower limit, your growth will be like a fire ring in a field that cannot sustain itself because

  • Old users will be burned out
  • New users will become fewer

You should keep the retention rate above a certain line.

Conversely, if the cohort retention rate increases over time, it indicates that your product is excellent. For example:

  1. Whatsapp
  2. Uber
  3. Facebook

Tom Tunguz: High Retention Means High Valuation

Growth

If you have high expectations and want to achieve exponential growth, then assess your weekly growth rate. Strive to keep this weekly growth rate steady, and you will experience exponential growth.
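
The arithmetic behind this is plain compounding. As a quick sketch, assuming a hypothetical steady 5% weekly growth rate (the rate is an illustration, not a number from the talk):

```python
def compound(start, weekly_rate, weeks):
    """Value after growing by the same rate every week."""
    return start * (1 + weekly_rate) ** weeks


# A steady 5% per week compounds to roughly 12.6x over 52 weeks.
yearly_multiple = compound(1.0, 0.05, 52)
```

A constant weekly rate is exactly what exponential growth means; the hard part is holding the rate steady as the base gets larger.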

Iteration Speed

  • Smaller companies have an advantage in iteration speed
  • Metrics are key to guiding the direction of iterations
  • Measure the speed of the iterations themselves

How Much Should You Value Metrics?

  1. Not enough emphasis
    • Easy to deceive oneself and hit a wall
  2. Very high emphasis
    • Your decisions will be cautious
  3. Overemphasis
    • You increase numbers but do not add value

Reality Can Be Painful

  • Metrics can shatter illusions, and sometimes it can be a bit painful
  • Most leaders attract subordinates through illusions
  • You should
    • Learn from the past
    • Clarify the present
    • Smile at the future

Why Smart People Have Bad Ideas

· 3 min read

Paul Graham and Robert Morris founded their first startup, which put art exhibitions online, when they were around 30, and it ultimately turned out to be a bad idea. If that can happen at 30, it is even more likely for founders in their 20s.

Why do smart people come up with bad business ideas?

  1. Bad ideas are products of intuition.
    1. People often think of what to do based on what they see; however, if you plan to work on something for several years, you should weigh several different ideas beforehand.
    2. The most tragic part is the subsequent IKEA effect: despite being a bad idea, because you invested effort into it, you will grow fond of the idea.
    3. What to do? Getting started quickly is good, but you must recognize that spending time does not necessarily improve the situation. Keep asking yourself: Would anyone be willing to pay for it?
  2. Bad ideas prioritize being impressive over making money.
    1. If you want to make money, don’t shy away from dirty work: the dirtier and more laborious the work, the fewer people are willing to do it, leading to a tighter supply-demand relationship; if you take it on, your bargaining power increases.
    2. Since making money is hard, your primary goal should be to make money; otherwise, you won't make any.
  3. Bad ideas yield to fear of competition.
    1. Competition may not be as fierce as the media portrays; they often lack technical knowledge and coding skills.
    2. Why do we fear? Understanding programming but not business? In reality, the term "business" is too broad and abstract, obscuring the concrete issues below: selling products, promotions, understanding what customers want, pricing, customer support, billing, etc. Once you tackle these details, you have resolved "business."
  4. Bad ideas are too superficial.
    1. Most ideas are hybrids of blogs, calendars, dating sites, and social networks. Meanwhile, there are better, unsolved problems in the distance with high demand, yet no one is addressing them.
    2. Why is it difficult for people to research what customers truly want? The existing education system teaches people how to solve problems but fails to teach them how to ask questions.

Understanding what users want is challenging, as it requires recognizing that you need to invest time and effort into this task. Unsure how to start? The answer lies in Carnegie's classic "How to Win Friends and Influence People" — the answer is empathy, or putting yourself in others' shoes.

Fortunately, the ability to conduct user research can be learned; "smart creators" and "user research" are a perfect match that can unleash endless potential.

In a nutshell: Create products and services that the public loves and enjoys.

Introduction to Beancount.io

· 7 min read

Why bookkeeping?

Everyone has advice about how to manage money. Search Google for "manage money," and you'll get back over 1,690,000,000 links. You'll find tons of life-hacking or self-help articles and books. You'll find professional coaches or courses which will coach you for a fee. You'll find financial and investment services. Feel free to try what appeals to you and grow your assets through trial and error.

I think the most important thing to remember is that asking the question of money comes from fear and self-doubt. We all fear change. We all doubt our ability to make more money.

Instead of spending time worrying and doubting, focus on the opposite: your confidence. If you are playing poker with few chips, you can only make small bets and win small amounts of money. When you have a lot of chips, you can make big bets and win big. You have more room for taking risks. You can try things which you cannot try when you have fewer chips.

Here is the magic - by understanding more of your financial status, you gain confidence! With more confidence, you can make better judgments, bet the right amount for more significant success, and then win more.

Expenses

Know your expenses and plan for next spending

Where does the concern about winning more end? People often talk about the buzzword "financial freedom." Talk is cheap, but bookkeeping answers the question precisely.

Four main financial statements for the overview of financial status

Unfortunately, bookkeeping is not easy in our modern life. We are in a new age of abundance. We have a lot of accounts - cash, bank accounts, payment apps, credit cards, stock or crypto broker accounts, discount cards, … We have assets like houses, cars, gold, jewelry, … To make things even worse, some of us may live across countries and have to deal with different currencies. How could we draw an accurate map of our financial life and navigate through the future uncertainties?

By "accurate map of our financial life," I mean these four primary financial statements:

  1. Income Statement: It shows how much revenue we earned over a specific period. This statement is usually considered the most important of the financial statements because it reflects the operating results.
  2. Balance Sheet: It shows how much in assets, liabilities, and equity we have. This statement is the second most important because it reports the liquidity and capitalization of our assets.
  3. Cash Flow Statement: It reports our inflows and outflows of cash and answers whether we generated cash. We need enough cash on hand to pay expenses and purchase assets.
  4. Equity Statement: This is not helpful for your personal accounting. However, for a company, this statement reports how it distributes equity among stakeholders.

Income Statement

Balance Sheet

With beancount.io, you can quickly generate statements like the above. But wait… How to prepare data for these statements?

Double-entry Bookkeeping for Correctness

To ensure accuracy and build error detection into the system, double-entry bookkeeping requires that every entry to an account has at least one corresponding entry to a different account. One transaction involves at least two accounts with two operations - debit (+) and credit (-).

1970-01-01 open Income:BeancountCorp
1970-01-01 open Assets:Cash
1970-01-01 open Expenses:Food
1970-01-01 open Assets:Receivables:Alice
1970-01-01 open Assets:Receivables:Bob
1970-01-01 open Assets:Receivables:Charlie
1970-01-01 open Liabilities:CreditCard

2019-05-31 * "BeancountCorp" "Salary of May 15th to May 31st"
  Income:BeancountCorp -888 USD
  Assets:Cash 888 USD

2019-07-12 * "Popeyes chicken sandwiches" "dinner with Alice, Bob, and Charlie"
  Expenses:Food 20 USD
  Assets:Receivables:Alice 20 USD
  Assets:Receivables:Bob 20 USD
  Assets:Receivables:Charlie 20 USD
  Liabilities:CreditCard -80 USD

As you can see in the two examples above, every transaction must fulfill the accounting equation.

Assets = Liabilities + Equity (aka Net Assets)

We used the Beancount syntax by Martin Blais and the web project Fava by Jakob Schnitzer to build this website, and it will alert you if the legs of any transaction do not sum to zero.
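
The zero-sum check itself is easy to picture. The sketch below is not Beancount's or Fava's code; it is a simplified illustration that sums postings per currency and reports any currency that does not balance, with the posting tuples and the helper name `unbalanced_currencies` being assumptions made for this example.

```python
from collections import defaultdict
from decimal import Decimal


def unbalanced_currencies(postings):
    """Return {currency: residual} for every currency whose legs do not sum to zero."""
    totals = defaultdict(Decimal)
    for _account, amount, currency in postings:
        totals[currency] += Decimal(amount)
    return {cur: total for cur, total in totals.items() if total != 0}


# The dinner transaction from above as (account, amount, currency) tuples.
dinner = [
    ("Expenses:Food", "20", "USD"),
    ("Assets:Receivables:Alice", "20", "USD"),
    ("Assets:Receivables:Bob", "20", "USD"),
    ("Assets:Receivables:Charlie", "20", "USD"),
    ("Liabilities:CreditCard", "-80", "USD"),
]
problems = unbalanced_currencies(dinner)  # empty dict: the transaction balances
```

Using `Decimal` rather than floats matters here, because binary floating point can leave tiny residuals that would trigger false alerts.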

Error Alert

Now you understand how we enforce the correctness of the ledger. But you may ask: what are those "accounts"?

Accounts for money as buckets for water

Think of your assets as water running in and out of different buckets; "accounts" are those buckets holding your money. With double-entry bookkeeping, it becomes obvious how money is flowing across different accounts, just like how water is flowing across different buckets.

Beancount.io introduces five kinds of accounts.

  1. Income: Its amount is always negative, i.e. in credit. This is because you are making money: the money flows out of the "Income" account (a credit) and into your "Assets" (a debit).
  2. Expenses: Its amount is always positive, i.e. in debit. This is because you are spending money, and the money is flowing from the "Assets" or "Liabilities" to the "Expenses."
  3. Liabilities: Its amount is negative or zero in this signed notation, as in the -80 USD credit card posting above. Your credit card liabilities are a good example, which rise and fall in cycles.
  4. Assets: Its amount is positive or zero. Your cash or houses are always worth something.
  5. Equity: Your net assets. The system will calculate it automatically for you. Equity = Assets - Liabilities (taking liabilities as an absolute amount), and it reflects how wealthy you are.

Now you can open your customized accounts with those keywords above:

1970-01-01 open Assets:Cash
1970-01-01 open Assets:Stock:Robinhood
1970-01-01 open Assets:Crypto:Coinbase
1970-01-01 open Expenses:Transportation:Taxi
1970-01-01 open Equity:OpeningBalance

Commodities: Tracking your investment

Yes, you can track your investment with beancount.io. For example, we buy 10 Bitcoins at the price of $100 each in 2014:

2014-08-08 * "Buy 10 Bitcoin"
  Assets:Trade:Cash -1000.00 USD
  Assets:Trade:Positions 10 BTC {100.00 USD}

And then three years later, you sell 2 of them (originally bought at a cost of $100 per unit, annotated with {100.00 USD}) at the price of $10,000 per unit, annotated with @ 10,000.00 USD.

2017-12-12 * "Sell 2 Bitcoin"
  Assets:Trade:Positions -2 BTC {100.00 USD} @ 10,000.00 USD
  Assets:Trade:Cash 20,000.00 USD
  Income:Trade:PnL -19,800.00 USD

Or write the same transaction with @@ 20,000.00 USD, which means a price of $20,000 in total.

2017-12-12 * "Sell 2 Bitcoin"
  Assets:Trade:Positions -2 BTC {100.00 USD} @@ 20,000.00 USD
  Assets:Trade:Cash 20,000.00 USD
  Income:Trade:PnL -19,800.00 USD

The sum of all legs of the transaction, including -2 BTC {100.00 USD}, is still, as always, zero.

The cost tag {100.00 USD} is important because you might have bought the same commodity at different costs.

100 BTC {10.00 USD, 2012-08-08}
10 BTC {100.00 USD, 2014-08-08}

If you want to simplify the process, you can set up the account at the beginning with FIFO or LIFO. FIFO stands for first in, first out, while LIFO stands for last in, first out. In the US, the IRS uses FIFO to calculate your PnL and tax accordingly.

1970-01-01 open Assets:Trade:Positions "FIFO"

And then when you sell it in shorthand like -2 BTC {}, Beancount will apply the FIFO strategy automatically and sell the oldest lots first.
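
To see what FIFO booking does to the lots, here is a small sketch of the idea. It is not Beancount's booking code; `sell_fifo` and the (units, unit cost) lot tuples are hypothetical, and the two lots mirror the 2012 and 2014 purchases listed above.

```python
from collections import deque
from decimal import Decimal


def sell_fifo(lots, units_to_sell):
    """Reduce the oldest lots first; return the (units, unit_cost) pairs that were sold."""
    sold = []
    remaining = Decimal(units_to_sell)
    while remaining > 0:
        if not lots:
            raise ValueError("not enough units to sell")
        units, cost = lots[0]
        take = min(units, remaining)
        sold.append((take, cost))
        remaining -= take
        if take == units:
            lots.popleft()                   # the oldest lot is fully consumed
        else:
            lots[0] = (units - take, cost)   # shrink the oldest lot
    return sold


# Oldest lot first: 100 BTC bought at 10.00 USD, then 10 BTC bought at 100.00 USD.
lots = deque([(Decimal("100"), Decimal("10.00")), (Decimal("10"), Decimal("100.00"))])
sold = sell_fifo(lots, "2")
cost_basis = sum(units * cost for units, cost in sold)  # 2 * 10.00 = 20.00 USD
```

Under FIFO the two units come out of the 10.00 USD lot, so the cost basis is 20.00 USD rather than the 200.00 USD it would be if the newer lot were used, and the realized PnL changes accordingly.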

Beancount.io

Beancount.io is a cloud service for recording your financial transactions in text files, visualizing them as financial statements (income statement, balance sheet, trial balance, etc.), and helping you live a better financial life. Sign up now - It's in Promotional Period and Free!

The Company's Technology and Market Quadrant Diagram and Gravitational Directions

· 4 min read

[Quadrant diagram. Axes: Technology, Market. Quadrants: Leaders, The Innovator's Dilemma, Challengers, Dragonslayers.]

  • The gravitational direction of technology is towards mediocrity: as technology inflates, excellent technologies tend to become mediocre, and mediocre companies adopt excellent technologies.
  • The gravitational direction of the market is the Matthew effect: markets with a presence will grow larger, while those without will shrink.

Designing a metric system

· 13 min read

Requirements

Log vs. Metric: A log is an event that happened, and a metric is a measurement of the health of a system.

We are assuming that this system’s purpose is to serve metrics - namely, counters, conversion rate, timers, etc. for monitoring the system performance and health. If the conversion rate drops drastically, the system should alert the on-call.

  1. Monitoring business metrics like signup funnel’s conversion rate
  2. Supporting various queries, like on different platforms (IE/Chrome/Safari, iOS/Android/Desktop, etc.)
  3. data visualization
  4. Scalability and Availability

Architecture

Two ways to build the system:

  1. Push Model: Influx/Telegraf/Grafana
  2. Pull Model: Prometheus/Grafana

The pull model is more scalable because it decreases the number of requests going into the metrics database, so there is no hot write path and no write-concurrency issue.

[Diagram: InfluxDB Push Model. Components: Server Farm, telegraf, InfluxDB, Grafana. Edge labels: write, REST API.]

[Diagram: Prometheus Pull Model. Components: Application, client library, 3rd Party Application, Exporter, Prometheus, Retrieval, Service Discovery, Storage, PromQL, Alertmanager, Web UI / Grafana / API Clients, PagerDuty, Email. Edge label: pull.]

Features and Components

Measuring Sign-up Funnel

Take a four-step sign-up flow on the mobile app as an example:

INPUT_PHONE_NUMBER -> VERIFY_SMS_CODE -> INPUT_NAME -> INPUT_PASSWORD

Every step has IMPRESSION and POST_VERIFICATION phases and emits metrics like this:

{
"sign_up_session_id": "uuid",
"step": "VERIFY_SMS_CODE",
"os": "iOS",
"phase": "POST_VERIFICATION",
"status": "SUCCESS",
// ... ts, contexts, ...
}

Consequently, we can query the overall conversion rate of the VERIFY_SMS_CODE step on iOS like this:

(count of step=VERIFY_SMS_CODE, os=iOS, status=SUCCESS, phase=POST_VERIFICATION) / (count of step=VERIFY_SMS_CODE, os=iOS, phase=IMPRESSION)
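
The ratio above can be sketched over raw events as follows. This assumes the events are available as a list of dictionaries shaped like the emitted metric; in a real deployment the same ratio would be expressed in the metrics store's query language, and `conversion_rate` is a hypothetical helper.

```python
def conversion_rate(events, step, os_name):
    """Successful POST_VERIFICATION events divided by IMPRESSION events for one step and OS."""
    impressions = 0
    successes = 0
    for event in events:
        if event["step"] != step or event["os"] != os_name:
            continue
        if event["phase"] == "IMPRESSION":
            impressions += 1
        elif event["phase"] == "POST_VERIFICATION" and event.get("status") == "SUCCESS":
            successes += 1
    if impressions == 0:
        return None  # no traffic, so the rate is undefined
    return successes / impressions


def make_event(phase, status=None, os_name="iOS"):
    # Builds a sample event; the session id is a placeholder.
    return {"sign_up_session_id": "uuid", "step": "VERIFY_SMS_CODE",
            "os": os_name, "phase": phase, "status": status}


# Four impressions on iOS, three successful verifications, one failure, plus Android noise.
events = ([make_event("IMPRESSION")] * 4
          + [make_event("POST_VERIFICATION", "SUCCESS")] * 3
          + [make_event("POST_VERIFICATION", "FAILURE")]
          + [make_event("IMPRESSION", os_name="Android")])
rate = conversion_rate(events, "VERIFY_SMS_CODE", "iOS")  # 3 / 4
```

Returning `None` when there are no impressions keeps an alerting rule from treating "no traffic" as "conversion dropped to zero".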

Data Visualization

Grafana is mature enough for the data visualization work. If you do not want to expose the whole site, you can embed a panel with an iframe.

Designing Square Cash or PayPal Money Transfer System

· 19 min read

Clarifying Requirements

Design a money transfer backend system like Square Cash (we will call this system Cash App below) or PayPal that supports:

  1. Deposit from and payout to bank
  2. Transfer between accounts
  3. High scalability and availability
  4. i18n: language, timezone, currency exchange
  5. Deduplication for non-idempotent APIs and for at-least-once delivery.
  6. Consistency across multiple data sources.

Architecture

[Architecture diagram. Labeled components: AWS CloudHSM, Presentation Layer, SDK/Docs, mobile-dashboard, web-dashboard, dashboard-client, mobile-wallet, web-wallet, wallet-client, web-chrome-extension, Merchant User, End User, Operators, api-gateway (monolithic), payment, task-queue, financial-reporter, payment-gateway, banks / vendors, side-effect maker, help service portal, risk control, User Profiles, AuthDB, Payment DB, Aurora, Event Queue.]

Features and Components

Payment Service

The payment data model is essentially “double-entry bookkeeping”. Every entry to an account requires a corresponding and opposite entry to a different account. The sum of all debits and credits equals zero.

Deposit and Payout

Transaction: new user Jane Doe deposits $100 from bank to Cash App. This one transaction involves those DB entries:

bookkeeping table (for history)

+ debit, USD, 100, CashAppAccountNumber, txId
- credit, USD, 100, RoutingNumber:AccountNumber, txId

transaction table

txId, timestamp, status(pending/confirmed), [bookkeeping entries], narration

Once the bank confirms the transaction, update the pending status above and the following balance sheet in one transaction.

balance sheet

CashAppAccountNumber, USD, 100

Transfer between accounts within Cash App

Similar to the case above, but there is no pending state because we do not need to wait for a slow external system to change its state. All changes to the bookkeeping table, transaction table, and balance sheet table happen in one transaction.
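
A minimal sketch of such an internal transfer, using SQLite only because it ships with Python (the architecture diagram names Aurora as the payment database). The table layout, column names, and the `transfer` helper are assumptions for illustration, and amounts are stored as integer cents.

```python
import sqlite3


def transfer(conn, tx_id, src, dst, currency, amount):
    """Write the transaction row, both bookkeeping legs, and both balances atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO transactions (tx_id, status) VALUES (?, 'confirmed')", (tx_id,))
        conn.executemany(
            "INSERT INTO bookkeeping (tx_id, account, currency, amount) VALUES (?, ?, ?, ?)",
            [(tx_id, dst, currency, amount), (tx_id, src, currency, -amount)])
        debit = conn.execute(
            "UPDATE balances SET amount = amount - ? "
            "WHERE account = ? AND currency = ? AND amount >= ?",
            (amount, src, currency, amount))
        if debit.rowcount != 1:
            raise ValueError("insufficient funds or unknown source account")
        credit = conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ? AND currency = ?",
            (amount, dst, currency))
        if credit.rowcount != 1:
            raise ValueError("unknown destination account")


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (tx_id TEXT PRIMARY KEY, status TEXT);
CREATE TABLE bookkeeping (tx_id TEXT, account TEXT, currency TEXT, amount INTEGER);
CREATE TABLE balances (account TEXT, currency TEXT, amount INTEGER,
                       PRIMARY KEY (account, currency));
INSERT INTO balances VALUES ('jane', 'USD', 10000), ('john', 'USD', 0);
""")
transfer(conn, "tx-1", "jane", "john", "USD", 2500)  # move 25.00 USD
```

Because everything sits inside one database transaction, a failed balance check undoes the bookkeeping and transaction rows too, so the three tables cannot drift apart.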

i18n

We solve the i18n problems in 3 dimensions.

  1. Language: All texts, such as copy, push notifications, and emails, are selected according to the accept-language header.
  2. Timezones: All server times are in UTC. We convert timestamps to the local timezone on the client side.
  3. Currency: Every transfer between users must be in a single currency. If users want to move money across currencies, they have to exchange the currency first, at a rate that is favorable to the Cash App.

For example, Jane Doe wants to exchange 1 USD with 6.8 CNY with 0.2

bookkeeping table

- credit, USD, 1, CashAppAccountNumber, txId
+ debit, CNY, 6.8, CashAppAccountNumber, txId, @7.55 CNY/USD
+ debit, USD, 0.1, ExpensesOfExchangeAccountNumber, txId

Transaction table, balance sheet, etc. are similar to the transaction discussed in Deposit and Payout. The major difference is that the bank or the vendor provides the exchange service.

How to sync across the transaction table and external banks and vendors?

  • retry with idempotency to improve the success rate of the external calls and ensure no duplicate orders.
  • two ways to check if the PENDING orders are filled or failed.
    1. poll: cronjobs (SWF, Airflow, Cadence, etc.) to poll the status for PENDING orders.
    2. callback: provide a callback API for the external vendors.
  • Graceful shutdown. The bank gateway calls may take tens of seconds to finish, and restarting the servers may resume unfinished transactions from the database. The process may create too many connections. To reduce connections, before the shutdown, stop accepting new requests and wait for the existing outgoing ones to wrap up.
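
The "retry with idempotency" point can be sketched like this. `FakeBankGateway` is a made-up stand-in for an external vendor that honors idempotency keys, and it simulates timeouts that occur after the order has already been recorded; real gateways differ in how the key is passed, and a production client would also back off between attempts.

```python
import uuid


class FakeBankGateway:
    """Hypothetical external gateway that deduplicates requests by idempotency key."""

    def __init__(self, failures_before_success):
        self.failures_left = failures_before_success
        self.orders = {}  # idempotency_key -> order record

    def create_order(self, idempotency_key, amount_cents):
        if idempotency_key not in self.orders:
            self.orders[idempotency_key] = {
                "order_id": f"order-{len(self.orders) + 1}",
                "amount_cents": amount_cents,
            }
        if self.failures_left > 0:
            # Simulate a timeout that happens after the order was recorded.
            self.failures_left -= 1
            raise TimeoutError("gateway timed out")
        return self.orders[idempotency_key]


def create_order_with_retry(gateway, amount_cents, max_attempts=3):
    key = str(uuid.uuid4())  # one key shared by every attempt of this logical request
    for attempt in range(1, max_attempts + 1):
        try:
            return gateway.create_order(key, amount_cents)
        except TimeoutError:
            if attempt == max_attempts:
                raise


gateway = FakeBankGateway(failures_before_success=2)
order = create_order_with_retry(gateway, 10000)
```

The key point is that the key is generated once per logical request, not once per attempt; otherwise each retry would look like a brand-new order to the gateway.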

Deduplication

Why is Deduplication a concern?

  1. not all endpoints are idempotent
  2. Event queue may be at-least-once.

not all endpoints are idempotent: what if the external system is not idempotent?

For the poll case above, if the external gateway does not support idempotent APIs, then to avoid flooding it with duplicate entries we must record the order ID or reference ID that the external system returns with its 200 response, and query the order with GET by that ID instead of sending POST every time.

For the callback case, we can make sure our own callback API is idempotent, since it only mutates pending to confirmed anyway.

Event queue may be at-least-once

  • For the event queue, we can use Kafka with exactly-once semantics; the producer throughput declines by only about 3%.
  • In the database layer, we can use idempotency key or deduplication key.
  • In the service layer, we can use Redis key-value store.
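
As a sketch of the deduplication-key idea on the consumer side, the class below ignores redelivered events. The in-memory set stands in for whatever shared store is used (a Redis key or a unique database column), and the event shape with a `dedup_key` field is an assumption made for this example.

```python
class DedupingConsumer:
    """Applies each event at most once, keyed by a deduplication key."""

    def __init__(self):
        self.seen = set()        # stand-in for Redis or a unique DB constraint
        self.balance_cents = 0

    def handle(self, event):
        key = event["dedup_key"]
        if key in self.seen:
            return False         # duplicate delivery: skip the side effect
        self.seen.add(key)
        self.balance_cents += event["amount_cents"]
        return True


consumer = DedupingConsumer()
deliveries = [
    {"dedup_key": "tx-1", "amount_cents": 500},
    {"dedup_key": "tx-2", "amount_cents": 300},
    {"dedup_key": "tx-1", "amount_cents": 500},  # redelivered by an at-least-once queue
]
results = [consumer.handle(event) for event in deliveries]
```

In a real service the "have I seen this key" check and the side effect need to be atomic, for example by writing the key under a unique constraint in the same database transaction as the balance change.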

Availability and Scalability