Skip to main content

49 posts tagged with "system-design"

View all tags

How to Design the Architecture of a Blockchain Server?

· 7 min read

Requirement Analysis

  • A distributed blockchain accounting and smart contract system
  • Nodes have minimal trust in each other but need to be incentivized to cooperate
    • Transactions are irreversible
    • Do not rely on trusted third parties
    • Protect privacy, disclose minimal information
    • Do not rely on centralized authority to prove that money cannot be spent twice
  • Assuming performance is not an issue, we will not consider how to optimize performance

Architecture Design

Specific Modules and Their Interactions

Base Layer (P2P Network, Cryptographic Algorithms, Storage)

P2P Network

There are two ways to implement distributed systems:

  • Centralized lead/follower distributed systems, such as Hadoop and Zookeeper, which have a simpler structure but high requirements for the lead
  • Decentralized peer-to-peer (P2P) network distributed systems, such as those organized by Chord, CAN, Pastry, and Tapestry algorithms, which have a more complex structure but are more egalitarian

Given the premise that nodes have minimal trust in each other, we choose the P2P form. How do we specifically organize the P2P network? A typical decentralized node and network maintain connections as follows:

  1. Based on the IP protocol, nodes come online occupying a certain address hostname/port, broadcasting their address using an initialized node list, and trying to flood their information across the network using these initial hops.
  2. The initial nodes receiving the broadcast save this neighbor and help with flooding; non-adjacent nodes, upon receiving it, use NAT to traverse walls and add neighbors.
  3. Nodes engage in anti-entropy by randomly sending heartbeat messages containing the latest information similar to vector clocks, ensuring they can continuously update each other with their latest information.

We can use existing libraries, such as libp2p, to implement the network module. For the choice of network protocols, see Crack the System Design Interview: Communication.

Cryptographic Algorithms

In a distributed system with minimal trust, how can a transfer be proven to be initiated by oneself without leaking secret information? Asymmetric encryption: a pair of public and private keys corresponds to "ownership." Bitcoin chooses the secp256k1 parameters of the ECDSA elliptic curve cryptographic algorithm, and for compatibility, other chains also generally choose the same algorithm.

Why not directly use the public key as the address for the transfer? Privacy concerns; the transaction process should disclose as little information as possible. Using the hash of the public key as the "address" can prevent the recipient from leaking the public key. Furthermore, people should avoid reusing the same address.

Regarding account ledgers, there are two implementation methods: UTXO vs. Account/Balance

  • UTXO (unspent transaction output), such as Bitcoin, resembles double-entry bookkeeping with credits and debits. Each transaction has inputs and outputs, but every input is linked to the previous output except for the coinbase. Although there is no concept of an account, taking all unspent outputs corresponding to an address gives the balance of that address.
    • Advantages
      • Precision: The structure similar to double-entry bookkeeping allows for very accurate recording of all asset flows.
      • Privacy protection and resistance to quantum attacks: If users frequently change addresses.
      • Stateless: Leaves room for improving concurrency.
      • Avoids replay attacks: Because replaying will not find the corresponding UTXO for the input.
    • Disadvantages
      • Records all transactions, complex, consumes storage space.
      • Traversing UTXOs takes time.
  • Account/Balance, such as Ethereum, has three main maps: account map, transaction map, transaction receipts map. Specifically, to reduce space and prevent tampering, it uses a Merkle Patricia Trie (MPT).
    • Advantages
      • Space-efficient: Unlike UTXO, a transaction connects multiple UTXOs.
      • Simplicity: Complexity is offloaded to the script.
    • Disadvantages
      • Requires using nonce to solve replay issues since there is no dependency between transactions.

It is worth mentioning that the "block + chain" data structure is essentially an append-only Merkle tree, also known as a hash tree.

Storage

Since UTXO or MPT structures serve as indexes, and to simplify operations for each node in a distributed environment, data persistence typically favors in-process databases that can run directly with the node's program, such as LevelDB or RocksDB.

Because these indexes are not universal, you cannot query them like an SQL database, which raises the barrier for data analysis. Optimizations require a dedicated indexing service, such as Etherscan.

Protocol Layer

Now that we have a functional base layer, we need a more general protocol layer for logical operations above this layer. Depending on the blockchain's usage requirements, specific logical processing modules can be plugged in and out like a microkernel architecture.

For instance, the most common accounting: upon receiving some transactions at the latest block height, organize them to establish the data structure as mentioned in the previous layer.

Writing a native module for each business logic and updating all nodes' code is not very realistic. Can we decouple this layer using virtualization? The answer is a virtual machine capable of executing smart contract code. In a non-trusting environment, we cannot allow clients to execute code for free, so the most unique feature of this virtual machine may be billing.

The difference between contract-based tokens, such as ERC20, and native tokens leads to complications when dealing with different tokens, resulting in the emergence of Wrapped Ether tokens.

Consensus Layer

After the protocol layer computes the execution results, how do we reach consensus with other nodes? There are several common mechanisms to incentivize cooperation:

  • Proof of Work (POW): Mining tokens through hash collisions, which is energy-intensive and not environmentally friendly.
  • Proof of Stake (POS): Mining tokens using staked tokens.
  • Delegated Proof-of-Stake (DPOS): Electing representatives to mine tokens using staked tokens.

Based on the incentive mechanism, the longest chain among nodes is followed; if two groups dislike each other, a fork occurs.

Additionally, there are consensus protocols that help everyone reach agreement (i.e., everyone either does something together or does nothing together):

  • 2PC: Everyone relies on a coordinator: the coordinator asks everyone: should we proceed? If anyone replies no, the coordinator tells everyone "no"; otherwise, everyone proceeds. This dependency can lead to issues if the coordinator fails in the middle of the second phase, leaving some nodes unsure of what to do with the block, requiring manual intervention to restart the coordinator.
  • 3PC: To solve the above problem, an additional phase is added to ensure everyone knows whether to proceed before doing so; if an error occurs, a new coordinator is selected.
  • Paxos: The above 2PC and 3PC both rely on a coordinator; how can we eliminate this coordinator? By using "the majority (at least f+1 in 2f + 1)" to replace it, as long as the majority agrees in two steps, consensus can be achieved.
  • PBFT (deterministic 3-step protocol): The fault tolerance of the above methods is still not high enough, leading to the development of PBFT. This algorithm ensures that the majority (2/3) of nodes either all agree or all disagree, implemented through three rounds of voting, with at least a majority (2/3) agreeing in each round before committing the block in the final round.

In practical applications, relational databases mostly use 2PC or 3PC; variants of Paxos include implementations in Zookeeper, Google Chubby distributed locks, and Spanner; in blockchain, Bitcoin and Ethereum use POW, while the new Ethereum uses POS, and IoTeX and EOS use DPOS.

API Layer

See Public API choices

Designing Human-Centric Internationalization (i18n) Engineering Solutions

· 9 min read

Requirement Analysis

If you ask what the biggest difference is between Silicon Valley companies and those in China, the answer is likely as Wu Jun said: Silicon Valley companies primarily target the global market. As Chen Zhiwu aptly stated, the ability to create wealth can be measured by three dimensions: depth, which refers to productivity—the ability to provide better products or services in the same amount of time; length, which refers to the ability to leverage finance to exchange value across time and space; and breadth, which refers to market size—the ability to create markets or new industries that transcend geographical boundaries. Internationalization, which is the localization of products and services in terms of language and culture, is indeed a strategic key for multinational companies competing in the global market.

Internationalization, abbreviated as i18n (with 18 letters between the 'i' and the 'n'), aims to solve the following issues in the development of websites and mobile apps:

  1. Language
  2. Time and Time Zones
  3. Numbers and Currency

Framework Design

Language

Logic and Details

The essence of language is as a medium for delivering messages to the audience; different languages serve as different media, each targeting different audiences. For example, if we want to display the message to the user: "Hello, Xiaoli!", the process involves checking the language table, determining the user's language, and the current required interpolation, such as the name, to display the corresponding message:

Message CodesLocalesTranslations
home.helloenHello, ${username}!
home.hellozh-CN你好, ${username}!
home.helloIW!${username}, שלום

Different languages may have slight variations in details, such as the singular and plural forms of an item, or the distinction between male and female in third-person references.

These are issues that simple table lookups cannot handle, requiring more complex logic processing. In code, you can use conditional statements to handle these exceptions. Additionally, some internationalization frameworks invent Domain-Specific Languages (DSL) to specifically address these situations. For example, The Project Fluent:

Another issue that beginners often overlook is the direction of writing. Common languages like Chinese and English are written from left to right, while some languages, such as Hebrew and Arabic, are written from right to left.

The difference in writing direction affects not only the text itself but also the input method. A Chinese person would find it very strange to input text from right to left; conversely, a Jewish colleague of mine finds it easy to mix English and Hebrew input.

Layout is another consideration. The entire UI layout and visual elements, such as the direction of arrows, may change based on the language's direction. Your HTML needs to set the appropriate dir attribute.

How to Determine the User's Locale?

You may wonder how we know the user's current language settings. In the case of a browser, when a user requests a webpage, there is a header called Accept-Language that indicates the accepted languages. These settings come from the user's system language and browser settings. In mobile apps, there is usually an API to retrieve the locale variable or constant. Another method is to determine the user's location based on their IP or GPS information and then display the corresponding language. For multinational companies, users often indicate their language preferences and geographical regions during registration.

If a user wants to change the language, websites have various approaches, while mobile apps tend to have more fixed APIs. Here are some methods for websites:

  1. Set a locale cookie
  2. Use different subdomains
  3. Use a dedicated domain. Pinterest has an article discussing how they utilize localized domains. Research shows that using local domain suffixes leads to higher click-through rates.
  4. Use different paths
  5. Use query parameters. While this method is feasible, it is not SEO-friendly.

Beginners often forget to mark the lang attribute in HTML when creating websites.

Translation Management Systems

Once you have carefully implemented the display of text languages, you will find that establishing and managing a translation library is also a cumbersome process.

Typically, developers do not have expertise in multiple languages. At this point, external translators or pre-existing translation libraries need to be introduced. The challenge here is that translators are often not technical personnel. Allowing them to directly modify code or communicate directly with developers can significantly increase translation costs. Therefore, in Silicon Valley companies, translation management systems (TMS) designed for translators are often managed by a dedicated team or involve purchasing existing solutions, such as the closed-source paid service lokalise.co or the open-source Mozilla Pontoon. A TMS can uniformly manage translation libraries, projects, reviews, and task assignments.

This way, the development process becomes: first, designers identify areas that need attention based on different languages and cultural habits during the design phase. For example, a button that is short in English may be very long in Russian, so care must be taken to avoid overflow. Then, the development team implements specific code logic based on the design requirements and provides message codes, contextual background, and examples written in a language familiar to developers in the translation management system. Subsequently, the translation team fills in translations for various languages in the management system. Finally, the development team pulls the translation library back into the codebase and releases it into the product.

Contextual background is an easily overlooked and challenging aspect. Where in the UI is the message that needs translation? What is its purpose? If the message is too short, further explanation may be needed. With this background knowledge, translators can provide more accurate translations in other languages. If translators cannot fully understand the intended message, they need a feedback channel to reach out to product designers and developers for clarification.

Given the multitude of languages and texts, it is rare for a single translator to handle everything; it typically requires a team of individuals with language expertise from various countries to contribute to the translation library. The entire process is time-consuming and labor-intensive, which is why translation teams are often established, such as outsourcing to Smartling.

Now that we have the code logic and translation library, the next question is: how do we integrate the content of the translation library into the product?

There are many different implementation methods; the most straightforward is a static approach where, each time an update occurs, a diff is submitted and merged into the code. This way, relevant translation materials are already included in the code during the build process.

Another approach is dynamic integration. On one hand, you can "pull" content from a remote translation library, which may lead to performance issues during high website traffic. However, the advantage is that translations are always up-to-date. On the other hand, for optimization, a "push" method can be employed, where any new changes in the translation library trigger a webhook to push the content to the server.

In my view, maintaining translations is more cumbersome than adding them. I have seen large projects become chaotic because old translations were not promptly removed after updates, leading to an unwieldy translation library. A good tool that ensures data consistency would greatly assist in maintaining clean code.

Alibaba's Kiwi internationalization solution has implemented a linter and VS Code plugin to help you check and extract translations from the code.

Time and Time Zones

Having discussed language, the next topic is time and time zones. As a global company, much of the data comes from around the world and is displayed to users globally. For example, how do international flights ensure that start and end times are consistent globally and displayed appropriately across different time zones? This is crucial. The same situation applies to all time-related events, such as booking hotels, reserving restaurants, and scheduling meetings.

First, there are several typical representations of time:

  1. Natural language, such as 07:23:01, Monday 28, October 2019 CST AM/PM
  2. Unix timestamp (Int type), such as 1572218668
  3. Datetime. Note that when MySQL stores datetime, it converts it to UTC based on the server's time zone and stores it, converting it back when reading. However, the server's time zone is generally set to UTC. In this case, the storage does not include time zone information, defaulting to UTC.
  4. ISO Date, such as 2019-10-27T23:24:28+00:00, which includes time zone information.

I have no strong preference for these formats; if you have relevant experience, feel free to discuss it.

When displaying time, two types of conversions may occur: one is converting the stored server time zone to the local time zone for display; the other is converting machine code to natural language. A popular approach for the latter is to use powerful libraries for handling time and dates, such as moment.js and dayjs.

Numbers and Currency

The display of numbers varies significantly across different countries and regions. The meaning of commas and periods in numbers differs from one country to another.

(1000.1)
.toLocaleString("en")(
// => "1,000.1"
1000.1,
)
.toLocaleString("de")(
// => "1.000,1"
1000.1,
)
.toLocaleString("ru");
// => "1 000,1"

Arabic numerals are not universally applicable; for instance, in Java's String.format, the digits 1, 2, 3 are represented as ١، ٢، ٣ in actual Arabic language.

Regarding pricing, should the same goods be displayed in local currency values in different countries? What is the currency symbol? How precise should the currency be? These questions must be addressed in advance.

Conclusion

The internationalization tools mentioned in this article include translation management systems, the open-source Mozilla Pontoon, the closed-source paid service lokalise.co, POEditor.com, and so on. For code consistency, Alibaba's Kiwi internationalization solution is recommended. For UI display, consider using moment.js and day.js.

Like all software system development, there is no silver bullet for internationalization; great works are crafted through foundational skills honed over time.

Designing a Load Balancer

· 4 min read

Requirements Analysis

Internet services often need to handle traffic from around the world, but a single server can only serve a limited number of requests at the same time. Therefore, we typically have a server cluster to collectively manage this traffic. The question arises: how can we evenly distribute this traffic across different servers?

From the user to the server, there are many nodes and load balancers at different levels. Specifically, our design requirements are:

  • Design a Layer 7 load balancer located internally in the data center.
  • Utilize real-time load information from the backend.
  • Handle tens of millions of requests per second and a throughput of 10 TB per second.

Note: If Service A depends on Service B, we refer to A as the downstream service and B as the upstream service.

Challenges

Why is load balancing difficult? The answer lies in the challenge of collecting accurate load distribution data.

Count-based Distribution ≠ Load-based Distribution

The simplest approach is to distribute traffic randomly or in a round-robin manner based on the number of requests. However, the actual load is not calculated based on the number of requests; for example, some requests are heavy and CPU-intensive, while others are lightweight.

To measure load more accurately, the load balancer must maintain some local state—such as the current number of requests, connection counts, and request processing delays. Based on this state, we can employ appropriate load balancing algorithms—least connections, least latency, or random N choose one.

Least Connections: Requests are directed to the server with the fewest current connections.

Least Latency: Requests are directed to the server with the lowest average response time and fewest connections. Servers can also be weighted.

Random N Choose One (N is typically 2, so we can also refer to it as the power of two choices): Randomly select two servers and choose the better of the two, which helps avoid the worst-case scenario.

Distributed Environment

In a distributed environment, local load balancers struggle to understand the complete state of upstream and downstream services, including:

  • Load of upstream services
  • Upstream services can be very large, making it difficult to select an appropriate subset for the load balancer
  • Load of downstream services
  • The specific processing time for different types of requests is hard to predict

Solutions

There are three approaches to accurately collect load information and respond accordingly:

  • A centralized balancer that dynamically manages based on the situation
  • Distributed balancers that share state among them
  • Servers return load information along with requests, or the balancer actively queries the servers

Dropbox chose the third approach when implementing Bandai, as it adapted well to the existing random N choose one algorithm.

However, unlike the original random N choose one algorithm, this approach does not rely on local state but instead uses real-time results returned by the servers.

Server Utilization: Backend servers set a maximum load, track current connections, and calculate utilization, ranging from 0.0 to 1.0.

Two issues need to be considered:

  1. Error Handling: If fail fast, the quick processing may attract more traffic, leading to more errors.
  2. Data Decay: If a server's load is too high, no requests will be sent there. Therefore, using a decay function similar to a reverse S-curve ensures that old data is purged.

Result: Requests Received by Servers are More Balanced

Lyft's Marketing Automation Platform Symphony

· 3 min read

Customer Acquisition Efficiency Issue: How can advertising campaigns achieve higher returns with less money and fewer people?

Specifically, Lyft's advertising campaigns need to address the following characteristics:

  1. Manage location-based campaigns
  2. Data-driven growth: growth must be scalable, measurable, and predictable
  3. Support Lyft's unique growth model, as shown below:

lyft growth model

The main challenge is the difficulty of scaling management across various aspects of regional marketing, including ad bidding, budgeting, creative assets, incentives, audience selection, testing, and more. The following image depicts a day in the life of a marketer:

A Day in the Life of a Marketer

We can see that "execution" takes up most of the time, while less time is spent on the more important tasks of "analysis and decision-making." Scaling means reducing complex operations and allowing marketers to focus on analysis and decision-making.

Solution: Automation

To reduce costs and improve the efficiency of experimentation, it is necessary to:

  1. Predict whether new users are interested in the product
  2. Optimize across multiple channels and effectively evaluate and allocate budgets
  3. Conveniently manage thousands of campaigns

Data is enhanced through Lyft's Amundsen system using reinforcement learning.

The automation components include:

  1. Updating bid keywords
  2. Disabling underperforming creative assets
  3. Adjusting referral values based on market changes
  4. Identifying high-value user segments
  5. Sharing strategies across multiple campaigns

Architecture

Lyft Symphony Architecture

Technology stack: Apache Hive, Presto, ML platform, Airflow, 3rd-party APIs, UI.

Specific Component Modules

LTV Prediction Module

The lifetime value (LTV) of users is an important metric for evaluating channels, and the budget is determined by both LTV and the price we are willing to pay for customer acquisition in that region.

Our understanding of new users is limited, but as interactions increase, the historical data provided will more accurately predict outcomes.

Initial feature values:

Feature Values

As historical interaction records accumulate, the predictions become more accurate:

Predicting LTV Based on Historical Records

Budget Allocation Module

Once LTV is established, the next step is to set the budget based on pricing. A curve of the form LTV = a * (spend)^b is fitted, along with similar parameter curves in the surrounding range. Achieving a global optimum requires some randomness.

Budget Calculation

Delivery Module

This module is divided into two parts: the parameter tuner and the executor. The tuner sets specific parameters based on pricing for each channel, while the executor applies these parameters to the respective channels.

There are many popular delivery strategies that are common across various channels:

Delivery Strategies

Conclusion

It is essential to recognize the importance of human experience within the system; otherwise, it results in garbage in, garbage out. When people are liberated from tedious delivery tasks and can focus on understanding users, channels, and the messages they need to convey to their audience, they can achieve better campaign results—spending less time to achieve higher ROI.

How to write solid code?

· One min read

he likes it

  1. empathy / perspective-taking is the most important.

    1. realize that code is written for human to read first and then for machines to execute.
    2. software is so "soft" and there are many ways to achieve one thing. It's all about making the proper trade-offs to fulfill the requirements.
    3. Invent and Simplify: Apple Pay RFID vs. Wechat Scan QR Code.
  2. choose a sustainable architecture to reduce human resources costs per feature.

  1. adopt patterns and best practices.

  2. avoid anti-patterns

    • missing error-handling
    • callback hell = spaghetti code + unpredictable error handling
    • over-long inheritance chain
    • circular dependency
    • over-complicated code
      • nested ternary operation
      • comment out unused code
    • missing i18n, especially RTL issues
    • don't repeat yourself
      • simple copy-and-paste
      • unreasonable comments
  3. effective refactoring

    • semantic version
    • never introduce breaking change to non major versions
      • two legged change

Introduction to Architecture

· 3 min read

What is Architecture?

Architecture is the shape of a software system. To illustrate with a building:

  • Paradigm is the bricks.
  • Design principles are the rooms.
  • Components are the structure.

They all serve a specific purpose, just like hospitals treat patients and schools educate students.

Why Do We Need Architecture?

Behavior vs. Structure

Every software system provides two distinct values to stakeholders: behavior and structure. Software developers must ensure that both values are high.

==Due to the nature of their work, software architects focus more on the structure of the system rather than its features and functions.==

Ultimate Goal — ==Reduce the human resource costs required for adding new features==

Architecture serves the entire lifecycle of software systems, making them easy to understand, develop, test, deploy, and operate. Its goal is to minimize the human resource costs for each business use case.

O'Reilly's "Software Architecture" provides a great introduction to these five fundamental architectures.

1. Layered Architecture

Layered architecture is widely adopted and well-known among developers. Therefore, it is the de facto standard at the application level. If you are unsure which architecture to use, layered architecture is a good choice.

Examples:

  • TCP/IP model: Application Layer > Transport Layer > Internet Layer > Network Interface Layer
  • Facebook TAO: Network Layer > Cache Layer (follower + leader) > Database Layer

Pros and Cons:

  • Pros
    • Easy to use
    • Clear responsibilities
    • Testability
  • Cons
    • Large and rigid
      • Adjusting, extending, or updating the architecture requires changes across all layers, which can be quite tricky.

2. Event-Driven Architecture

Any change in state triggers an event in the system. Communication between system components is accomplished through events.

A simplified architecture includes a mediator, event queue, and channels. The diagram below illustrates a simplified event-driven architecture:

Examples:

  • QT: Signals and Slots
  • Payment infrastructure: As bank gateways often have high latency, asynchronous techniques are used in banking architecture.

3. Microkernel Architecture (aka Plug-in Architecture)

The functionality of the software is distributed between a core and multiple plugins. The core contains only the most basic functionalities. Each plugin operates independently and implements shared interfaces to achieve different goals.

Examples:

  • Visual Studio Code and Eclipse
  • MINIX operating system

4. Microservices Architecture

Large systems are decomposed into numerous microservices, each a separately deployable unit that communicates via RPCs.

uber architecture

Examples:

5. Space-Based Architecture

The name "Space-Based Architecture" comes from "tuple space," which implies a "distributed shared space." In space-based architecture, there are no databases or synchronized database access, thus avoiding database bottleneck issues. All processing units share copies of application data in memory. These processing units can be flexibly started and stopped.

Example: See Wikipedia

  • Primarily adopted by Java-based architectures: for example, JavaSpaces.

Toutiao Recommendation System: P2 Content Analysis

· 3 min read

In Toutiao Recommendation System: P1 Overview, we know that content analysis and data mining of user tags are the cornerstones of the recommendation system.

What is the content analysis?

content analysis = derive intermediate data from raw articles and user behaviors.

Take articles for example. To model user interests, we need to tag contents and articles. To associate a user with the interests of the “Internet” tag, we need to know whether a user reads an article with the “Internet” tag.

Why are we analyzing those raw data?

We do it for the reason of …

  1. Tagging users (user profile)
    • Tagging users who liked articles with “Internet” tag. Tagging users who liked articles with “xiaomi” tag.
  2. Recommending contents to users by tags
    • Pushing “meizu” contents to users with “meizu” tag. Pushing “dota” contents to users with “dota” tag.
  3. Preparing contents by topics
    • Put “Bundesliga” articles to “Bundesliga topic”. Put “diet” articles to “diet topic”.

Case Study: Analysis Result of an Article

Here is an example of “article features” page. There are article features like categorizations, keywords, topics, entities.

Analysis Result of an Article

Analysis Result of an Article: Details

What are the article features?

  1. Semantic Tags: Human predefine those tags with explicit meanings.

  2. Implicit Semantics, including topics and keywords. Topic features are describing the statistics of words. Certain rules generate keywords.

  3. Similarity. Duplicate recommendation once to be the most severe feedbacks we get from our customers.

  4. Time and location.

  5. Quality. Abusing, porn, ads, or “chicken soup for the soul”?

Article features are important

  • It is not true that a recommendation system cannot work at all without article features. Amazon, Walmart, Netflix can recommend by collaborative filtering.
  • However, in news product, users consume contents of the same day. Bootstrapping without article features is hard. Collaborative filtering cannot help with bootstrapping.
    • The finer of the granularity of the article feature, the better the ability to bootstrap.

More on Semantic Tags

We divide features of semantic tags into three levels:

  1. Categorizations: used in the user profile, filtering contents in topics, recommend recall, recommend features
  2. Concepts: used in filtering contents in topics, searching tags, recommend recall(like)
  3. Entities: used in filtering contents in topics, searching tags, recommend recall(like)

Why dividing into different levels? We do this so that they can capture articles in different granularities.

  1. Categorizations: full in coverage, low in accuracy.
  2. Concepts: medium in coverage, medium in accuracy.
  3. Entities: low in coverage, high in accuracy. It only covers hot people, organizations, products in each area.

Categorizations and concepts are sharing the same technical infrastructure.

Why do we need semantic tags?

  • Implicit semantics
    • have been functioning well.
    • cost much less than semantic tags.
  • But, topics and interests need a clear-defined tagging system.
  • Semantic tags also evaluate the capability in NPL technology of a company.

Document classification

Classification hierarchy

  1. Root
  2. Science, sports, finance, entertainment
  3. Football, tennis, table tennis, track and field, swimming
  4. International, domestic
  5. Team A, team B

Classifiers:

  • SVM
  • SVM + CNN
  • SVM + CNN + RNN

Calculating relevance

  1. Lexical analysis for articles
  2. Filtering keywords
  3. Disambiguation
  4. Calculating relevance

Stream and Batch Processing Frameworks

· 2 min read

Why Do We Need Such Frameworks?

  • To process more data in a shorter amount of time.
  • To unify fault tolerance in distributed systems.
  • To simplify task abstractions to meet changing business requirements.
  • Suitable for bounded datasets (batch processing) and unbounded datasets (stream processing).

Brief History of Batch and Stream Processing Development

  1. Hadoop and MapReduce. Google made batch processing as simple as MapReduce result = pairs.map((pair) => (morePairs)).reduce(somePairs => lessPairs) in a distributed system.
  2. Apache Storm and directed graph topologies. MapReduce does not represent iterative algorithms well. Therefore, Nathan Marz abstracted stream processing into a graph structure composed of spouts and bolts.
  3. Spark in-memory computation. Reynold Xin pointed out that Spark uses ten times fewer machines than Hadoop while being three times faster when processing the same data.
  4. Google Dataflow based on Millwheel and FlumeJava. Google uses a windowed API to support both batch and stream processing simultaneously.
  1. Flink quickly adopted the programming model of ==Google Dataflow== and Apache Beam.
  2. Flink's efficient implementation of the Chandy-Lamport checkpointing algorithm.

These Frameworks

Architecture Choices

To meet the above demands with commercial machines, there are several popular distributed system architectures...

  • Master-slave (centralized): Apache Storm + Zookeeper, Apache Samza + YARN
  • P2P (decentralized): Apache S4

Features

  1. DAG Topology for iterative processing - for example, GraphX in Spark, topologies in Apache Storm, DataStream API in Flink.
  2. Delivery Guarantees. How to ensure the reliability of data delivery between nodes? At least once / at most once / exactly once.
  3. Fault Tolerance. Implement fault tolerance using cold/warm/hot standby, checkpointing, or active-active.
  4. Windowed API for unbounded datasets. For example, streaming windows in Apache. Window functions in Spark. Windowing in Apache Beam.

Comparison Table of Different Architectures

ArchitectureStormStorm-tridentSparkFlink
ModelNativeMicro-batchMicro-batchNative
GuaranteesAt least onceExactly onceExactly onceExactly once
Fault ToleranceRecord AckRecord AckCheckpointCheckpoint
Maximum Fault ToleranceHighMediumMediumLow
LatencyVery lowHighHighLow
ThroughputLowMediumHighHigh

Toutiao Recommendation System: P1 Overview

· 3 min read

What are we optimizing for? User Satisfaction

We are finding the best function below to maximize user satisfaction .

user satisfaction = function(content, user profile, context)
  1. Content: features of articles, videos, UGC short videos, Q&As, etc.
  2. User profile: interests, occupation, age, gender and behavior patterns, etc.
  3. Context: Mobile users in contexts of workspace, commuting, traveling, etc.

How to evaluate the satisfaction?

  1. Measurable Goals, e.g.

    • click through rate
    • Session duration
    • upvotes
    • comments
    • reposts
  2. Hard-to-measurable Goals:

    • Frequency control of ads and special-typed contents (Q&A)
    • Frequency control of vulgar content
    • Reducing clickbait, low quality, disgusting content
    • Enforcing / pining / highly-weighting important news
    • Lowly-weighting contents from low-level accounts

How to optimize for those goals? Machine Learning Models

It is a typical supervised machine learning problem to find the best function above. To implement the system, we have these algorithms:

  1. Collaborative Filtering
  2. Logistic Regression
  3. DNN
  4. Factorization Machine
  5. GBDT

A world-class recommendation system is supposed to have the flexibility to A/B-test and combine multiple algorithms above. It is now popular to combine LR and DNN. Facebook used both LR and GBDT years ago.

How do models observe and measure the reality? Feature engineering

  1. Correlation, between content’s characteristic and user’s interest. Explicit correlations include keywords, categories, sources, genres. Implicit correlations can be extract from user’s vector or item’s vector from models like FM.

  2. Environmental features such as geo location, time. It’s can be used as bias or building correlation on top of it.

  3. Hot trend. There are global hot trend, categorical hot trend, topic hot trend and keyword hot trend. Hot trend is very useful to solve cold-start issue when we have little information about user.

  4. Collaborative features, which helps avoid situation where recommended content get more and more concentrated. Collaborative filtering is not analysing each user’s history separately, but finding users’ similarity based on their behaviour by clicks, interests, topics, keywords or event implicit vectors. By finding similar users, it can expand the diversity of recommended content.

Large-scale Training in Realtime

  • Users like to see news feed updated in realtime according to what we track from their actions.
  • Use Apache storm to train data (clicks, impressions, faves, shares) in realtime.
  • Collect data to a threshold and then update to the recommendation model
  • Store model parameters , like tens of billions of raw features and billions of vector features, in high performance computing clusters.

They are implemented in the following steps:

  1. Online services record features in realtime.
  2. Write data into Kafka
  3. Ingest data from Kafka to Storm
  4. Populate full user profiles and prepare samples
  5. Update model parameters according to the latest samples
  6. Online modeling gains new knowledge

How to further reducing the latency? Recall Strategy

It is impossible to predict all the things with the model, considering the super-large scale of all the contents. Therefore, we need recall strategies to focus on a representative subset of the data. Performance is critical here and timeout is 50ms.

recall strategy

Among all the recall strategies, we take the InvertedIndex<Key, List<Article>> .

The Key can be topic, entity, source, etc.

Tags of InterestsRelevanceList of Documents
E-commerce0.3
Fun0.2
History0.2
Military0.1

Data Dependencies

  • Features depends on tags of user-side and content-side.
  • recall strategy depends on tags of user-side and content-side.
  • content analysis and data mining of user tags are cornerstone of the recommendation system.

Designing Uber Ride-Hailing Service

· 2 min read

Disclaimer: All content below is sourced from public resources or purely original. No confidential information regarding Uber is included here.

Requirements

  • Provide services for the global transportation market
  • Large-scale real-time scheduling
  • Backend design

Architecture

uber architecture

Why Microservices?

==Conway's Law== The structure of a software system corresponds to the organizational structure of the company.

Monolithic ==Service==Microservices
When team size and codebase are small, productivity✅ High❌ Low
==When team size and codebase are large, productivity==❌ Low✅ High (Conway's Law)
==Quality requirements for engineering==❌ High (Inadequately skilled developers can easily disrupt the entire system)✅ Low (Runtime is isolated)
Dependency version upgrades✅ Fast (Centralized management)❌ Slow
Multi-tenant support / Production-staging state isolation✅ Easy❌ Difficult (Each service must 1) either establish a staging environment connected to other services in staging 2) or support multi-tenancy across request contexts and data storage)
Debuggability, assuming the same modules, parameters, logs❌ Low✅ High (if distributed tracing is available)
Latency✅ Low (Local)❌ High (Remote)
DevOps costs✅ Low (High cost of build tools)❌ High (Capacity planning is difficult)

Combining monolithic ==codebase== and microservices can leverage the strengths of both.

Scheduling Service

  • Consistent hash addresses provided by geohash
  • Data is transient in memory, so no need for duplication. (CAP: AP over CP)
  • Use single-threaded or locked sharding to prevent double scheduling

Payment Service

==The key is to have asynchronous design==, as ACID transaction payment systems across multiple systems often have very long latencies.

User Profile Service and Trip Record Service

  • Use caching to reduce latency
  • As 1) support for more countries and regions increases 2) user roles (drivers, riders, restaurant owners, diners, etc.) gradually expand, providing user profile services for these users also faces significant challenges.

Notification Push Service

  • Apple Push Notification Service (unreliable)
  • Google Cloud Messaging (GCM) (can detect successful delivery) or
  • SMS services are generally more reliable

A Closer Look at iOS Architecture Patterns

· 3 min read

Why Should We Care About Architecture?

The answer is: to reduce the human resources spent on each feature.

Mobile developers evaluate the quality of an architecture on three levels:

  1. Whether the responsibilities of different features are evenly distributed
  2. Whether it is easy to test
  3. Whether it is easy to use and maintain
Responsibility DistributionTestabilityUsability
Tight Coupling MVC
Cocoa MVC❌ V and C are coupled✅⭐
MVP✅ Independent view lifecycleAverage: more code
MVVMAverage: View has a dependency on UIKitAverage
VIPER✅⭐️✅⭐️

Tight Coupling MVC

Traditional MVC

For example, in a multi-page web application, when you click a link to navigate to another page, the entire page reloads. The problem with this architecture is that the View is tightly coupled with the Controller and Model.

Cocoa MVC

Cocoa MVC is the architecture recommended by Apple for iOS developers. Theoretically, this architecture allows the Controller to decouple the Model from the View.

Cocoa MVC

However, in practice, Cocoa MVC encourages the use of massive view controllers, ultimately leading to the view controller handling all operations.

Realistic Cocoa MVC

Although testing such tightly coupled massive view controllers is quite difficult, Cocoa MVC performs the best in terms of development speed among the existing options.

MVP

In MVP, the Presenter has no relationship with the lifecycle of the view controller, allowing the view to be easily replaced. We can think of UIViewController as the View.

Variant of MVC

There is another type of MVP: MVP with data binding. As shown in the figure, the View is tightly coupled with the Model and Controller.

MVP

MVVM

MVVM is similar to MVP, but MVVM binds the View to the View Model.

MVVM

VIPER

Unlike the three-layer structure of MV(X), VIPER has a five-layer structure (VIPER View, Interactor, Presenter, Entity, and Routing). This structure allows for good responsibility distribution but has poorer maintainability.

VIPER

Compared to MV(X), VIPER has the following differences:

  1. The logic processing of the Model is transferred to the Interactor, so Entities have no logic and are purely data storage structures.
  2. ==UI-related business logic is handled in the Presenter, while data modification functions are handled in the Interactor==.
  3. VIPER introduces a routing module, Router, to implement inter-module navigation.

How to Stream Video to Mobile Devices Using HTTP? HTTP Live Streaming (HLS)

· One min read

Why Is Such a Protocol Needed?

Mobile video playback services using HTTP Live Streaming encounter the following issues:

  1. ==Limited memory and storage on mobile devices==.
  2. Due to unstable network connections and varying bandwidth, there is a need to ==dynamically adjust video quality during transmission==.

Solutions

  1. Server Side: In a typical setup, encoding hardware receives audio and video input, encodes it into H.264 format video and AAC format audio, and then streams it out in MPEG-2 format.
    1. A software multiplexer then splits the raw output stream into a series of short media files (with lengths possibly around 10 seconds in .ts format).
    2. The multiplexer also maintains an index file (.m3u8 format) that contains a list of all media files.
    3. The generated media files and index file are published on a web server.
  2. Client Side: The client reads the index, sequentially requests the necessary media files from the server, and smoothly plays the content of each short media file.

Architecture

HLS Architecture