3 Programming Paradigms
Structured programming is a discipline imposed upon the direct transfer of control. OO programming is a discipline imposed upon the indirect transfer of control. Functional programming is a discipline imposed upon variable assignment.
4 Kinds of No-SQL
When reading data from a hard disk, a database join operation is time-consuming and 99% of the time is spent on disk seek. To optimize read performance, denormalization is introduced and four categories of NoSQL are here to help.
ACID vs BASE
ACID and BASE reflect different design philosophies. ACID favors consistency over availability; in ACID, the C means that a transaction preserves all the database rules. BASE, by contrast, favors availability: the system is guaranteed to be available.
Authentication and Authorization in Microservices
Design an auth solution that starts simple but can scale with the business, consider both security and user experience, and discuss future trends in this area.
B tree vs. B+ tree
B-trees and B+ trees are the workhorses of on-disk databases. B-trees store data in every node; B+ trees keep leaves as a linked list and push all data to the leaves. That single structural difference drives all the practical trade-offs — range scans, cache behavior, and why almost every relational database uses a B+ tree.
Bloom Filter
A Bloom filter is a compact probabilistic set-membership data structure — ~10 bits per element for 1% false positives, zero false negatives. Notes on how it works, the math for picking parameters, where real systems use it (Cassandra, HBase, Chrome, CDNs), and when to reach for counting / cuckoo / quotient variants instead.
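As a rough illustration of the "10 bits per element for 1% false positives" figure, here is a minimal sketch using the standard sizing formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2. The class name, the SHA-256 double-hashing scheme, and the string-only keys are assumptions made for this example, not how any of the systems named above implement it.

```python
import hashlib
import math


def bloom_parameters(n: int, p: float) -> tuple[int, int]:
    """Return (bit count m, hash count k) for n items at false-positive rate p."""
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    k = max(1, round(m / n * math.log(2)))
    return m, k


class BloomFilter:
    """Illustrative Bloom filter; names and hashing scheme are hypothetical."""

    def __init__(self, n: int, p: float) -> None:
        self.m, self.k = bloom_parameters(n, p)
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str) -> list[int]:
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the stride non-zero
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


m, k = bloom_parameters(1000, 0.01)
print(m / 1000, k)  # about 9.6 bits per element with 7 hash functions
```

An added item always tests as present, because adding only ever sets bits; a never-added item may occasionally test as present, which is the false-positive rate the parameters control.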
Cloud Design Patterns
There are three types of cloud design patterns. Availability patterns have health endpoint monitoring and throttling. Data management patterns have cache-aside and static content hosting. Security patterns have federated identity.
Concurrency Models
Five concurrency models — single-threaded async, shared-memory threads with locks, CSP, Actor Model, and Software Transactional Memory — compared on how they share state, how they compose, and what kinds of failure they produce. Notes on when each fits and why language choice often dictates the model more than the problem does.
Credit Card Processing System
How is your credit card processed? 5 Parties and 2 workflows.
Data Partition and Routing
The advantages of implementing data partition and routing are availability and read efficiency while consistency is the weakness. The routing abstract model is essentially two maps: key-partition map and partition-machine map.
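The two-map routing model can be sketched as follows. The partition count, the node names, and the hash-modulo key-partition map are assumptions chosen for illustration; a real system may instead use range partitioning or a lookup table.

```python
import hashlib

NUM_PARTITIONS = 8  # hypothetical fixed partition count

# Map 2: partition -> machine. This is the only map that changes on rebalance.
partition_to_machine = {p: f"node-{p % 3}" for p in range(NUM_PARTITIONS)}


def key_to_partition(key: str) -> int:
    """Map 1: key -> partition, here a stable hash modulo the partition count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def route(key: str) -> str:
    """Compose the two maps to find the machine that owns a key."""
    return partition_to_machine[key_to_partition(key)]


partition = key_to_partition("user:42")
partition_to_machine[partition] = "node-9"  # move one partition to a new machine
print(route("user:42"))  # prints "node-9"
```

Keeping the two maps separate means that moving a partition to another machine touches only the partition-machine map; no key has to be rehashed.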
Design Pinterest
Designing a KV store with external storage
Requirements
Designing a Load Balancer or Dropbox Bandaid
Large-scale web services deal with high-volume traffic, but one host can serve only a limited number of requests. A server farm usually shares the traffic. How do we route requests so that each host receives an even share?
Designing a metric system
Requirements
Designing a URL shortener
If you are asked to design a system to take user-provided URLs and transform them to shortened URLs, what would you do? How would you allocate the shorthand URLs? How would you implement the redirect servers? How would you store the click stats?
Designing Airbnb or a hotel booking system
For guests and hosts, we store data in a relational database and build indexes to search by location, metadata, and availability. We can use external vendors for payment and send reservation reminders with a priority queue.
Designing Facebook photo storage
The traditional NFS-based design has a metadata bottleneck: large metadata size limits the metadata hit ratio. Facebook photo storage eliminates the metadata by aggregating hundreds of thousands of images in a single haystack store file.
Designing Memcached or an in-memory KV store
Memcached = rich client + distributed servers + hash table + LRU. It features a simple server (pushing complexity to the client) and is hence reliable and easy to deploy.
Designing Online Judge or Leetcode
An online judge is primarily a place where you can execute code remotely for educational or recruitment purposes. In this design, we focus on designing an OJ for interview preparation like Leetcode.
Designing payment webhook
Design a webhook that notifies the merchant when the payment succeeds. We need to aggregate the metrics (e.g., success vs. failure) and display them on the dashboard.
Designing Smart Notification of Stock Price Changes
Requirements
Designing Square Cash or PayPal Money Transfer System
Design a money-transfer backend system that can receive, send, and pay out. It should cover issues like scaling, internationalization, deduplication, single points of failure, and strong consistency.
Designing Stock Exchange
Requirements
Designing typeahead search or autocomplete
How do we design a real-time typeahead autocomplete service? LinkedIn's Cleo library answers with a multi-layer architecture (browser cache / web tier / result aggregator / various typeahead backends) and 4 elements (inverted / forward index, bloom filter, scorer).
Designing Uber
Disclaimer: All things below are collected from public sources or purely original. No Uber-confidential stuff here.
Enterprise Authorization Services 2022
Authorization determines whether an individual or system has the right to access a particular resource. And this process is a typical scenario that could be automated with software. We will review Google's Zanzibar, Zanzibar-inspired solutions and other AuthZ services on the market.
Experience Deep Dive
For those who have little experience in leadership positions, we have some interview tips. Describe your previous projects, including their challenges and improvements, and remember to demonstrate your communication skills.
Fraud Detection with Semi-supervised Learning
Fraud detection fights account takeovers and botnet attacks during login. Semi-supervised learning offers better accuracy than unsupervised learning and costs less time and money than supervised learning.
How Facebook Scales its Social Graph Store? TAO
Before TAO, Facebook used the cache-aside pattern to scale its social graph store. There were three problems: list update operations were inefficient, clients had to manage the cache, and it was hard to offer read-after-write consistency. TAO solves these problems.
How Netflix Serves Viewing Data?
Motivation
How to design robust and predictable APIs with idempotency?
APIs can be fragile and unpredictable. To solve the problem, three principles should be observed. The client retries to ensure consistency, and it retries with idempotency, exponential backoff, and random jitter.
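A minimal sketch of a client that retries with an idempotency key, exponential backoff, and random jitter is shown below. Everything here is hypothetical: the `TransientError` type, the in-memory server, and the "full jitter" variant (a uniform delay between zero and the capped exponential bound) are assumptions picked for illustration.

```python
import random
import time
import uuid


class TransientError(Exception):
    """Hypothetical error type for failures that are safe to retry."""


def backoff_delays(retries, base=0.1, cap=5.0, rng=random.random):
    """Yield one jittered delay per retry, bounded by min(cap, base * 2**attempt)."""
    for attempt in range(retries):
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_retry(send, payload, retries=5, sleep=time.sleep):
    key = str(uuid.uuid4())  # one idempotency key, reused on every attempt
    delays = backoff_delays(retries)
    while True:
        try:
            return send(key, payload)
        except TransientError:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted, surface the last error
            sleep(delay)


class InMemoryServer:
    """Hypothetical server that deduplicates requests by idempotency key."""

    def __init__(self, drop_first_responses=0):
        self.results = {}
        self.charges = 0
        self._drops = drop_first_responses

    def handle(self, key, amount):
        if key not in self.results:  # apply the side effect only once per key
            self.charges += 1
            self.results[key] = f"charged {amount}"
        if self._drops > 0:  # simulate a response lost on the network
            self._drops -= 1
            raise TransientError("response lost")
        return self.results[key]


server = InMemoryServer(drop_first_responses=2)
print(call_with_retry(server.handle, 100, sleep=lambda s: None), server.charges)
```

Because the key stays the same across attempts, the server can recognize a retry of a request it already applied and return the stored result instead of charging again.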
How to scale a web service?
The AKF Scale Cube is the mental model for scaling a web service along three axes — cloning (X), functional decomposition (Y), and data partitioning (Z). Notes on when each axis matters, the order to apply them in, and the operational cost of each.
How to stream video over HTTP for mobile devices? HTTP Live Streaming (HLS)
Video service over HTTP for mobile devices has two problems: limited memory or storage, and an unstable network connection with variable bandwidth. HTTP Live Streaming solves this with separation of concerns, file segmentation, and indexing.
Improving availability with failover
There are several ways to improve availability with failover, such as cold standby, hot standby, warm standby, checkpointing, and all active.
Intro to Relational Database
The relational database is the default choice for most storage use cases because of atomicity, consistency, isolation, and durability. How is consistency here different from the one in the CAP theorem? Why do we need 3NF and a DB proxy?
Introduction to Architecture
Architecture serves the full lifecycle of the software system to make it easy to understand, develop, test, deploy and operate. The O’Reilly book Software Architecture Patterns gives a simple but effective introduction to five fundamental architectures.
iOS Architecture Patterns Revisited
Architecture can directly impact costs per feature. Let's compare Tight-coupling MVC, Cocoa MVC, MVP, MVVM, and VIPER in three dimensions: balanced distribution of responsibility among feature actors, testability, and ease of use and maintainability.
Key value cache
The key-value cache is used to reduce the latency of data access. What are the read-through, write-through, write-behind, write-back, and cache-aside patterns?
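Of those patterns, cache-aside is the one where the application itself manages the cache. A minimal sketch follows, with plain dictionaries standing in for the cache and the database; the class name and the invalidate-on-write choice are assumptions for illustration.

```python
class CacheAsideStore:
    """Illustrative cache-aside access; dicts stand in for cache and database."""

    def __init__(self, db: dict) -> None:
        self.db = db
        self.cache: dict = {}
        self.db_reads = 0

    def read(self, key):
        if key in self.cache:  # cache hit: skip the database
            return self.cache[key]
        self.db_reads += 1     # cache miss: load from the database
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value

    def write(self, key, value) -> None:
        self.db[key] = value       # write to the source of truth first
        self.cache.pop(key, None)  # then invalidate the cached copy


store = CacheAsideStore({"user:1": "Ada"})
store.read("user:1")
store.read("user:1")
print(store.db_reads)  # prints 1: the second read was served from the cache
```

In read-through and write-through, by contrast, the cache layer performs these database calls on the application's behalf.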
Lambda Architecture
Lambda architecture = CQRS (batch layer + serving layer) + speed layer. It solves accuracy, latency, throughput problems of big data.
Load Balancer Types
Three categories of load balancer — DNS round robin, L3/L4 network load balancer, and L7 application load balancer. Notes on what each actually does at the packet level, which algorithms (round robin, least connections, consistent hashing) fit which use case, and where health checks and sticky sessions break.
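Of the algorithms mentioned, consistent hashing is the least obvious, so here is a minimal sketch of a hash ring with virtual nodes. The class name, the SHA-256 based hash, and the choice of 100 virtual nodes per backend are assumptions for illustration only.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring; each backend owns many virtual nodes."""

    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    def lookup(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or after the key's hash,
        # wrapping around to the start of the ring if needed.
        idx = bisect.bisect_right(self._hashes, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


before = HashRing(["a", "b", "c"])
after = HashRing(["a", "b"])  # backend "c" removed
keys = [f"session-{i}" for i in range(1000)]
moved = sum(before.lookup(k) != after.lookup(k) for k in keys)
print(moved)  # only keys that were on "c" move to another backend
```

This property is why consistent hashing suits sticky routing and caches: removing a backend remaps only that backend's keys, whereas plain hash-modulo routing reshuffles most of them.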
Lyft's Marketing Automation Platform -- Symphony
To achieve a higher ROI in advertising, Lyft launched a marketing automation platform, which consists of three main components: lifetime value forecaster, budget allocator, and bidders.
Public API Choices
There are several options for a public API, such as an API gateway or a Backend for Frontend gateway. GraphQL distinguishes itself from the others with features like tailoring results, batching nested queries, performance tracing, and explicit caching.
Replica, Consistency, and CAP theorem
Any networked system has three desirable properties: consistency, availability and partition tolerance. Systems can have only two of those three. For example, RDBMS prefers consistency and partition tolerance and becomes an ACID system.
Skiplist
A skip list is a probabilistic, multi-level linked list that supports O(log n) search, insert, and delete. Used by LevelDB's MemTable, Redis's Sorted Set, and Lucene's inverted index. Notes on how it works, why it's chosen over balanced BSTs, and the concrete trade-offs in each real-world implementation.
SOLID Design Principles
SOLID is an acronym of design principles that help software engineers write solid code. S is for single responsibility principle, O for open/closed principle, L for Liskov’s substitution principle, I for interface segregation principle and D for dependency inversion principle.
Stream and Batch Processing Frameworks
Stream and Batch processing frameworks can process high throughput at low latency. Why is Flink gaining popularity? And how to make an architectural choice among Storm, Storm-trident, Spark, and Flink?
Toutiao Recommendation System: P1 Overview
To evaluate user satisfaction, machine learning models are implemented. These models observe and measure reality through feature engineering and further reduce latency with a recall strategy.
What can we communicate in soft skills interview?
An interview is a process for workers to find future co-workers. The candidate is evaluated on three key questions: capability, willingness, and culture fit. None of these questions can be answered without good communication.
What is Apache Kafka?
Apache Kafka is a distributed streaming platform that can be used for logging by topics, as a messaging system, for geo-replication, or for stream processing. It is much faster than other platforms due to its zero-copy technology.