By the time you know where you should have gone, it is often too late to go there; and if you never leave your original path, you will miss the road to the future.
Charles Handy illustrates this with his drive to Davy’s Bar. He was told to turn right and go up the hill half a mile before Davy’s Bar; by the time he realized he was on the wrong road, he had already arrived at Davy’s Bar.
The growth curve usually takes an “S” shape, so we call it the S-curve or sigmoid curve. To keep the overall growth rate high, you have to start developing your second S-curve before it is too late to invest your time and resources.
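The S-curve mentioned above is commonly modeled with the logistic function. Here is a minimal sketch (the function and parameter names are illustrative, not from the original text) showing the slow-fast-slow growth pattern:

```python
import math

def s_curve(t, capacity=1.0, rate=1.0, midpoint=0.0):
    """Logistic (sigmoid) growth: slow start, rapid middle, saturating end."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))

# Growth is slow early, fastest near the midpoint, and flattens near capacity.
early = s_curve(-4)    # far below capacity
middle = s_curve(0)    # exactly half of capacity at the midpoint
late = s_curve(4)      # close to capacity: growth has nearly stopped
```

The “second curve” idea is to begin a new S-curve while the first is still in its rapid-growth middle, not after it has flattened.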
Intel’s CPUs, Netflix’s video streaming, Nintendo’s gaming consoles, and Microsoft’s cloud are all excellent examples of second-curve-driven businesses.
Finding and catching the second curve takes vision and execution. You have to take in more information and continuously sift through it to identify the best opportunities. Then, once an opportunity is identified, you need a reliable team to fight the battle and find out whether it really works.
What made you succeed may not make you succeed again; there is always a limit to growth. The second-curve theory helps us reflect on why and how to embrace change and live a more thriving life.
Internet-scale web services deal with high-volume traffic from all over the world, but one server can only serve a limited number of requests at a time. Consequently, there is usually a server farm or a large cluster of servers to absorb the traffic together. The question is: how do we route requests so that each host receives and processes its share evenly?
There are many hops and layers of load balancers between the user and the server. Specifically, this time our design requirements are:
Note: If Service A depends on (or consumes) Service B, then A is a downstream service of B, and B is an upstream service of A.
Why is it hard to balance loads? The answer is that it is hard to collect accurate load distribution stats and act accordingly.
Random and round-robin distribute traffic by request count. However, the actual load is not per request: some requests are heavy in CPU or thread utilization, while others are lightweight.
To measure load more accurately, load balancers have to maintain local states of the observed number of active requests, number of connections, or request-processing latency for each backend server. Based on these, we can use distribution algorithms like least-connections, least-time, and random N choices:
Least-connections: a request is passed to the server with the fewest active connections.
Least-time (latency-based): a request is passed to the server with the lowest average response time and fewest active connections, taking server weights into account.
However, these two algorithms work well only with a single load balancer. If there are multiple load balancers, a herd effect may occur: all the load balancers notice that one server is momentarily faster, and then all of them send requests to that server.
Random N choices (where N=2 in most cases, a.k.a. power of two choices): pick two servers at random and choose the better of the two, avoiding the worse choice.
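The power-of-two-choices rule can be sketched in a few lines. This is a minimal illustration using active-connection counts as the local state (the server names and counts are made up):

```python
import random

def pick_two_choices(servers, active_conns):
    """Random N choices with N=2: sample two distinct servers at random,
    then route the request to the one with fewer active connections."""
    a, b = random.sample(servers, 2)
    return a if active_conns[a] <= active_conns[b] else b

# Example: the heavily loaded "s1" loses every pairwise comparison,
# so it receives no new requests until it drains.
conns = {"s0": 3, "s1": 90, "s2": 5}
choice = pick_two_choices(list(conns), conns)
```

Because each decision compares only two random candidates, the algorithm avoids the herd effect: even if one server briefly looks best globally, only a fraction of balancers will happen to sample it.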
Local LB is unaware of global downstream and upstream states, including
There are three options to collect the load stats accurately and then act accordingly:
The Dropbox Bandaid team chose the third option because it fits well into their existing random-N-choices approach.
However, instead of using local states, as the original random N choices does, they use real-time global information from the backend servers via response headers.
Server utilization: backend servers are configured with a max capacity and count their ongoing requests, from which a utilization percentage between 0.0 and 1.0 is calculated.
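Combining the two ideas, a balancer can keep the random-two-choices structure but break ties with the utilization reported by backends. Below is a hedged sketch of that pattern; the `X-Backend-Utilization` header name and the `Balancer` class are illustrative, not Dropbox’s actual implementation:

```python
import random

class Balancer:
    """Random two choices driven by backend-reported utilization.

    Each backend attaches its utilization (0.0-1.0) to every response;
    the balancer caches the latest value per host and uses it to pick
    the less-loaded of two randomly sampled candidates.
    """

    def __init__(self, hosts):
        self.util = {h: 0.0 for h in hosts}  # last observed utilization

    def on_response(self, host, headers):
        # Header name is an assumption for this sketch.
        self.util[host] = float(headers.get("X-Backend-Utilization", 0.0))

    def pick(self):
        a, b = random.sample(list(self.util), 2)
        return a if self.util[a] <= self.util[b] else b
```

Since the utilization is piggybacked on responses the balancer already receives, no extra health-check traffic is needed to keep the view fresh.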
There are two problems to consider:
The abstraction of this problem is to find documents by prefixes and terms among a very large number of elements. The solution leverages four major data structures:
InvertedIndex<prefix or term, documents>: given any prefix, find all the document IDs that contain it.
BloomFilter<prefixes or terms>, one per document: as the user types more, we can quickly filter out documents that do not contain the latest prefixes or terms by checking their Bloom filters.
ForwardIndex<document, prefixes or terms>: the Bloom filters may return false positives, so we query the actual documents to reject them.
scorer(document) → relevance: each partition returns all of its true hits with scores, and then we aggregate and rank.
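The four data structures above can be wired together in a small sketch. Everything here (the toy corpus, the Bloom filter sizing, and the term-count scorer) is illustrative, assuming the inverted index is consulted once for the first keystroke and per-document Bloom filters refine the candidate set as the user types more:

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""
    def __init__(self, size=64, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0
    def _positions(self, term):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{term}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.size
    def add(self, term):
        for p in self._positions(term):
            self.bits |= 1 << p
    def may_contain(self, term):
        return all(self.bits >> p & 1 for p in self._positions(term))

# ForwardIndex<document, terms> (toy corpus)
docs = {1: ["apple", "apricot"], 2: ["banana"], 3: ["application"]}

# InvertedIndex<prefix, documents>
inverted = {}
for doc_id, terms in docs.items():
    for t in terms:
        for k in range(1, len(t) + 1):
            inverted.setdefault(t[:k], set()).add(doc_id)

# One BloomFilter<prefixes> per document
blooms = {}
for doc_id, terms in docs.items():
    bf = BloomFilter()
    for t in terms:
        for k in range(1, len(t) + 1):
            bf.add(t[:k])
    blooms[doc_id] = bf

def search(prefix):
    # 1. Coarse candidates from the inverted index (first keystroke).
    candidates = inverted.get(prefix[:1], set())
    # 2. Bloom filters cheaply drop documents lacking the longer prefix.
    candidates = {d for d in candidates if blooms[d].may_contain(prefix)}
    # 3. Forward index (the documents themselves) rejects false positives.
    hits = {d for d in candidates if any(t.startswith(prefix) for t in docs[d])}
    # 4. Scorer: rank by number of matching terms (a stand-in for relevance).
    return sorted(hits, key=lambda d: -sum(t.startswith(prefix) for t in docs[d]))
```

In a sharded deployment, steps 1-4 run inside each partition, and an aggregator merges and re-ranks the true hits from all partitions.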
In detail, Lyft’s advertisements should meet the requirements below:
However, the biggest challenge is managing all the processes of cross-region marketing at scale, which include choosing bids, budgets, creatives, incentives, and audiences, running A/B tests, and so on. Here is what occupies a day in the life of a digital marketer:
We can see that execution occupies most of the time, while analysis, though considered more important, takes much less. A scaling strategy lets marketers concentrate on analysis and decision-making instead of operational activities.
To reduce costs and improve experimental efficiency, we need to
The marketing performance data flows into the reinforcement-learning system of Lyft: Amundsen
The problems that need to be automated include:
The tech stack includes Apache Hive, Presto, an ML platform, Airflow, third-party APIs, and a UI.
The lifetime value (LTV) of a user is an important criterion for measuring the efficiency of acquisition channels. The budget is determined jointly by the LTV and the price we are willing to pay in that region.
Our knowledge of a new user is limited at first, but historical data helps us predict more accurately as the user interacts with our services.
The forecast improves as the historical data of interactivity accumulates:
After the LTV is predicted, the next step is to estimate budgets based on the price. A curve of the form LTV = a * (spend)^b is fit to the data. A degree of randomness is injected into the cost-curve creation process in order to converge to a global optimum.
Bidders are made up of two parts: tuners and actors. The tuners decide exact channel-specific parameters based on the price, and the actors communicate the actual bids to the different channels.
Some popular bidding strategies, applied in different channels, are listed below:
We have to value human experience in the automation process; otherwise, the quality of the models may be “garbage in, garbage out.” Once freed from laborious tasks, marketers can focus more on understanding users, channels, and the messages they want to convey to audiences, and thus achieve better ad impact. That is how Lyft achieves a higher ROI with less time and effort.