Designing Uber Ride-Hailing Service
Disclaimer: All content below is sourced from public resources or purely original. No confidential information regarding Uber is included here.
Requirements
- Provide services for the global transportation market
- Large-scale real-time scheduling
- Backend design
Architecture
Why Microservices?
==Conway's Law==: The structure of a software system mirrors the organizational structure of the company that builds it.
Criterion | Monolithic ==Service== | Microservices |
---|---|---|
Productivity when the team and codebase are small | ✅ High | ❌ Low |
==Productivity when the team and codebase are large== | ❌ Low | ✅ High (Conway's Law) |
==Engineering quality requirements== | ❌ High (a less skilled developer can easily break the entire system) | ✅ Low (runtimes are isolated) |
Dependency version upgrades | ✅ Fast (Centralized management) | ❌ Slow |
Multi-tenant support / Production-staging state isolation | ✅ Easy | ❌ Difficult (each service must either 1) run a staging environment connected to the staging environments of other services, or 2) support multi-tenancy across request contexts and data storage) |
Debuggability, assuming the same modules, parameters, logs | ❌ Low | ✅ High (if distributed tracing is available) |
Latency | ✅ Low (Local) | ❌ High (Remote) |
DevOps costs | ✅ Low (although build tooling is expensive) | ❌ High (capacity planning is difficult) |
Combining a monolithic ==codebase== with microservices can leverage the strengths of both.
Scheduling Service
- Use consistent hashing, keyed by geohash, to route each location to a shard
- Data is transient and held in memory, so there is no need for replication. (CAP: AP over CP)
- Use single-threaded processing or locking within each shard to prevent double scheduling
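The points above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not Uber's implementation: `HashRing`, `DispatchShard`, the node names, the virtual-node count, and the geohash precision of 6 are all hypothetical choices made for the example. It encodes a location as a geohash, maps the geohash cell to a node with consistent hashing, and uses a per-shard lock so the same driver cannot be assigned twice.

```python
import bisect
import hashlib
import threading

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Standard geohash: interleave longitude/latitude bits, 5 bits per character."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


class HashRing:
    """Consistent hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, key):
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]


class DispatchShard:
    """In-memory, non-replicated state for one shard; a lock serializes matching."""

    def __init__(self):
        self._lock = threading.Lock()
        self._available = set()  # driver ids that can still be matched
        self._assigned = {}      # driver id -> trip id

    def add_driver(self, driver_id):
        with self._lock:
            self._available.add(driver_id)

    def try_assign(self, driver_id, trip_id):
        """Return True only for the first successful assignment of a driver."""
        with self._lock:
            if driver_id not in self._available:
                return False
            self._available.remove(driver_id)
            self._assigned[driver_id] = trip_id
            return True


ring = HashRing(["dispatch-1", "dispatch-2", "dispatch-3"])
cell = geohash_encode(37.7749, -122.4194)  # a 6-character geohash cell
owner = ring.node_for(cell)                # node that owns this cell's shard
```

One caveat of this sketch is that two nearby points on opposite sides of a cell boundary hash to different cells, so a real matcher would likely also query neighboring cells.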
Payment Service
==The key is an asynchronous design==, because ACID transactions that span multiple payment systems often have very long latencies.
- Utilize event queues
- Payment interface integrations with Braintree, PayPal, Card.io, Alipay, etc.
- Record all events through detailed logs
- Make APIs idempotent, and retry failed calls with exponential backoff and random jitter
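To make the last point concrete, here is a minimal sketch of a retry loop that combines an idempotency key with exponential backoff and random jitter. Everything named here is an assumption for illustration: `TransientError`, `FakeGateway`, `call_with_retry`, and the key format are hypothetical and do not correspond to any real payment provider's API. The fake gateway applies the charge and then "loses" the response, which is the case where a retry without an idempotency key would double-charge the rider.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type for retryable failures such as timeouts."""


class FakeGateway:
    """In-memory stand-in for a payment provider that honors idempotency keys."""

    def __init__(self, lost_responses=0):
        self._results = {}                 # idempotency key -> stored result
        self._lost_responses = lost_responses
        self.charges_made = 0

    def charge(self, key, amount_cents):
        if key not in self._results:       # first time this key is seen
            self.charges_made += 1
            self._results[key] = {
                "charge_id": f"ch_{self.charges_made}",
                "amount_cents": amount_cents,
            }
        if self._lost_responses > 0:       # simulate a response lost in transit
            self._lost_responses -= 1
            raise TransientError("response lost after the charge was applied")
        return self._results[key]          # replays return the stored result


def call_with_retry(charge, key, amount_cents, attempts=5, base=0.1, cap=5.0,
                    sleep=time.sleep):
    """Retry a charge with the same idempotency key, using 'full jitter' backoff."""
    for attempt in range(attempts):
        try:
            return charge(key, amount_cents)
        except TransientError:
            if attempt == attempts - 1:
                raise                      # out of attempts: surface the error
            # Uniform delay in [0, min(cap, base * 2**attempt)].
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))


gateway = FakeGateway(lost_responses=2)
# sleep is stubbed out here only so the example runs instantly.
result = call_with_retry(gateway.charge, "trip-123-fare", 1850, sleep=lambda s: None)
```

Because every retry reuses the same key, the gateway records a single charge even though the client called it three times. The jitter spreads retries from many clients over time so they do not hit a recovering service in synchronized waves.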
User Profile Service and Trip Record Service
- Use caching to reduce latency
- Serving user profiles becomes significantly more challenging as 1) more countries and regions are supported and 2) the set of user roles (drivers, riders, restaurant owners, diners, etc.) keeps expanding.
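As one possible reading of "use caching to reduce latency", the sketch below shows a cache-aside lookup with a time-to-live. It is an assumption-laden illustration: `ProfileCache`, the `loader` callback (standing in for a database or remote service read), and the 60-second default TTL are hypothetical, and a production system would more likely use a shared cache than a per-process dictionary.

```python
import time


class ProfileCache:
    """Cache-aside with TTL: serve from memory, fall back to the loader on a miss."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader        # hypothetical slow read, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # user id -> (expires_at, profile)

    def get(self, user_id):
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: no call to the backing store
        profile = self._loader(user_id)
        self._entries[user_id] = (now + self._ttl, profile)
        return profile

    def invalidate(self, user_id):
        """Drop a cached profile, for example after the user edits it."""
        self._entries.pop(user_id, None)


def load_profile(user_id):
    # Placeholder for the real lookup; the returned fields are made up.
    return {"id": user_id, "role": "rider"}


cache = ProfileCache(load_profile, ttl_seconds=30.0)
profile = cache.get("user-42")       # miss: calls load_profile
profile_again = cache.get("user-42") # hit: served from memory
```

The TTL bounds how stale a profile can be, and `invalidate` lets a write path evict an entry immediately instead of waiting for expiry.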
Notification Push Service
- Apple Push Notification Service (unreliable; delivery is best-effort)
- Google Cloud Messaging (GCM) (can detect successful delivery)
- SMS, which is generally more reliable
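One way to combine channels with different reliability is to try them in order and fall back when delivery cannot be confirmed. The sketch below assumes this fallback policy, which the list above does not state explicitly; `notify` and the channel callables are hypothetical, and no real push or SMS API is called.

```python
from typing import Callable, Optional, Sequence, Tuple

# A channel takes (user_id, message) and returns True if delivery was confirmed.
Channel = Callable[[str, str], bool]


def notify(user_id: str, message: str,
           channels: Sequence[Tuple[str, Channel]]) -> Optional[str]:
    """Try each channel in order; return the name of the first that confirms delivery."""
    for name, send in channels:
        try:
            if send(user_id, message):
                return name
        except Exception:
            # Treat any channel error as "not delivered" and try the next one.
            continue
    return None  # no channel confirmed delivery


def push_unconfirmed(user_id: str, message: str) -> bool:
    return False  # stand-in for a push whose delivery could not be confirmed


def sms_ok(user_id: str, message: str) -> bool:
    return True   # stand-in for an SMS provider that accepted the message


used = notify("rider-7", "Your driver has arrived",
              [("push", push_unconfirmed), ("sms", sms_ok)])
```

Ordering the cheaper push channel first and SMS last keeps SMS costs down while still using it as the more reliable fallback.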