Case Study: Twitter / X
Newsfeed generation is the crown jewel of system design.
Elon Musk posts a tweet. How does it quickly reach the timelines of 150 million followers, while a new user's timeline still loads in 200ms? Twitter's architecture is the most dramatic case study in system design interviews.
Step 1: Requirements
Functional
- Post tweets (280 chars + media).
- Follow/unfollow.
- Home timeline (tweets from followees).
- User timeline (one's own tweets).
- Search / hashtags.
- Retweet, like, reply.
Non-Functional
- Read-heavy (100:1 read/write ratio).
- Timeline load <200ms.
- Eventual consistency OK (1-2 sec lag).
- High availability.
Step 2: Capacity Estimation
DAU: 250M
Tweets/day: 500M (avg 2/user)
Tweets/sec: 500M / 86400 ≈ 5,800 writes/sec
Read QPS: 5,800 × 100 = 580K timeline reads/sec
Per tweet: ~1KB
Daily tweet storage: 500M × 1KB ≈ 500GB
Timeline cache (per user 800 tweets): 1KB × 800 × 250M = 200TB
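The arithmetic above can be checked with a quick script. All figures are the chapter's rounded assumptions; the script prints the unrounded values behind the "≈ 5,800" and "580K" estimates.

```python
# Back-of-envelope capacity check for the numbers above.
DAU = 250_000_000
TWEETS_PER_DAY = 500_000_000
READ_WRITE_RATIO = 100
TWEET_SIZE_BYTES = 1_000           # ~1 KB per tweet
CACHED_TWEETS_PER_USER = 800

writes_per_sec = TWEETS_PER_DAY / 86_400              # ≈ 5,800
reads_per_sec = writes_per_sec * READ_WRITE_RATIO     # ≈ 580K
daily_storage_gb = TWEETS_PER_DAY * TWEET_SIZE_BYTES / 1e9    # 500 GB
cache_tb = DAU * CACHED_TWEETS_PER_USER * TWEET_SIZE_BYTES / 1e12  # 200 TB

print(round(writes_per_sec), round(reads_per_sec), daily_storage_gb, cache_tb)
```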
Step 3: API Design
POST /tweet { text, mediaUrls? }
GET /timeline?cursor=X&limit=20
POST /follow { userId }
GET /user/:id/tweets
GET /search?q=...
Step 4: Data Model
User: { id, name, handle, follower_count, ... }
Tweet: {
id (Snowflake), user_id, text,
media_urls[], created_at,
reply_to, retweet_of
}
Follow: { follower_id, followee_id, ts }
Timeline (cache): { user_id, tweet_ids[] (latest 800) }
Engagement: { tweet_id, likes, retweets, replies }
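The schema above can be sketched with dataclasses. Field names follow the model; the concrete types are assumptions for illustration.

```python
# Minimal sketch of the Tweet and Follow records from the data model.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Tweet:
    id: int                    # Snowflake ID: 64-bit, time-ordered
    user_id: int
    text: str                  # up to 280 chars
    created_at: int            # epoch milliseconds
    media_urls: list = field(default_factory=list)
    reply_to: Optional[int] = None     # parent tweet ID, if a reply
    retweet_of: Optional[int] = None   # original tweet ID, if a retweet

@dataclass
class Follow:
    follower_id: int
    followee_id: int
    ts: int
```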
The Big Question: Timeline Generation
Three approaches:
Approach 1: Pull (Read-time fan-out)
When a user opens their timeline, fetch every followee's recent tweets, then sort and merge.
- Pros: low storage, fast writes.
- Cons: slow reads; data must be fetched for N followees.
- Use case: inactive users, users who follow few accounts.
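A minimal sketch of the pull model, assuming a hypothetical `get_recent_tweets(user_id)` lookup that returns each followee's tweets newest-first:

```python
# Read-time fan-out: fetch each followee's recent tweets and merge by time.
import heapq
import itertools

def pull_timeline(followee_ids, get_recent_tweets, limit=20):
    # Each per-followee list is already newest-first, so heapq.merge can
    # keep the combined stream sorted without materializing all of it.
    streams = [get_recent_tweets(uid) for uid in followee_ids]
    merged = heapq.merge(*streams, key=lambda t: t["created_at"], reverse=True)
    return list(itertools.islice(merged, limit))
```

The cost is visible in the signature: every read touches every followee, which is exactly the "slow reads" con above.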
Approach 2: Push (Write-time fan-out)
When a tweet is posted, inject it into every follower's timeline cache.
- Pros: super-fast reads; timelines are pre-computed.
- Cons: expensive writes; a disaster for celebrities.
- Celebrity problem: one Elon Musk tweet → 150M timeline writes.
Approach 3: Hybrid (Twitter's Choice)
Push for most users; pull for celebrities.
- User with < 1M followers → push.
- Celebrity (1M+ followers) → pull at read time.
- A user's timeline = pre-computed timeline + live fetch of celebrity tweets + merge.
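The hybrid read path can be sketched as follows. All helpers (`cached_timeline`, `fetch_recent`) are hypothetical stand-ins for the timeline cache and the celebrity tweet store.

```python
# Hybrid read path: merge the pre-computed (pushed) timeline with a live
# pull of each celebrity followee's recent tweets.
def hybrid_timeline(user_id, cached_timeline, celebrity_ids, fetch_recent, limit=20):
    tweets = list(cached_timeline)          # filled by fan-out workers (push)
    for cid in celebrity_ids:               # pulled at read time
        tweets.extend(fetch_recent(cid))
    tweets.sort(key=lambda t: t["created_at"], reverse=True)
    return tweets[:limit]
```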
Step 5: Architecture
[Client]
↓
[CDN] [LB]
↓
[API Gateway]
↓ ↓ ↓
[Tweet Service] [Timeline Service] [User Service]
↓ ↓ ↓
[Kafka] [Redis Timeline Cache] [User DB]
↓
[Fan-out Workers] → [Followers' Timelines]
↓
[Tweet Storage] (Cassandra/Manhattan)
[Search Index] (Elasticsearch)
Step 6: Components
Tweet Storage
- Cassandra/Manhattan (Twitter's internal store).
- Sharded by user_id.
- Snowflake IDs: time-ordered.
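A simplified sketch of the Snowflake layout: 41 bits of millisecond timestamp (since Twitter's custom epoch), 10 bits of machine ID, 12 bits of per-millisecond sequence. Because the timestamp occupies the high bits, sorting by ID sorts by creation time.

```python
# Snowflake-style ID sketch: timestamp in the high bits makes IDs time-ordered.
TWEPOCH = 1288834974657  # Twitter's custom epoch (Nov 2010), in ms

def snowflake(ts_ms, machine_id, sequence):
    return ((ts_ms - TWEPOCH) << 22) | (machine_id << 12) | sequence
```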
Timeline Cache (Redis)
- Per-user list of the latest 800 tweet IDs.
- Sorted by time.
- Eviction: inactive users (no login for 3 days).
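An in-memory stand-in for the timeline cache, keeping the latest 800 tweet IDs per user; in production a Redis list with `LTRIM` (or a sorted set with `ZREMRANGEBYRANK`) plays this role.

```python
# In-memory sketch of the per-user timeline cache.
class TimelineCache:
    def __init__(self, max_len=800):
        self.max_len = max_len
        self.timelines = {}     # user_id -> [tweet_id, ...], newest first

    def prepend(self, user_id, tweet_id):
        tl = self.timelines.setdefault(user_id, [])
        tl.insert(0, tweet_id)
        del tl[self.max_len:]   # evict everything beyond the latest 800

    def get(self, user_id, limit=20):
        return self.timelines.get(user_id, [])[:limit]
```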
Fan-out Service
- A new tweet publishes a Kafka event.
- A worker fetches the tweet author's followers.
- The tweet ID is prepended to each follower's timeline cache.
- Celebrities are skipped; their tweets are merged at read time.
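The worker loop above can be sketched as below. `get_followers` and `follower_count` are hypothetical lookups, and a plain dict stands in for the Redis timeline cache; in production the event would arrive from a Kafka consumer.

```python
# Fan-out worker sketch: deliver one new-tweet event to follower timelines,
# skipping celebrities (their tweets are merged at read time instead).
CELEBRITY_THRESHOLD = 1_000_000

def fan_out(event, get_followers, follower_count, timelines):
    author = event["user_id"]
    if follower_count(author) >= CELEBRITY_THRESHOLD:
        return 0                 # celebrity: no fan-out
    delivered = 0
    for follower_id in get_followers(author):
        # timelines: user_id -> list of tweet IDs, newest first
        timelines.setdefault(follower_id, []).insert(0, event["tweet_id"])
        delivered += 1
    return delivered
```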
Search
- Tweets are indexed in Elasticsearch.
- Hashtag and full-text search.
Celebrity Problem in Detail
When Elon Musk posts a tweet:
Pure Push
- 150M timeline writes
- Massive backend load
- Slow delivery to followers
- Storage explosion
Hybrid Approach
- Celebrity tweets get no fan-out
- Merged into followers' timelines at read time
- Cache hits at the celebrity level
- Manageable cost
Trending Topics
- Stream processing (Storm/Heron).
- Hashtag counts over a sliding window.
- Real-time counts with a decay function.
- Geographic (per-region) trending.
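A toy version of the sliding-window hashtag count, the core of what a streaming system like Storm/Heron computes (no decay function here, just hard expiry at the window edge):

```python
# Sliding-window hashtag counter: only events from the last `window_sec`
# seconds contribute to the trending ranking.
from collections import deque, Counter

class TrendingCounter:
    def __init__(self, window_sec=3600):
        self.window = window_sec
        self.events = deque()    # (timestamp, hashtag), oldest first
        self.counts = Counter()

    def add(self, ts, hashtag):
        self.events.append((ts, hashtag))
        self.counts[hashtag] += 1
        self._expire(ts)

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1

    def top(self, n=10):
        return [t for t, c in self.counts.most_common(n) if c > 0]
```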
Scale Considerations
Read Path
- Multi-tier caching (CDN → Redis → DB).
- Connection pooling.
- Pagination.
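Because Snowflake IDs are time-ordered, cursor pagination (the `cursor` parameter in the timeline API above) can simply use the last-seen tweet ID as the cursor. A toy sketch:

```python
# Cursor pagination over a newest-first list of time-ordered tweet IDs.
def paginate(tweet_ids, cursor=None, limit=20):
    # cursor is the last tweet ID the client saw; None means first page.
    ids = [t for t in tweet_ids if t < cursor] if cursor is not None else tweet_ids
    page = ids[:limit]
    next_cursor = page[-1] if len(page) == limit else None
    return page, next_cursor
```

Unlike offset pagination, this stays correct when new tweets are prepended between page requests.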
Write Path
- Async fan-out via Kafka.
- 1-2 sec of eventual consistency is acceptable.
Storage
- Tweets sharded by user_id.
- Old tweets archived to cold storage.
Real World
- 250M+ DAU.
- 500M+ tweets/day.
- Manhattan: Twitter's internal distributed DB.
- Heron: stream processing.
- Mesos: container orchestration.
Trade-offs
- Push: write-heavy, fast reads, a disaster for celebrities.
- Pull: read-heavy, scalable writes, slow reads for active users.
- Hybrid: more complexity, but right for both.
- Eventual consistency: a tweet arriving 1-2 sec late is fine.
Engineering Lessons
- A hybrid approach is often best for skewed distributions.
- Pre-computation trades storage for speed.
- Eventual consistency is acceptable for social feeds.
- Identify the edge cases (celebrities) early.
- Multi-tier caching is essential.
📌 Chapter Summary
- Twitter's timeline is read-heavy (100:1).
- Pull, Push, Hybrid: three generation strategies.
- Hybrid: push for normal users, pull for celebrities.
- Redis timeline cache + Cassandra storage.
- Async fan-out via Kafka.