Case Study: Twitter / X

Newsfeed generation: the crown jewel of system design.

Elon Musk posts a tweet. How does it reach the timelines of 150 million followers quickly, while a timeline still loads in 200ms even for a brand-new user? Twitter's architecture is the most dramatic case study in system design interviews.

Step 1: Requirements

Functional

Post a tweet (280 chars + media).
Follow/unfollow.
Home timeline (tweets from followees).
User timeline (the user's own tweets).
Search / hashtag.
Retweet, like, reply.

Non-Functional

Read-heavy (100:1 read-to-write ratio).
Timeline load <200ms.
Eventual consistency is acceptable (1-2 sec delay).
High availability.

Step 2: Capacity Estimation

DAU: 250M
Tweets/day: 500M (avg 2/user)
Tweets/sec: 500M / 86,400 ≈ 5,800 writes/sec
Read QPS: 5,800 × 100 = 580K timeline reads/sec
Per tweet: ~1KB
Daily tweet storage: 500M × 1KB = 500GB
Timeline cache (800 tweets per user): 1KB × 800 × 250M = 200TB
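The arithmetic above can be reproduced with a few lines of Python. This is only a back-of-envelope sketch using the figures from the text with decimal units (1KB = 1,000 bytes). The `SNOWFLAKE_ID_BYTES` constant is an assumption added for comparison: if the cache holds only 64-bit tweet IDs rather than full 1KB tweets, the estimate shrinks considerably.

```python
# Back-of-envelope capacity estimate using the figures from the text.
DAU = 250_000_000
TWEETS_PER_DAY = 500_000_000
READ_WRITE_RATIO = 100
TWEET_SIZE_BYTES = 1_000        # ~1KB per tweet, decimal units
TIMELINE_LENGTH = 800           # tweets kept per cached timeline
SNOWFLAKE_ID_BYTES = 8          # assumption: a tweet ID is a 64-bit integer
SECONDS_PER_DAY = 86_400

writes_per_sec = TWEETS_PER_DAY / SECONDS_PER_DAY
reads_per_sec = writes_per_sec * READ_WRITE_RATIO
daily_storage_gb = TWEETS_PER_DAY * TWEET_SIZE_BYTES / 1e9

# Cache size if every timeline entry is a full ~1KB tweet (the text's figure).
cache_full_tweets_tb = TWEET_SIZE_BYTES * TIMELINE_LENGTH * DAU / 1e12
# Cache size if every timeline entry is only a tweet ID (assumption above).
cache_ids_only_tb = SNOWFLAKE_ID_BYTES * TIMELINE_LENGTH * DAU / 1e12

print(f"writes/sec       ~ {writes_per_sec:,.0f}")
print(f"reads/sec        ~ {reads_per_sec:,.0f}")
print(f"daily storage    = {daily_storage_gb:,.0f} GB")
print(f"cache, full body = {cache_full_tweets_tb:,.0f} TB")
print(f"cache, IDs only  = {cache_ids_only_tb:,.1f} TB")
```

The exact quotient is about 5,787 writes/sec; the text rounds it to 5,800 before multiplying by the read ratio.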

Step 3: API Design

POST /tweet { text, mediaUrls? }
GET /timeline?cursor=X&limit=20
POST /follow { userId }
GET /user/:id/tweets
GET /search?q=...

Step 4: Data Model

User: { id, name, handle, follower_count, ... }
Tweet: { id (Snowflake), user_id, text, media_urls[], created_at, reply_to, retweet_of }
Follow: { follower_id, followee_id, ts }
Timeline (cache): { user_id, tweet_ids[] (latest 800) }
Engagement: { tweet_id, likes, retweets, replies }
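To make the Tweet and Follow records concrete, here is a minimal sketch as Python dataclasses. The field names follow the data model above; the concrete types and the 280-character check in `__post_init__` are assumptions for illustration, not a description of Twitter's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_TWEET_CHARS = 280  # limit taken from the functional requirements


@dataclass
class Tweet:
    id: int                       # Snowflake ID, time-ordered
    user_id: int
    text: str
    created_at: float             # assumption: Unix timestamp in seconds
    media_urls: List[str] = field(default_factory=list)
    reply_to: Optional[int] = None     # ID of the tweet being replied to
    retweet_of: Optional[int] = None   # ID of the tweet being retweeted

    def __post_init__(self):
        # Reject over-long text at construction time.
        if len(self.text) > MAX_TWEET_CHARS:
            raise ValueError("tweet text exceeds 280 characters")


@dataclass
class Follow:
    follower_id: int
    followee_id: int
    ts: float                     # when the follow happened
```

A reply is just a `Tweet` whose `reply_to` is set, and a retweet is one whose `retweet_of` is set, so both reuse the same record.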

The Big Question: Timeline Generation

Three approaches:

Approach 1: Pull (Read-time fan-out)

When a user opens the timeline, fetch the recent tweets of every followee and sort them.

Pros: low storage, fast writes.
Cons: slow reads, because data from N followees must be fetched.
Use case: inactive users, users who follow few accounts.
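The pull approach amounts to a k-way merge at read time. The sketch below assumes each author's tweets are already stored newest first as `(created_at, tweet_id)` pairs; `pull_timeline`, `follows`, and `tweets_by_user` are hypothetical names, and plain dicts stand in for the real stores.

```python
import heapq
from itertools import islice


def pull_timeline(user_id, follows, tweets_by_user, limit=20):
    """Build a home timeline at read time.

    follows:        dict of user_id -> list of followee ids
    tweets_by_user: dict of user_id -> list of (created_at, tweet_id),
                    sorted newest first
    """
    # One sorted stream per followee: this is the N-way fetch that makes reads slow.
    per_followee = [tweets_by_user.get(f, []) for f in follows.get(user_id, [])]
    # Merge the descending streams lazily and stop after `limit` items.
    merged = heapq.merge(*per_followee, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]


follows = {"alice": ["bob", "carol"]}
tweets_by_user = {
    "bob": [(30, "b2"), (10, "b1")],
    "carol": [(20, "c1")],
}
print(pull_timeline("alice", follows, tweets_by_user))
```

Nothing is written when a tweet is posted, which is why writes are cheap; the cost shows up as one fetch per followee on every read.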

Approach 2: Push (Write-time fan-out)

When a tweet is posted, insert it into the timeline cache of every follower.

Pros: very fast reads, since timelines are pre-computed.
Cons: expensive writes, and a disaster for celebrities.
Celebrity problem: one tweet from Elon Musk → 150M timeline writes.

Approach 3: Hybrid (Twitter's Choice)

Tweets from most users are pushed; tweets from celebrities are pulled.

User with < 1M followers → push.
Celebrity (1M+ followers) → pull at read time.
A user's timeline = pre-computed timeline + live fetch of followed celebrities' tweets, merged together.
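The read side of the hybrid approach can be sketched as a merge of the pre-computed timeline with the recent tweets of followed celebrities. The sketch relies on tweet IDs being time-ordered (Snowflake), so sorting by ID is sorting by time. `uses_push`, `hybrid_timeline`, and the dict standing in for the celebrity tweet cache are hypothetical names for illustration.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 1_000_000  # the 1M-follower cutoff from the text


def uses_push(follower_count):
    """True if this author's tweets are fanned out at write time."""
    return follower_count < CELEBRITY_THRESHOLD


def hybrid_timeline(precomputed_ids, followed_celebrities,
                    recent_ids_by_author, limit=20):
    """Merge the pushed timeline with celebrity tweets pulled at read time.

    All ID lists are time-ordered tweet IDs, newest (largest) first.
    """
    sources = [precomputed_ids]
    sources += [recent_ids_by_author.get(c, []) for c in followed_celebrities]
    merged = heapq.merge(*sources, reverse=True)
    return list(islice(merged, limit))


timeline = hybrid_timeline(
    precomputed_ids=[105, 102],
    followed_celebrities=["elon"],
    recent_ids_by_author={"elon": [104, 101]},
)
print(timeline)
```

The merge only touches the handful of celebrities a user follows, so the read stays close to the cost of a pure push read.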

Step 5: Architecture

[Client]
   ↓
[CDN] [LB]
   ↓
[API Gateway]
   ↓                  ↓                        ↓
[Tweet Service]    [Timeline Service]       [User Service]
   ↓                  ↓                        ↓
[Kafka]            [Redis Timeline Cache]   [User DB]
   ↓
[Fan-out Workers] → [Followers' Timelines]
   ↓
[Tweet Storage] (Cassandra/Manhattan)
[Search Index] (Elasticsearch)

Step 6: Components

Tweet Storage

Cassandra/Manhattan (Manhattan is Twitter's internal database).
Sharded by user_id.
Snowflake ID: time-ordered.
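To show why a Snowflake ID is time-ordered, here is a minimal generator sketch. It assumes the commonly described 64-bit layout (41-bit millisecond timestamp, 10-bit machine ID, 12-bit per-millisecond sequence) and the custom epoch from the open-source Snowflake project; the class and function names are hypothetical, and this is not Twitter's production code.

```python
import threading
import time

TWITTER_EPOCH_MS = 1288834974657   # custom epoch used by open-source Snowflake
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, machine_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self._clock = clock          # injectable for testing
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            # Timestamp occupies the high bits, so IDs sort by creation time.
            return (((now - TWITTER_EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                    | (self.machine_id << SEQUENCE_BITS)
                    | self._sequence)


def timestamp_ms(snowflake_id):
    """Recover the creation time (Unix ms) embedded in an ID."""
    return (snowflake_id >> (MACHINE_BITS + SEQUENCE_BITS)) + TWITTER_EPOCH_MS
```

Because the timestamp sits in the most significant bits, sorting tweets by ID gives chronological order without a separate `created_at` index.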

Timeline Cache (Redis)

Latest 800 tweet IDs per user.
Sorted by time.
Eviction: inactive users (no login for 3 days).
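The three rules above (800-entry cap, newest first, eviction after 3 days without login) can be sketched with an in-memory stand-in for Redis. `TimelineCache` and its methods are hypothetical names; in a real deployment the same behaviour would typically come from Redis list commands plus key expiry rather than Python objects.

```python
from collections import deque
from itertools import islice

TIMELINE_LIMIT = 800
INACTIVE_AFTER_SECONDS = 3 * 24 * 3600  # 3 days without login


class TimelineCache:
    """In-memory stand-in for the per-user Redis timeline cache."""

    def __init__(self):
        self._timelines = {}    # user_id -> deque of tweet IDs, newest first
        self._last_login = {}   # user_id -> timestamp of last login (seconds)

    def record_login(self, user_id, now):
        self._last_login[user_id] = now

    def prepend(self, user_id, tweet_id):
        timeline = self._timelines.setdefault(
            user_id, deque(maxlen=TIMELINE_LIMIT))
        # With maxlen set, appendleft drops the oldest ID from the right end.
        timeline.appendleft(tweet_id)

    def latest(self, user_id, limit=20):
        return list(islice(self._timelines.get(user_id, ()), limit))

    def evict_inactive(self, now):
        """Drop timelines of users with no login in the last 3 days."""
        stale = [
            user_id for user_id in self._timelines
            if now - self._last_login.get(user_id, float("-inf"))
            > INACTIVE_AFTER_SECONDS
        ]
        for user_id in stale:
            del self._timelines[user_id]
        return stale
```

An evicted user is not lost: on the next login the timeline can be rebuilt with a read-time pull and then cached again.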

Fan-out Service

New tweet → Kafka event.
A worker fetches the followers of the tweet's author.
It prepends the tweet ID to each follower's timeline cache.
Celebrities are skipped; their tweets are merged at read time.
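The worker steps above can be sketched as follows. A standard-library `queue.Queue` stands in for the Kafka topic and plain dicts stand in for the follower store and the Redis cache; `fan_out`, `run_worker`, and the event shape are assumptions for illustration only.

```python
import queue
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 1_000_000
TIMELINE_LIMIT = 800


def fan_out(event, followers_of, follower_counts, timelines):
    """Push one tweet into its author's followers' timelines.

    Returns the number of timeline writes performed.
    """
    author = event["user_id"]
    if follower_counts.get(author, 0) >= CELEBRITY_THRESHOLD:
        return 0  # celebrity: skip fan-out, readers merge at read time
    followers = followers_of.get(author, [])
    for follower in followers:
        timelines[follower].appendleft(event["tweet_id"])
    return len(followers)


def run_worker(events, followers_of, follower_counts, timelines):
    """Drain the queue (stand-in for a Kafka consumer loop)."""
    writes = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return writes
        writes += fan_out(event, followers_of, follower_counts, timelines)


followers_of = {"bob": ["alice", "dave"], "elon": ["alice"]}
follower_counts = {"bob": 2, "elon": 150_000_000}
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))

events = queue.Queue()
for tweet_id, author in [(1, "bob"), (2, "elon"), (3, "bob")]:
    events.put({"tweet_id": tweet_id, "user_id": author})

total_writes = run_worker(events, followers_of, follower_counts, timelines)
print(total_writes, list(timelines["alice"]))
```

Because the fan-out runs behind a queue, posting a tweet returns as soon as the event is enqueued; followers see it once a worker catches up, which is where the 1-2 second delay comes from.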

Search

Tweet → Elasticsearch index.
Hashtag and full-text search.

Celebrity Problem in Detail

When Elon Musk posts a tweet:

Pure Push

150M timeline writes
Massive backend load
Slow delivery to followers
Storage explosion

Hybrid Approach

Celebrity tweet → no fan-out
Merged into followers' timelines at read time
Celebrity's recent tweets served from cache
Manageable cost

Trending Topics

Stream processing (Storm/Heron).
Hashtag count over a sliding window.
Real-time count + decay function.
Geographic trending.
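A sliding-window hashtag count with decay can be sketched in a few lines. The exponential half-life weighting is an assumption chosen for illustration (the text only says "decay function"), `TrendingTracker` is a hypothetical name, and a real pipeline would run this logic inside a stream processor rather than a single Python process.

```python
from collections import defaultdict, deque


class TrendingTracker:
    """Sliding-window hashtag scores with exponential decay.

    Assumes events are recorded in non-decreasing timestamp order.
    """

    def __init__(self, window_seconds=3600, half_life_seconds=600):
        self.window_seconds = window_seconds
        self.half_life_seconds = half_life_seconds
        self._events = deque()   # (timestamp, hashtag), oldest first

    def record(self, hashtag, ts):
        self._events.append((ts, hashtag))

    def _expire(self, now):
        # Drop events that have slid out of the window.
        while self._events and now - self._events[0][0] > self.window_seconds:
            self._events.popleft()

    def top(self, now, k=10):
        """Return up to k (hashtag, score) pairs, highest score first."""
        self._expire(now)
        scores = defaultdict(float)
        for ts, hashtag in self._events:
            # Each mention loses half its weight every half-life.
            scores[hashtag] += 0.5 ** ((now - ts) / self.half_life_seconds)
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
```

With this weighting, a hashtag with a few very recent mentions can outrank one with many older mentions. Geographic trending could reuse the same structure by keeping one tracker per region.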

Scale Considerations

Read Path

Multi-tier cache (CDN → Redis → DB).
Connection pooling.
Pagination.
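Pagination on the read path pairs naturally with the `cursor` parameter of `GET /timeline`. The sketch below assumes the cursor is the ID of the last tweet the client has seen and that the timeline is a list of time-ordered IDs, newest first; `timeline_page` is a hypothetical helper, not a documented API.

```python
def timeline_page(tweet_ids, cursor=None, limit=20):
    """Return (page, next_cursor) for a newest-first list of tweet IDs.

    cursor: ID of the last tweet already delivered, or None for the first page.
    next_cursor is None when there are no more tweets.
    """
    if cursor is None:
        start = 0
    else:
        # First position holding a tweet strictly older than the cursor.
        start = next(
            (i for i, tweet_id in enumerate(tweet_ids) if tweet_id < cursor),
            len(tweet_ids),
        )
    page = tweet_ids[start:start + limit]
    has_more = start + limit < len(tweet_ids)
    next_cursor = page[-1] if page and has_more else None
    return page, next_cursor


ids = list(range(50, 0, -1))          # 50 tweets, newest first
page1, cursor1 = timeline_page(ids)
page2, cursor2 = timeline_page(ids, cursor=cursor1)
page3, cursor3 = timeline_page(ids, cursor=cursor2)
print(page1[0], page1[-1], cursor1)
```

A cursor based on tweet ID stays stable when new tweets are prepended between requests, which an offset-based page number would not.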

Write Path

Async fan-out via Kafka.
Eventual consistency with a 1-2 sec delay is acceptable.

Storage

Tweets sharded by user_id.
Old tweets archived to cold storage.

Real World

250M+ DAU.
500M+ tweets/day.
Manhattan: Twitter's internal distributed database.
Heron: stream processing.
Mesos: container orchestration.

Trade-offs

Push: write-heavy, fast reads, a disaster for celebrities.
Pull: read-heavy, scalable writes, slow reads for active users.
Hybrid: more complex, but handles both cases well.
Eventual consistency: a tweet arriving 1-2 sec late is fine.

Engineering Lessons

A hybrid approach is often best for skewed distributions.
Pre-computation trades storage for speed.
Eventual consistency is acceptable for social feeds.
Identify edge cases early (here, celebrities).
Multi-tier caching is essential.

📌 Chapter Summary

Twitter timeline = read-heavy (100:1).
Pull, Push, Hybrid: three strategies.
Hybrid: push for normal users, pull for celebrities.
Redis timeline cache + Cassandra storage.
Async fan-out via Kafka.