How I Approach System Design Interviews: A Framework That Actually Works
System design interviews are challenging. Unlike coding interviews where you write actual code, you’re expected to architect entire systems while explaining your thinking out loud. The questions are deliberately open-ended, which makes it hard to know where to start.
This framework helped me navigate these interviews systematically. I’ll demonstrate it by designing a URL shortener (like bit.ly), walking through each step with concrete examples you can apply to any system design problem.
Why URL Shortener as an Example?
I picked URL shorteners because they’re deceptively simple. Everyone has used short links, but building one involves interesting technical challenges around distributed systems, caching, and database design. Interviewers frequently use this problem because it covers core concepts without being overwhelming.
The Framework: RUDE-CAT
The name is easy to remember. Here’s what it stands for:
- Requirements - What are we building?
- Usage Estimates - Expected traffic and scale
- Data Model & API Design - Interface design
- Entity Relationships & Storage - Data storage strategy
- Component Design - System architecture
- Advanced Topics - Performance and reliability
- Trade-offs & Bottlenecks - What can fail
Step 1: Requirements (~15% of interview time)
Never jump straight into designing. I’ve made this mistake before. Always start by clarifying requirements—it prevents you from solving the wrong problem.
Functional Requirements
For a URL shortener, I clarify these features:
- URL shortening: users provide a long URL and receive a short code
- Redirect: short URLs redirect to the original long URL
- Custom aliases: can users choose their own short codes?
- Analytics: do we track click counts, locations, referrers?
- Expiration: should links have a TTL?
Start with the one or two core features (shortening and redirecting), then discuss the optional ones as time allows.
Non-Functional Requirements
For a URL shortener, I prioritize:
- High availability (99.9% uptime) - dead links damage user trust
- Low latency (under 100ms for redirects) - users expect instant redirects
- Scalability - must handle millions of URLs and billions of clicks
- Durability - links shouldn’t disappear once created
For this system, availability and speed matter more than perfect consistency. Analytics can tolerate slight delays, but slow or failed redirects are unacceptable.
Step 2: Usage Estimates (~10% of interview time)
Time for capacity planning. Rough estimates are fine—interviewers care about your approach, not precise arithmetic.
Scale Assumptions
Starting assumptions:
- 100 million new URLs created per month
- 100:1 read-to-write ratio (way more redirects than new URLs)
- URLs are stored forever (we can discuss deletion later)
Quick calculations:
New URLs (writes):
- 100M URLs/month
- That's roughly 40 URLs/second
- At peak: maybe 120 URLs/second (I usually assume 3x)
Redirects (reads):
- 100:1 ratio means 10 billion redirects/month
- That's about 4,000 redirects/second
- Peak: around 12,000 redirects/second
How Much Storage Do We Need?
Each URL entry needs:
- Short code: 7 bytes
- Original URL: ~500 bytes (being generous)
- Metadata: ~100 bytes (timestamps, user info, etc.)
- Total: roughly 600 bytes per URL
Storage over time:
- Year 1: 100M × 12 months × 600 bytes = 720 GB
- Year 5: about 3.6 TB
- With 3x replication: ~11 TB total
Cache size (80/20 rule: 20% of URLs drive 80% of traffic):
- Caching the top 20% of year-one URLs (~240M entries × 600 bytes ≈ 150 GB) means budgeting roughly 200 GB
Round liberally and use powers of 10. The goal is demonstrating that you think about scale systematically.
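If you want to double-check the arithmetic live, the whole estimate fits in a few lines of Python (the numbers simply restate the assumptions above):

SECONDS_PER_MONTH = 30 * 24 * 3600                 # ~2.6 million seconds

new_urls_per_month = 100_000_000
write_qps = new_urls_per_month / SECONDS_PER_MONTH # ~40 writes/sec
peak_write_qps = 3 * write_qps                     # ~120 writes/sec

read_qps = 100 * write_qps                         # ~4,000 redirects/sec
peak_read_qps = 3 * read_qps                       # ~12,000 redirects/sec

bytes_per_url = 600
year_one_bytes = new_urls_per_month * 12 * bytes_per_url   # ~720 GB
five_year_bytes = 5 * year_one_bytes                        # ~3.6 TB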
Step 3: Data Model & API Design (~15% of interview time)
I keep the API design simple and RESTful; a minimal endpoint sketch follows the list of endpoints below.
API Design
1. Create Short URL
POST /api/v1/urls
{
  "long_url": "https://example.com/very/long/path",
  "custom_alias": "my-link",    // optional
  "expiry_date": "2026-12-31"   // optional
}
Response:
{
  "short_url": "https://short.ly/abc123",
  "long_url": "https://example.com/very/long/path",
  "created_at": "2026-01-06T10:00:00Z"
}
2. Redirect
GET /:shortCode
→ Returns 302 redirect to long URL
3. Analytics
GET /api/v1/urls/:shortCode/stats
{
  "short_url": "https://short.ly/abc123",
  "total_clicks": 1523,
  "clicks_by_date": [...],
  "top_locations": [...]
}
4. Delete URL
DELETE /api/v1/urls/:shortCode
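To make the contract concrete, here is a minimal Flask sketch of the create and redirect endpoints. It is only a sketch: an in-memory dict stands in for the database, and the random 7-character code is a placeholder for the generation strategies in Step 6.

import random
import string

from flask import Flask, request, jsonify, redirect

app = Flask(__name__)
url_store = {}                                     # in-memory stand-in for the urls table
ALPHABET = string.ascii_letters + string.digits    # base62 alphabet

@app.route("/api/v1/urls", methods=["POST"])
def create_url():
    body = request.get_json()
    short_code = "".join(random.choices(ALPHABET, k=7))   # placeholder generator
    url_store[short_code] = body["long_url"]
    return jsonify({
        "short_url": f"https://short.ly/{short_code}",
        "long_url": body["long_url"],
    }), 201

@app.route("/<short_code>")
def follow_short_url(short_code):
    long_url = url_store.get(short_code)
    if long_url is None:
        return "Not found", 404
    return redirect(long_url, code=302)            # 302, as specified above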
Database Schema
-- URL mappings
CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    user_id BIGINT,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    is_active BOOLEAN DEFAULT TRUE
);
-- The UNIQUE constraint on short_code already creates the lookup index
CREATE INDEX idx_urls_user_id ON urls (user_id);

-- Analytics tracking
CREATE TABLE clicks (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) NOT NULL,
    clicked_at TIMESTAMP DEFAULT NOW(),
    ip_address VARCHAR(45),
    user_agent TEXT,
    referrer TEXT,
    country VARCHAR(2)
);
CREATE INDEX idx_clicks_short_code ON clicks (short_code);
CREATE INDEX idx_clicks_clicked_at ON clicks (clicked_at);
Design rationale:
- short_code is VARCHAR(10): 7-character base62 codes give 62^7 ≈ 3.5 trillion combinations, with room left over for custom aliases
- A separate clicks table keeps analytics writes from blocking redirects
- The unique index on short_code makes redirect lookups fast
Step 4: Entity Relationships & Storage (~10% of interview time)
Choosing the Database
Start with PostgreSQL:
- ACID guarantees prevent data loss
- Excellent indexing for fast lookups
- Supports joins for analytics queries
- Proven reliability at scale
Consider NoSQL at 10K+ writes/second:
- Cassandra or DynamoDB
- Better horizontal scaling
- Eventually consistent model works for this use case
- Optimized for key-value lookups
Start simple with Postgres; you can migrate to NoSQL later if write volume demands it.
Sharding Strategy
When a single database reaches capacity:
Option 1: Hash-based sharding
- shard_id = hash(short_code) % num_shards
- ✅ Even distribution
- ❌ Resharding is complex
Option 2: Range-based sharding
- Shard by creation date
- ✅ Easy to add shards
- ❌ Write hotspots on newest shard
Option 3: Geographic sharding
- Shard by user location
- ✅ Data locality
- ❌ Uneven distribution
Hash-based sharding on short_code provides the most even distribution.
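A quick sketch of how a router would pick a shard. CRC32 is used here because Python's built-in hash() is randomized per process; the shard count of 8 is just an example.

import zlib

NUM_SHARDS = 8   # example value; pick a number with headroom for growth

def shard_for(short_code):
    # Stable, non-cryptographic hash of the short code, modulo the shard count
    return zlib.crc32(short_code.encode()) % NUM_SHARDS

print(shard_for("abc123"))   # the same code always routes to the same shard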
Step 5: Component Design (~30% of interview time)
This is where you design the architecture and explain how components interact.
High-Level Architecture
Users
|
↓
┌─────────────────┐
│ Load Balancer │
│ (CloudFlare) │
└────────┬────────┘
|
┌────────────┼────────────┐
↓ ↓ ↓
┌────────┐ ┌────────┐ ┌────────┐
│ Web │ │ Web │ │ Web │
│ Server │ │ Server │ │ Server │
└───┬────┘ └───┬────┘ └───┬────┘
└───────────┼───────────┘
↓
┌───────────────┐
│ App Servers │
│ (API Layer) │
└───────┬───────┘
|
┌───────────┼───────────┐
↓ ↓ ↓
┌────────┐ ┌─────────┐ ┌──────────┐
│ Redis │ │Postgres │ │ Analytics│
│ Cache │ │ DB │ │ Queue │
└────────┘ └─────────┘ └──────────┘
Component Responsibilities
Load Balancer:
- Distributes traffic across web servers
- Health checking
- SSL termination
- DDoS protection
Web Servers (stateless):
- Handle HTTP requests
- Route to API endpoints
- Serve static content
App Servers:
- Generate short codes
- Validate URLs
- Database operations
- Business logic
Redis Cache:
- Cache frequently accessed URLs
- 24-hour TTL
- Reduces database load by 80%+
PostgreSQL:
- Persistent storage
- Master handles writes
- Read replicas for queries
Analytics Queue:
- Asynchronous click processing
- Batch writes to analytics database
- Prevents blocking redirects
Step 6: Advanced Topics (~15% of interview time)
Deep dive into critical implementation details.
Short Code Generation
Three approaches to consider:
Approach 1: Hashing
import hashlib

def generate_short_code(long_url):
    # Hash the URL and interpret the digest as an integer
    digest = hashlib.md5(long_url.encode()).hexdigest()
    # Base62-encode it and keep the first 7 characters
    short_code = base62_encode(int(digest, 16))[:7]
    # Handle collisions (different URLs can map to the same 7 characters)
    if exists_in_db(short_code):
        short_code = handle_collision(short_code)
    return short_code
✅ Same URL always gets same short code ❌ Requires collision handling
Approach 2: Counter-based
def generate_short_code():
    # Get next counter from distributed service
    counter = get_next_counter()
    # Convert to base62
    short_code = base62_encode(counter)
    return short_code
✅ No collisions ❌ Predictable sequence
Approach 3: Random
def generate_short_code():
    while True:
        short_code = random_base62(7)
        if not exists_in_db(short_code):
            return short_code
✅ Unpredictable and simple ❌ Multiple database checks possible
Counter-based generation with ZooKeeper or Redis for distributed counter management provides the best balance.
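A minimal sketch of the Redis-backed version, using the standard redis-py client; the counter key name url:counter is an assumption. It also supplies the base62_encode helper used in the approaches above.

import redis

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n):
    # Convert a non-negative integer to its base62 representation
    if n == 0:
        return BASE62[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(BASE62[rem])
    return "".join(reversed(chars))

r = redis.Redis(host="localhost", port=6379)

def generate_short_code():
    counter = r.incr("url:counter")   # INCR is atomic, so no two codes collide
    return base62_encode(counter)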
Caching Strategy
Implementation:
When a redirect request comes in:
1. Check Redis first (key: short_code, value: long_url)
2. Cache HIT?
- Return the URL immediately
- Log analytics asynchronously
3. Cache MISS?
- Query the database
- Store in Redis with 24-hour TTL
- Return the URL
Eviction: LRU (kick out least recently used)
Size: ~200 GB (stores about 300M URLs)
Expected hit rate: ~85%
This reduces database load significantly—most requests never touch the database.
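A minimal cache-aside sketch of that flow, assuming the urls table from Step 3, redis-py, and psycopg2; the connection details are placeholders.

import redis
import psycopg2

r = redis.Redis(host="localhost", port=6379)
pg = psycopg2.connect("dbname=shortener")   # placeholder connection string
CACHE_TTL = 24 * 60 * 60                    # 24-hour TTL

def resolve(short_code):
    # 1. Check Redis first
    cached = r.get(f"url:{short_code}")
    if cached is not None:
        return cached.decode()              # cache hit
    # 2. Cache miss: fall back to Postgres
    with pg.cursor() as cur:
        cur.execute("SELECT long_url FROM urls WHERE short_code = %s", (short_code,))
        row = cur.fetchone()
    if row is None:
        return None                         # unknown code, 404 upstream
    long_url = row[0]
    # 3. Populate the cache for the next request
    r.set(f"url:{short_code}", long_url, ex=CACHE_TTL)
    return long_url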
Analytics Processing
Process analytics asynchronously so it never blocks redirects (a producer-side sketch follows the lists below):
Flow:
1. User visits short URL
2. Immediately redirect
3. Fire event to message queue (Kafka/RabbitMQ)
4. Worker processes batch events
5. Batch insert to analytics DB
Benefits:
- Redirect latency: <50ms
- Event aggregation before writes
- Replay capability if analytics DB fails
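On the producer side, the redirect path only needs to fire an event. Here is a sketch with kafka-python; the "clicks" topic name and broker address are assumptions.

import json
import time

from kafka import KafkaProducer   # kafka-python package

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def record_click(short_code, ip, user_agent, referrer):
    # Fire-and-forget: the redirect response never waits on analytics
    producer.send("clicks", {
        "short_code": short_code,
        "clicked_at": time.time(),
        "ip": ip,
        "user_agent": user_agent,
        "referrer": referrer,
    })

A worker on the consumer side reads these events in batches and bulk-inserts them into the clicks table.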
Rate Limiting
Using Redis for rate limiting (a token bucket works well; the sketch below is the simpler fixed-window variant, counting requests per IP per minute):
import redis

r = redis.Redis(host="localhost", port=6379)

def allow_request(ip):
    key = f"rate:{ip}:minute"
    count = r.incr(key)       # atomic increment
    if count == 1:
        r.expire(key, 60)     # start a fresh 1-minute window
    return count <= 100       # allow up to 100 requests per minute
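For a closer-to-true token bucket, each client gets a bucket that refills continuously. A single-process sketch (per-IP buckets and Redis-backed state are left out for brevity):

import time

class TokenBucket:
    def __init__(self, capacity=100, refill_per_sec=100 / 60):
        self.capacity = capacity              # burst size
        self.tokens = capacity
        self.refill_per_sec = refill_per_sec  # sustained rate
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False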
Step 7: Trade-offs & Bottlenecks (~5% of interview time)
Discuss potential failure modes and design trade-offs.
Potential Bottlenecks
- Database writes at massive scale → shard the database or migrate to NoSQL
- Counter service as a single point of failure → pre-allocate counter ranges to each server (see the sketch after this list)
- Hot URLs overwhelming the cache → add a CDN layer with multiple cache replicas
- High latency for global users → multi-region deployment with geo-routing
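One way to do that pre-allocation: each app server reserves a block of IDs with a single INCRBY and hands them out locally, so the counter store is hit once per block rather than once per URL. The key name and block size below are assumptions.

import redis

r = redis.Redis(host="localhost", port=6379)
BLOCK_SIZE = 10_000

class CounterBlock:
    """Hands out IDs from a locally reserved range; one instance per worker process."""

    def __init__(self):
        self.next_id = None
        self.block_end = None

    def next(self):
        if self.next_id is None or self.next_id > self.block_end:
            # Reserve the next BLOCK_SIZE ids in one round trip
            self.block_end = r.incrby("url:counter", BLOCK_SIZE)
            self.next_id = self.block_end - BLOCK_SIZE + 1
        value = self.next_id
        self.next_id += 1
        return value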
Design Trade-offs
| Decision | Trade-off |
|---|---|
| Eventually consistent analytics | Fast redirects vs. real-time accuracy |
| 7-character short codes | 3.5 trillion URLs vs. shorter codes |
| PostgreSQL over NoSQL | Simplicity vs. extreme write throughput |
| Cache-aside pattern | Some cache misses vs. complexity |
Common Mistakes to Avoid
- Jumping to solutions without clarifying requirements
- Over-engineering for unnecessary scale
- Ignoring failure modes and recovery strategies
- Skipping capacity estimates
- Not explaining trade-offs behind decisions
- Drawing boxes without explaining their purpose
Time Management
For a 45-minute interview:
- 0-7 min: Requirements & clarifications
- 7-12 min: Capacity estimates
- 12-20 min: API design & data model
- 20-35 min: Architecture & components
- 35-40 min: Deep dive (caching, encoding, scaling)
- 40-45 min: Trade-offs & Q&A
Reserve time for interviewer questions at the end.
Applying RUDE-CAT to Other Problems
The framework adapts to any system design problem:
- Instagram: Image storage, feed generation, content delivery
- Netflix: Video encoding, CDN strategy, recommendation systems
- Uber: Geo-spatial indexing, driver matching, real-time updates
- WhatsApp: WebSocket connections, message queues, end-to-end encryption
The steps remain constant. Only the implementation details change.
Final Thoughts
System design interviews don’t have perfect answers. Interviewers evaluate your thinking process, how you handle trade-offs, and your ability to communicate complex technical concepts clearly.
This framework helps me stay organized under pressure. It provides structure when the problem feels overwhelming.
Practice this framework on various problems. Draw diagrams. Explain your reasoning out loud. Time yourself. With repetition, the process becomes natural.
Reach out on Twitter or LinkedIn if you want to discuss system design or interview preparation.
Resources
Books:
- “Designing Data-Intensive Applications” by Martin Kleppmann
- “System Design Interview” by Alex Xu
Practice Platforms:
- LeetCode System Design
- Pramp - Mock interviews with peers