Building a URL Shortener That Scales

system-design url-shortener distributed-systems backend architecture

URL shorteners like bit.ly and TinyURL handle billions of redirects per month. Behind their simple interface lie interesting engineering challenges: generating unique IDs, handling massive read traffic, and tracking analytics without impacting performance.

I’ve built URL shortening services for two different companies. This post covers the architecture decisions, implementation details, and production lessons I’ve learned.

Core Requirements

Functional

  • URL shortening: Convert long URLs to short codes
  • URL redirection: Redirect short codes to original URLs
  • Custom aliases: Allow users to choose their own short codes
  • Analytics: Track clicks, locations, devices, referrers
  • Expiration: Support TTL for temporary links
  • User management: Associate URLs with user accounts

Non-Functional

  • High availability: 99.99% uptime (52 minutes downtime per year)
  • Low latency: <100ms redirect time globally
  • High throughput: Handle 10,000+ redirects per second
  • Scalability: Support billions of URLs
  • Durability: Zero data loss

Capacity Planning

Traffic Estimation

Starting assumptions:

  • 100 million URLs created per month
  • 100:1 read-to-write ratio
  • Average URL lifetime: 5 years

Calculations:

Writes:
- 100M URLs/month = 40 URLs/second
- Peak (3x average): 120 URLs/second

Reads:
- 100:1 ratio = 10B redirects/month
- 4,000 redirects/second average
- Peak: 12,000 redirects/second

Storage Requirements

Per URL:
- Short code: 7 bytes
- Original URL: 500 bytes (average)
- Metadata: 100 bytes
- Total: ~600 bytes

Storage over 5 years:
- 100M × 12 × 5 = 6 billion URLs
- 6B × 600 bytes = 3.6 TB
- With 3x replication: ~11 TB

Cache:
- Cache the top 20% of hot URLs
- 20% of 3.6 TB ≈ 0.7 TB, less with compression
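These estimates are easy to sanity-check in a few lines of Python; the constants below simply restate the assumptions above:

SECONDS_PER_MONTH = 30 * 24 * 3600            # ~2.6 million seconds

urls_per_month = 100_000_000
writes_per_sec = urls_per_month / SECONDS_PER_MONTH    # ~39/s, ~120/s at 3x peak
reads_per_sec = writes_per_sec * 100                   # ~3,900/s, ~12,000/s at peak

total_urls = urls_per_month * 12 * 5                   # 6 billion over 5 years
raw_storage_tb = total_urls * 600 / 1e12               # ~3.6 TB
replicated_tb = raw_storage_tb * 3                     # ~10.8 TB with 3x replication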

Generating Short Codes

This is the most interesting part. We need short codes that are:

  • Unique (no collisions)
  • Short (6-7 characters)
  • Unpredictable (for security)
  • Fast to generate

Base62 Encoding

Use 62 characters: [a-z, A-Z, 0-9]

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(num):
    if num == 0:
        return BASE62[0]

    result = []
    while num > 0:
        remainder = num % 62
        result.append(BASE62[remainder])
        num = num // 62

    return ''.join(reversed(result))

def base62_decode(string):
    num = 0
    for char in string:
        num = num * 62 + BASE62.index(char)
    return num

With 7 characters: 62^7 = 3.5 trillion possible codes
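A quick round trip with the helpers above:

>>> base62_encode(125)
'21'
>>> base62_decode('21')
125
>>> base62_encode(62**7 - 1)    # the largest 7-character code
'ZZZZZZZ'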

Approach 1: Hash-Based Generation

import hashlib

def generate_short_code_hash(long_url):
    """
    Generate short code by hashing URL
    """
    # Create hash
    hash_value = hashlib.md5(long_url.encode()).hexdigest()

    # Take the first 48 bits (12 hex chars) so the encoded value can
    # reach 7 characters, then truncate to 7
    short_code = base62_encode(int(hash_value[:12], 16))[:7]

    # Handle collisions
    attempt = 0
    while exists_in_database(short_code):
        # Append attempt counter and rehash
        new_input = f"{long_url}_{attempt}"
        hash_value = hashlib.md5(new_input.encode()).hexdigest()
        short_code = base62_encode(int(hash_value[:12], 16))[:7]
        attempt += 1

    return short_code

Pros:

  • Same URL always gets same short code (deduplication)
  • No coordination between servers needed

Cons:

  • Collision handling adds complexity
  • Extra database checks slow down writes

Approach 2: Counter-Based Generation

class CounterService:
    """
    Distributed counter using Redis
    """
    def __init__(self, redis_client):
        self.redis = redis_client
        self.counter_key = "url_counter"

    def get_next_id(self):
        # Atomically increment and get counter
        counter = self.redis.incr(self.counter_key)
        return counter

def generate_short_code_counter():
    counter = counter_service.get_next_id()
    short_code = base62_encode(counter)

    # Pad to 7 characters
    return short_code.rjust(7, '0')

Pros:

  • Zero collisions
  • Fast generation
  • Simple implementation

Cons:

  • Sequential codes (predictable)
  • Single point of failure (Redis)

Solution for SPOF: Pre-allocate ranges to each server

class RangeAllocator:
    def __init__(self, redis_client, range_size=10000):
        self.redis = redis_client
        self.range_size = range_size
        self.current = None
        self.max = None

    def get_next_id(self):
        if self.current is None or self.current >= self.max:
            self._allocate_range()

        self.current += 1
        return self.current

    def _allocate_range(self):
        # Atomically allocate a range
        start = self.redis.incrby("range_counter", self.range_size)
        self.current = start - self.range_size
        self.max = start

Each server pre-allocates 10,000 IDs. If a server crashes, we lose at most 10,000 IDs, which is acceptable.
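Wiring the allocator into code generation looks roughly like this (a sketch; the Redis connection details are assumed):

import redis

redis_client = redis.Redis(host="localhost", port=6379)
range_allocator = RangeAllocator(redis_client, range_size=10000)

def generate_short_code_range():
    # Only every 10,000th call touches Redis; the rest are local increments
    return base62_encode(range_allocator.get_next_id()).rjust(7, '0')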

Approach 3: UUID-Based Generation

import uuid

def generate_short_code_uuid():
    """
    Generate short code from UUID
    """
    # Generate UUID
    unique_id = uuid.uuid4()

    # Convert to integer and encode
    num = unique_id.int
    short_code = base62_encode(num)[:7]

    # Check for collision (extremely rare with UUIDs)
    if exists_in_database(short_code):
        return generate_short_code_uuid()  # Regenerate

    return short_code

Pros:

  • No coordination needed
  • Unpredictable codes

Cons:

  • Still need collision checks (rare but possible)
  • Slightly slower than counter

My Recommendation

For production, I use counter-based with range allocation:

  1. Fast and collision-free
  2. Can add randomization to counter for unpredictability
  3. Range allocation solves single point of failure

import random

def generate_short_code():
    counter = range_allocator.get_next_id()

    # Mix in 8 random bits so consecutive counters don't yield
    # consecutive (guessable) codes
    randomized = (counter << 8) | random.randint(0, 255)

    return base62_encode(randomized)[:7]

Database Design

Schema

-- URLs table
CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    user_id BIGINT,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count BIGINT DEFAULT 0,
    is_active BOOLEAN DEFAULT TRUE
);

-- PostgreSQL defines indexes separately; the UNIQUE constraint on
-- short_code already creates an index for lookups
CREATE INDEX idx_urls_user_id ON urls (user_id);
CREATE INDEX idx_urls_created_at ON urls (created_at);

-- Analytics table
CREATE TABLE clicks (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) NOT NULL,
    clicked_at TIMESTAMP DEFAULT NOW(),
    ip_address VARCHAR(45),
    user_agent TEXT,
    referrer TEXT,
    country_code VARCHAR(2),
    city VARCHAR(100),
    device_type VARCHAR(20)
);

CREATE INDEX idx_clicks_short_code ON clicks (short_code);
CREATE INDEX idx_clicks_clicked_at ON clicks (clicked_at);

-- Users table (email and api_key are UNIQUE, so they are indexed automatically)
CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE NOT NULL,
    api_key VARCHAR(64) UNIQUE,
    tier VARCHAR(20) DEFAULT 'free',
    created_at TIMESTAMP DEFAULT NOW()
);

Database Choice

PostgreSQL for URLs table:

  • ACID guarantees
  • Excellent indexing
  • Can handle 10K+ writes/second on good hardware
  • Easy to shard when needed

Time-series database for analytics:

  • ClickHouse or TimescaleDB
  • Optimized for append-only workloads
  • Better compression for historical data
  • Fast aggregation queries
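As an illustration, per-day click counts for a short code become a simple scan-and-aggregate query. A sketch in Postgres/TimescaleDB syntax, using the same generic db.query helper as elsewhere in this post; column names follow the schema above:

def clicks_per_day(short_code, days=30):
    # Append-only clicks table; date_trunc works in both TimescaleDB and plain Postgres
    return db.query("""
        SELECT date_trunc('day', clicked_at) AS day, COUNT(*) AS clicks
        FROM clicks
        WHERE short_code = %s
          AND clicked_at > NOW() - %s * INTERVAL '1 day'
        GROUP BY day
        ORDER BY day
    """, short_code, days)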

Sharding Strategy

When a single database can’t handle the load:

import hashlib

def get_shard(short_code):
    """
    Hash-based sharding. Use a stable hash: Python's built-in hash()
    is randomized per process, so it can't be used for routing.
    """
    shard_count = 16
    shard_id = int(hashlib.md5(short_code.encode()).hexdigest(), 16) % shard_count
    return f"shard_{shard_id}"

This distributes URLs evenly across shards.

System Architecture

                        Users
                          │
                          ▼
                   ┌─────────────┐
                   │     CDN     │
                   │ (CloudFlare)│
                   └──────┬──────┘
                          │
                          ▼
                  ┌───────────────┐
                  │ Load Balancer │
                  └───────┬───────┘
                          │
              ┌───────────┼───────────┐
              │           │           │
         ┌────▼────┐ ┌────▼────┐ ┌────▼────┐
         │   App   │ │   App   │ │   App   │
         │ Server  │ │ Server  │ │ Server  │
         └────┬────┘ └────┬────┘ └────┬────┘
              │           │           │
              └───────────┼───────────┘
                          │
          ┌───────────────┼───────────────┐
          │               │               │
     ┌────▼────┐   ┌──────▼──────┐  ┌────▼─────┐
     │  Redis  │   │  PostgreSQL │  │  Kafka   │
     │  Cache  │   │   Cluster   │  │  Queue   │
     └─────────┘   └─────────────┘  └────┬─────┘
                                         │
                                    ┌────▼────┐
                                    │Analytics│
                                    │ Workers │
                                    └─────────┘

Request Flow

URL Creation:

  1. Client sends POST /api/urls with long URL
  2. App server generates short code
  3. Check if short code exists in Redis cache
  4. If not in cache, check database
  5. Insert into database
  6. Cache in Redis with 24-hour TTL
  7. Return short code to client
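A minimal creation handler covering these steps might look like the following (a sketch assuming FastAPI; generate_short_code and is_safe_url are the helpers shown elsewhere in this post, and the base domain is made up):

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, HttpUrl

app = FastAPI()
BASE_URL = "https://sho.rt"   # hypothetical short domain

class CreateRequest(BaseModel):
    long_url: HttpUrl

@app.post("/api/urls")
async def create_url(req: CreateRequest):
    long_url = str(req.long_url)
    if not is_safe_url(long_url):
        raise HTTPException(status_code=400, detail="URL failed safety check")

    short_code = generate_short_code()

    # Persist first, then cache with a 24-hour TTL (steps 5-6)
    await db.execute(
        "INSERT INTO urls (short_code, long_url) VALUES (%s, %s)",
        short_code, long_url
    )
    redis.setex(f"url:{short_code}", CACHE_TTL, long_url)

    return {"short_url": f"{BASE_URL}/{short_code}"}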

URL Redirection:

  1. Client accesses short URL (GET /:shortCode)
  2. Check Redis cache first
  3. If cache hit:
    • Immediately redirect (302)
    • Fire analytics event to Kafka asynchronously
  4. If cache miss:
    • Query database
    • Cache result in Redis
    • Redirect and fire analytics event
  5. Analytics workers consume Kafka events and batch write to analytics DB

Caching Strategy

Redis Cache Configuration

# Cache configuration
CACHE_TTL = 86400  # 24 hours
CACHE_SIZE = "100GB"  # Can hold ~150M URLs

# Cache structure
cache_key = f"url:{short_code}"
cache_value = {
    'long_url': 'https://example.com/...',
    'expires_at': 1735689600,
    'user_id': 12345
}
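The lookup used by the redirect path (referenced later as get_url) follows the standard cache-aside pattern. A sketch, with db.fetch_one as a hypothetical query helper:

async def get_url(short_code):
    cache_key = f"url:{short_code}"

    # 1. Try the cache first
    cached = redis.get(cache_key)
    if cached:
        return cached.decode()

    # 2. Fall back to the database
    row = await db.fetch_one(
        "SELECT long_url FROM urls WHERE short_code = %s AND is_active = TRUE",
        short_code
    )
    if row is None:
        return None

    # 3. Populate the cache for subsequent requests
    redis.setex(cache_key, CACHE_TTL, row.long_url)
    return row.long_url

This sketch caches the bare long URL; the richer structure shown above works the same way with JSON serialization.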

Cache Eviction

Use LRU (Least Recently Used):

  • Hot URLs stay in cache
  • Cold URLs get evicted automatically
  • Cache hit rate: 85-90%
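In Redis, LRU eviction is a server setting rather than application code. With redis-py it can be applied at startup (the same values can live in redis.conf instead):

# Evict least-recently-used keys once the memory ceiling is reached
redis.config_set('maxmemory', '100gb')
redis.config_set('maxmemory-policy', 'allkeys-lru')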

Cache Warming

Pre-populate cache with trending URLs:

def warm_cache():
    """
    Cache top URLs from last 24 hours
    """
    top_urls = db.query("""
        SELECT short_code, long_url, COUNT(*) as clicks
        FROM clicks
        WHERE clicked_at > NOW() - INTERVAL '24 hours'
        GROUP BY short_code, long_url
        ORDER BY clicks DESC
        LIMIT 100000
    """)

    for url in top_urls:
        redis.setex(
            f"url:{url.short_code}",
            CACHE_TTL,
            url.long_url
        )

Analytics Pipeline

Asynchronous Processing

Never block redirects for analytics:

import asyncio
import time

async def handle_redirect(short_code):
    # Get URL from cache/DB
    long_url = await get_url(short_code)

    if not long_url:
        return HTTP_404

    # Fire analytics event (non-blocking)
    asyncio.create_task(track_click(short_code, request))

    # Immediately redirect
    return redirect(long_url, status_code=302)

async def track_click(short_code, request):
    event = {
        'short_code': short_code,
        'timestamp': time.time(),
        'ip': request.client.host,
        'user_agent': request.headers.get('user-agent'),
        'referrer': request.headers.get('referer'),
    }

    # Send to Kafka
    await kafka_producer.send('clicks', value=event)

Batch Processing

Analytics workers batch write to database:

import time

class AnalyticsWorker:
    def __init__(self, batch_size=1000, flush_interval=5):
        self.batch = []
        self.batch_size = batch_size
        self.flush_interval = flush_interval  # seconds
        self.last_flush = time.monotonic()

    async def process_events(self):
        async for event in kafka_consumer:
            self.batch.append(event)

            # Flush when the batch is full or the flush interval has elapsed
            if (len(self.batch) >= self.batch_size
                    or time.monotonic() - self.last_flush >= self.flush_interval):
                await self.flush()

    async def flush(self):
        if not self.batch:
            return

        # Batch insert
        await db.execute_many(
            "INSERT INTO clicks VALUES (...)",
            self.batch
        )

        self.batch = []
        self.last_flush = time.monotonic()

Writing 1,000 events per insert reduces the number of analytics writes hitting the database by roughly three orders of magnitude.

Advanced Features

Custom Domains

Allow users to use their own domains:

-- Domain mapping table
CREATE TABLE custom_domains (
    domain VARCHAR(255) PRIMARY KEY,
    user_id BIGINT NOT NULL,
    verified BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT NOW()
);

def get_url_by_domain(domain, short_code):
    # Check custom domain
    custom = db.query("""
        SELECT u.long_url
        FROM urls u
        JOIN custom_domains d ON u.user_id = d.user_id
        WHERE d.domain = %s AND u.short_code = %s
    """, domain, short_code)

    return custom.long_url if custom else None

Link Expiration

Automatically expire old links:

# Cron job runs every hour
def expire_old_urls():
    # Grab the expired codes *before* flipping is_active,
    # otherwise the cache-clearing query below finds nothing
    expired = db.query("""
        SELECT short_code
        FROM urls
        WHERE expires_at < NOW()
        AND is_active = TRUE
    """)

    db.execute("""
        UPDATE urls
        SET is_active = FALSE
        WHERE expires_at < NOW()
        AND is_active = TRUE
    """)

    # Clear from cache
    for url in expired:
        redis.delete(f"url:{url.short_code}")

QR Code Generation

Generate QR codes for short URLs:

import qrcode

def generate_qr_code(short_url):
    qr = qrcode.QRCode(version=1, box_size=10, border=5)
    qr.add_data(short_url)
    qr.make(fit=True)

    img = qr.make_image(fill_color="black", back_color="white")
    return img
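The returned image uses the standard Pillow interface, so saving or serving it is one more line (the short URL here is made up):

img = generate_qr_code("https://sho.rt/abc1234")
img.save("abc1234.png")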

Security Considerations

Prevent Abuse

  1. Rate limiting: Limit URL creation per IP/user
  2. URL validation: Block malicious URLs (phishing, malware)
  3. Custom alias restrictions: Reserve keywords, prevent profanity
  4. Link scanning: Integrate with Google Safe Browsing API

import re

RESERVED_CODES = {'admin', 'api', 'www', 'app', 'help'}

def is_valid_custom_alias(alias):
    if alias in RESERVED_CODES:
        return False

    if len(alias) < 4 or len(alias) > 20:
        return False

    if not re.match(r'^[a-zA-Z0-9-_]+$', alias):
        return False

    return True
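For rate limiting (item 1 above), a fixed-window counter in Redis goes a long way. A sketch; the limits are illustrative:

def allow_creation(identifier, limit=100, window_seconds=3600):
    """
    Allow at most `limit` URL creations per IP or user per window.
    """
    key = f"ratelimit:{identifier}"
    count = redis.incr(key)
    if count == 1:
        # First request in this window starts the clock
        redis.expire(key, window_seconds)
    return count <= limit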

URL Validation

import requests

def is_safe_url(url):
    # Check against Google Safe Browsing (v4 threatMatches:find);
    # the API also requires a client identifier and threatEntryTypes
    response = requests.post(
        'https://safebrowsing.googleapis.com/v4/threatMatches:find',
        params={'key': GOOGLE_API_KEY},
        json={
            'client': {'clientId': 'url-shortener', 'clientVersion': '1.0'},
            'threatInfo': {
                'threatTypes': ['MALWARE', 'SOCIAL_ENGINEERING'],
                'platformTypes': ['ANY_PLATFORM'],
                'threatEntryTypes': ['URL'],
                'threatEntries': [{'url': url}]
            }
        }
    )

    return len(response.json().get('matches', [])) == 0

Monitoring

Key metrics to track:

# Performance metrics
- redirect_latency_p50, p95, p99
- url_creation_latency
- cache_hit_rate
- database_query_time

# Business metrics
- urls_created_per_minute
- redirects_per_minute
- top_referrers
- geographic_distribution
- device_breakdown

# System health
- redis_memory_usage
- database_connection_pool
- kafka_lag
- error_rate_4xx, 5xx
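Most of these can be exported straight from the app servers. A sketch using prometheus_client (metric names are illustrative, and the response object is assumed to expose status_code):

import time
from prometheus_client import Counter, Histogram

REDIRECTS = Counter('redirects_total', 'Redirects served', ['status'])
REDIRECT_LATENCY = Histogram('redirect_latency_seconds', 'Redirect latency')

async def instrumented_redirect(short_code):
    start = time.perf_counter()
    response = await handle_redirect(short_code)
    REDIRECT_LATENCY.observe(time.perf_counter() - start)
    REDIRECTS.labels(status=str(response.status_code)).inc()
    return response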

Production Lessons

  1. Cache everything: 90% of traffic hits 10% of URLs
  2. Fail open: If analytics fails, don’t block redirects
  3. Pre-allocate IDs: Avoid single point of failure
  4. Use CDN: Reduce latency globally with edge caching
  5. Batch writes: Analytics can be eventually consistent
  6. Monitor abuse: Bad actors will try to exploit your service
  7. Plan for scale: Database sharding is far easier to plan upfront than to retrofit later

Conclusion

Building a production URL shortener involves more than generating short codes. The challenge is handling massive traffic, maintaining low latency globally, and tracking analytics without impacting performance.

The architecture I’ve outlined handles billions of URLs and tens of thousands of requests per second. Start simple with a single database and Redis cache, then scale horizontally as traffic grows.

Reach out on Twitter or LinkedIn if you want to discuss URL shortener implementations or system design patterns.
