System Design: URL Shortener from Scratch
Deep dive into URL shortener architecture. Learn hash function design, redirect logic, data storage, rate limiting, and high-availability.
URL shorteners are deceptively simple systems. The core functionality just converts long URLs to short ones and redirects users when they visit the short URL. But building one that handles millions of users with low latency and high availability reveals interesting challenges.
This case study walks through designing a URL shortener like bit.ly or TinyURL.
Introduction
Functional Requirements
Users need to:
- Shorten a long URL into a compact link
- Access the short URL and get redirected to the original
- Optionally set expiration dates
- Optionally customize the short code
- View statistics on link usage
Non-Functional Requirements
The system must be:
- Fast: Redirects under 100ms
- Available: Handle service disruptions gracefully
- Scalable: Millions of links, billions of redirects
- Durable: No lost links
Capacity Estimation
Daily URL creations: 100 million
Redirect ratio: 100:1 (one hundred redirects per creation)
Storage needed over 5 years:
- 100M links/day × 365 days × 5 years = 182.5 billion links
- At 500 bytes per link: ~91 TB
Redirect QPS: 100M * 100 / 86400 = ~115,000 QPS
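The arithmetic above is easy to reproduce, which helps when you want to vary the inputs. The following is a minimal sketch that uses only the figures stated in this section; the constant and variable names are illustrative.

```python
# Back-of-envelope capacity estimate from the stated assumptions.
CREATES_PER_DAY = 100_000_000
REDIRECTS_PER_CREATE = 100
BYTES_PER_LINK = 500
RETENTION_YEARS = 5
SECONDS_PER_DAY = 86_400

total_links = CREATES_PER_DAY * 365 * RETENTION_YEARS
raw_storage_tb = total_links * BYTES_PER_LINK / 1e12
write_qps = CREATES_PER_DAY / SECONDS_PER_DAY
redirect_qps = CREATES_PER_DAY * REDIRECTS_PER_CREATE / SECONDS_PER_DAY

print(f"links stored:  {total_links / 1e9:.1f} billion")  # 182.5 billion
print(f"raw storage:   {raw_storage_tb:.2f} TB")           # 91.25 TB
print(f"write QPS:     {write_qps:,.0f}")                  # 1,157
print(f"redirect QPS:  {redirect_qps:,.0f}")               # 115,741
```

These are averages. Peak traffic and index overhead are not included.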
Core Components
Short Code Generation
The short code is the heart of the system. It needs to be:
- Unique
- Random enough to be unpredictable
- Short (6-8 characters typical)
- URL-safe (alphanumeric)
Hash Function Approaches
Three approaches generate short codes:
Approach 1: MD5/SHA hash of long URL + salt
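import hashlib
import base62  # third-party package: pybase62

def generate_short_code(url: str, salt: str = "mysalt") -> str:
    hash_input = f"{url}:{salt}"
    md5_hash = hashlib.md5(hash_input.encode()).hexdigest()
    # Reduce the 128-bit digest into the 8-character base62 keyspace.
    # (Eight hex digits carry only 32 bits, which is at most 6 base62 characters.)
    value = int(md5_hash, 16) % (62 ** 8)
    # Left-pad so every code is exactly 8 characters long
    return base62.encode(value).rjust(8, "0")

# Example
short = generate_short_code("https://example.com/very/long/url/path")
# Result: an 8-character code such as "xV2bP9qK"; the exact value depends on the salt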
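Problem: The same URL always produces the same code, so anyone who learns the salt can precompute the codes for known URLs and probe which ones exist. Two different URLs can also map to the same code, so a collision check is still required.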
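Approach 2: Hash + timestamp and random nonce for uniqueness
import hashlib
import random
import time
import base62  # third-party package: pybase62

def generate_short_code(url: str, salt: str = "mysalt") -> str:
    # Combine the URL with a timestamp and a random nonce so that repeated
    # submissions of the same URL produce different codes
    combined = f"{url}:{salt}:{time.time_ns()}:{random.randint(0, 999999)}"
    md5_hash = hashlib.md5(combined.encode()).hexdigest()
    # Collisions are unlikely but still possible; rely on the unique index and retry
    value = int(md5_hash, 16) % (62 ** 8)
    return base62.encode(value).rjust(8, "0")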
Approach 3: Counter-based (KGS approach)
Use a Key Generation Service that pre-generates short codes:
import base62  # third-party package: pybase62

class KeyGenerationService:
    def __init__(self, batch_size=1000):
        self.batch_size = batch_size
        self.available_keys = []

    def get_next_key(self) -> str:
        if not self.available_keys:
            self._refill_batch()
        return self.available_keys.pop()

    def _refill_batch(self):
        # _get_current_counter and _increment_counter are assumed to read and
        # update a counter in durable shared storage; they are not shown here.
        start = self._get_current_counter()
        # Persist the reservation before handing out any key from the range
        self._increment_counter(start + self.batch_size)
        for i in range(start, start + self.batch_size):
            self.available_keys.append(base62.encode(i))
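The counter approach guarantees uniqueness and allows easy key management. Encoding a raw counter does produce sequential, guessable codes, though (see Pitfall 1 below), so a common remedy is to shuffle the pre-generated keys or to generate them randomly ahead of time.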
Base62 Encoding
Base62 uses characters 0-9, A-Z, a-z giving 62 characters per position:
| Length | Possible Combinations | Capacity vs. 5-Year Estimate (182.5 billion links) |
|---|---|---|
| 6 | 62^6 = 56.8 billion | Too small: about 31% of the estimate |
| 7 | 62^7 = 3.5 trillion | About 19x the estimate |
| 8 | 62^8 = 218 trillion | About 1,200x the estimate |
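To make the encoding concrete, here is a small dependency-free sketch of base62 using the character order given above (0-9, A-Z, a-z). The function names are illustrative, and a library such as pybase62 may use the same or a different alphabet order.

```python
import string

# 0-9, then A-Z, then a-z: 62 characters in total
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def decode_base62(s: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n

# Keyspace sizes behind the table above
for length in (6, 7, 8):
    print(length, 62 ** length)
```

Encoding a counter value gives a code whose length grows only when the counter crosses a power of 62, so left-padding is needed if fixed-length codes are wanted.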
Data Model
Relational Schema
CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    -- UNIQUE already creates the index used for redirect lookups,
    -- so no separate constraint or index on short_code is needed
    short_code VARCHAR(12) NOT NULL UNIQUE,
    original_url TEXT NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    expires_at TIMESTAMP WITH TIME ZONE,
    is_custom BOOLEAN DEFAULT FALSE,
    creator_id BIGINT,
    click_count BIGINT DEFAULT 0,
    is_active BOOLEAN DEFAULT TRUE
);
CREATE INDEX idx_urls_creator ON urls(creator_id);
CREATE INDEX idx_urls_expires ON urls(expires_at) WHERE expires_at IS NOT NULL;
NoSQL Alternative (DynamoDB)
{
"TableName": "urls",
"KeySchema": [{ "AttributeName": "short_code", "KeyType": "HASH" }],
"AttributeDefinitions": [
{ "AttributeName": "short_code", "AttributeType": "S" },
{ "AttributeName": "creator_id", "AttributeType": "N" }
],
"GlobalSecondaryIndexes": [
{
"IndexName": "creator-index",
"KeySchema": [{ "AttributeName": "creator_id", "KeyType": "HASH" }],
"Projection": { "ProjectionType": "ALL" },
"ProvisionedThroughput": {
"ReadCapacityUnits": 100,
"WriteCapacityUnits": 50
}
}
],
"ProvisionedThroughput": {
"ReadCapacityUnits": 1000,
"WriteCapacityUnits": 500
}
}
Caching Strategy
Redirect latency is critical. Cache aggressively.
Cache Structure
# Redis cache key pattern
cache_key = f"url:{short_code}"
# Cache value
cache_value = {
"original_url": "https://example.com/very/long/path",
"expires_at": "2027-01-01T00:00:00Z",
"is_active": True
}
Cache TTL Strategy
CACHE_TTL = {
"frequently_accessed": 3600, # 1 hour for popular links
"recently_created": 300, # 5 minutes for new links
"custom": 86400, # 24 hours for custom links
"expired": 60 # 1 minute for recently expired
}
Write-Through Cache
from typing import Optional

async def create_short_url(url: str, custom_code: Optional[str] = None) -> str:
    short_code = custom_code or generate_short_code(url)

    # Write to database first so the cache never holds a link the database lacks
    await db.urls.create({
        "short_code": short_code,
        "original_url": url,
        "is_custom": custom_code is not None
    })

    # Write to cache
    await cache.set(f"url:{short_code}", {
        "original_url": url,
        "expires_at": None,
        "is_active": True
    }, ttl=CACHE_TTL["recently_created"])
    return short_code
Cache Miss Handling
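from datetime import datetime, timezone
from typing import Optional

async def get_original_url(short_code: str) -> Optional[str]:
    # Check cache first
    cached = await cache.get(f"url:{short_code}")
    if cached:
        return cached["original_url"]

    # Cache miss - fetch from database
    url_record = await db.urls.get(short_code=short_code)
    if not url_record or not url_record.is_active:
        return None
    # Treat expired links as missing (expires_at is timezone-aware)
    if url_record.expires_at and url_record.expires_at <= datetime.now(timezone.utc):
        return None

    # Populate cache; the TTL should not outlive expires_at
    await cache.set(f"url:{short_code}", {
        "original_url": url_record.original_url,
        "expires_at": url_record.expires_at,
        "is_active": url_record.is_active
    }, ttl=CACHE_TTL["recently_created"])
    return url_record.original_url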
Redirect Logic
HTTP Redirect Types
| Status | Use Case | Browser Behavior |
|---|---|---|
| 301 | Permanent move | Caches redirect |
| 302 | Temporary redirect | Not cached by default |
| 303 | POST -> GET | Converts to GET |
| 307 | Temporary | Preserves method |
| 308 | Permanent | Preserves method |
For URL shorteners, typically use 301 (permanent) for SEO or 302 (temporary) for analytics tracking.
Redirect Handler
import asyncio

from fastapi import FastAPI, HTTPException, status
from fastapi.responses import RedirectResponse

app = FastAPI()
# Hold references so pending tasks are not garbage-collected mid-flight
background_tasks = set()

@app.get("/{short_code}")
async def redirect_to_original(short_code: str):
    # Reserved paths are never short codes
    if short_code in ["health", "metrics", "docs"]:
        raise HTTPException(status_code=404)

    # Validate short code format (is_valid_short_code is an assumed helper)
    if not is_valid_short_code(short_code):
        raise HTTPException(status_code=400, detail="Invalid short code")

    # Get original URL
    original_url = await get_original_url(short_code)
    if not original_url:
        raise HTTPException(status_code=404, detail="URL not found")

    # Track click asynchronously so the redirect is not delayed
    task = asyncio.create_task(track_click(short_code))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)

    return RedirectResponse(
        url=original_url,
        status_code=status.HTTP_302_FOUND
    )
Rate Limiting
Prevent abuse with rate limits per IP:
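from fastapi import Body, HTTPException, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
# Register the limiter and its 429 handler on the FastAPI app defined earlier
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/shorten")
@limiter.limit("10/minute")
async def create_short_url(request: Request, url: str = Body(..., embed=True)):
    # embed=True makes FastAPI expect a JSON body of the form {"url": "..."}
    # Validate URL
    if not is_valid_url(url):
        raise HTTPException(status_code=400, detail="Invalid URL")

    # Check if already shortened by this user
    existing = await find_existing_mapping(url, request.client.host)
    if existing:
        return {"short_url": f"https://short.ly/{existing.short_code}"}

    short_code = await create_mapping(url)
    return {"short_url": f"https://short.ly/{short_code}"}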
High Availability Design
Database High Availability
-- PostgreSQL synchronous replication
ALTER SYSTEM SET synchronous_commit = on;
ALTER SYSTEM SET synchronous_standby_names = '*';
-- Read replicas for redirects
CREATE PUBLICATION url_shares FOR TABLE urls;
-- On replica
CREATE SUBSCRIPTION url_sub CONNECTION 'host=primary port=5432 dbname=urlshort' PUBLICATION url_shares;
Multiple Redis Instances
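import json
from typing import Optional

# The async cluster client from redis-py is used here so that calls can be
# awaited; the rediscluster package provides a blocking client.
from redis.asyncio.cluster import ClusterNode, RedisCluster

# Redis Cluster configuration
rc = RedisCluster(
    startup_nodes=[
        ClusterNode("redis-1", 6379),
        ClusterNode("redis-2", 6379),
        ClusterNode("redis-3", 6379)
    ],
    decode_responses=True
)

async def get_original_url(short_code: str) -> Optional[str]:
    # The client maps each key to a hash slot and follows the cluster's
    # redirections after a failover
    cached = await rc.get(f"url:{short_code}")
    return json.loads(cached)["original_url"] if cached else None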
Geographic Distribution
Deploy redirector clusters in multiple regions:
# Route 53 latency routing
# US redirector
- Name: short.ly
  Type: A
  SetIdentifier: us-east-1
  Region: us-east-1
  AliasTarget:
    DNSName: dualstack.api-elb-us-east-1.amazonaws.com
    EvaluateTargetHealth: true
# EU redirector
- Name: short.ly
  Type: A
  SetIdentifier: eu-west-1
  Region: eu-west-1
  AliasTarget:
    DNSName: dualstack.api-elb-eu-west-1.amazonaws.com
    EvaluateTargetHealth: true
Users are routed to the nearest cluster based on latency.
Analytics Pipeline
Track clicks without slowing redirects:
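from datetime import datetime, timezone
from typing import Optional

async def track_click(short_code: str, user_agent: Optional[str] = None,
                      referer: Optional[str] = None,
                      client_ip: Optional[str] = None):
    # Runs as a background task started by the redirect handler, so awaiting
    # the producer here does not delay the redirect response.
    # Request details are passed in by the caller; kafka and hash_ip are
    # assumed helpers that are not shown.
    await kafka.send("clicks", {
        "short_code": short_code,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_agent": user_agent,
        "referer": referer,
        "ip_hash": hash_ip(client_ip) if client_ip else None
    })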
Click Analytics Consumer
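import json

# aiokafka is used here in place of the blocking KafkaConsumer so that
# waiting for messages does not stall the event loop
from aiokafka import AIOKafkaConsumer

async def process_clicks():
    consumer = AIOKafkaConsumer("clicks", bootstrap_servers="kafka:9092")
    await consumer.start()
    try:
        async for message in consumer:
            event = json.loads(message.value)
            # Update click count in background
            await db.query("""
                UPDATE urls
                SET click_count = click_count + 1
                WHERE short_code = $1
            """, event["short_code"])
            # Update analytics warehouse
            await warehouse.insert("click_events", event)
    finally:
        await consumer.stop()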
Complete API Specification
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/shorten | Create short URL |
| GET | /{short_code} | Redirect to original |
| GET | /api/v1/links/{short_code} | Get link info |
| GET | /api/v1/links/{short_code}/stats | Get click statistics |
| DELETE | /api/v1/links/{short_code} | Delete a link |
| PUT | /api/v1/links/{short_code} | Update link settings |
Request/Response Examples
# Create short URL
curl -X POST https://short.ly/api/v1/shorten \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/very/long/path/that/needs/shortening"}'
# Response
{
"short_code": "xV2bP9qK",
"short_url": "https://short.ly/xV2bP9qK",
"original_url": "https://example.com/very/long/path/that/needs/shortening",
"created_at": "2026-03-22T10:30:00Z",
"expires_at": null
}
Abuse Prevention and Security
Malicious URL Detection
URL shorteners are frequently abused for phishing, malware distribution, and spam. Implement safeguards:
import asyncio
from urllib.parse import urlparse

# ThreatIntelClient, ThreatAssessment, CheckResult and the _check_* helpers
# that are not defined below are assumed to exist elsewhere.
class MaliciousURLDetector:
    """Detect potentially malicious URLs before shortening"""

    def __init__(self, threat_intel_client: ThreatIntelClient):
        self.threat_intel = threat_intel_client
        self.suspicious_tlds = {
            '.tk', '.ml', '.ga', '.cf', '.gq',  # Free tier often abused
            '.xyz', '.top', '.club'             # Often used in spam
        }

    async def check_url(self, url: str) -> ThreatAssessment:
        # Run all checks concurrently
        checks = await asyncio.gather(
            self._check_domain_reputation(url),
            self._check_url_pattern(url),
            self._check_content_scan(url),
            self._check_google_safe_browsing(url)
        )
        if any(check.threat for check in checks):
            return ThreatAssessment(
                threat=True,
                reason="URL flagged by security checks",
                severity="high"
            )
        return ThreatAssessment(threat=False)

    async def _check_url_pattern(self, url: str) -> CheckResult:
        # hostname excludes any port or userinfo that netloc would include
        host = urlparse(url).hostname or ""

        # Check for suspicious TLDs
        if any(host.endswith(tld) for tld in self.suspicious_tlds):
            return CheckResult(threat=True, reason="Suspicious TLD")

        # Check for IP address instead of domain
        if self._is_ip_address(host):
            return CheckResult(threat=True, reason="IP address used")

        # Check for excessive subdomains
        if host.count('.') > 4:
            return CheckResult(threat=True, reason="Excessive subdomains")

        return CheckResult(threat=False)
Rate Limiting Tiers
from fastapi import Request
from fastapi.responses import JSONResponse

RATE_LIMITS = {
    "anonymous": {"shorten": "5/hour", "redirect": "100/hour"},
    "authenticated_free": {"shorten": "100/hour", "redirect": "1000/hour"},
    "authenticated_pro": {"shorten": "10000/hour", "redirect": "100000/hour"},
}

@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next):
    # get_user_tier is assumed to return one of the RATE_LIMITS keys
    user_tier = get_user_tier(request)
    action = "shorten" if request.method == "POST" else "redirect"

    if user_tier == "anonymous":
        # Rate limit by IP
        key = f"ip:{request.client.host}"
    else:
        # Rate limit authenticated tiers by user
        key = f"user:{get_user_id(request)}"

    if not await rate_limiter.check_limit(key, RATE_LIMITS[user_tier][action]):
        # An HTTPException raised inside middleware bypasses FastAPI's
        # exception handlers, so return the 429 response directly
        return JSONResponse(status_code=429, content={"detail": "Rate limit exceeded"})

    return await call_next(request)
Spam Link Prevention
from typing import Optional

async def create_short_url(url: str, custom_code: Optional[str] = None,
                           user_id: Optional[int] = None) -> ShortUrl:
    # Validate URL format
    if not is_valid_url(url):
        raise HTTPException(status_code=400, detail="Invalid URL format")

    # Check for known spam domains
    if await spam_database.is_spam_domain(extract_domain(url)):
        raise HTTPException(status_code=403, detail="URL blocked")

    # Require authentication for custom codes
    if custom_code and not user_id:
        raise HTTPException(status_code=401, detail="Authentication required for custom codes")

    # Create short URL
    return await url_service.create(url, custom_code, user_id)
Production Failure Scenarios
| Failure Scenario | Impact | Mitigation |
|---|---|---|
| Redis cache failure | All redirects hit DB, high latency | Fallback to direct DB reads; circuit breaker on cache |
| KGS (key gen) failure | Cannot create new short URLs | Use hash-based codes as fallback; KGS recovery priority |
| Database primary failure | Cannot create or redirect | Promote read replica; use eventual consistency for analytics |
| DNS resolution failure | short.ly domain unreachable | Multi-cloud DNS; anycast IP; aggressive caching |
| CDN failure for stats page | Stats load slowly | Static asset caching; local caching |
Cache Failure Handling
import json
from typing import Optional

from redis.exceptions import ConnectionError as RedisConnectionError

async def get_original_url(short_code: str) -> Optional[str]:
    try:
        # Try cache first (redis is the async client instance)
        cached = await redis.get(f"url:{short_code}")
        if cached:
            return json.loads(cached)["original_url"]
    except RedisConnectionError:
        # Cache unavailable - fall through to DB
        pass

    # Fallback to database
    url_record = await db.urls.get(short_code=short_code)
    if not url_record:
        return None

    # Don't try to repopulate cache if Redis is down
    return url_record.original_url
Real-world Failure Scenarios
Scenario 1: Bitly Outage (2010)
What happened: Bitly suffered a major outage affecting all URL shortening services for several hours. The incident exposed critical infrastructure weaknesses in their failover mechanisms.
Root cause: Inadequate redundancy in DNS configuration combined with a cascading failure when the primary database cluster became unavailable. No automatic failover was configured.
Impact: All shortened links returned errors, affecting millions of redirected URLs across social media and marketing campaigns. Service was completely unavailable for ~4 hours.
Lesson learned: Design for failure at every layer. Implement multi-region failover, regular chaos testing, and ensure DNS failover is automatic and tested.
Scenario 2: AWS S3 Availability Zone Failure
What happened: A major cloud provider experienced a partial Availability Zone failure that knocked out access to S3 bucket data in one AZ. Services relying on single-AZ S3 configurations went dark.
Root cause: Some URL shortening services used a single-AZ storage class (such as S3 One Zone-IA) as the primary storage backend, so no copy of the data existed outside the affected AZ. When the AZ failed, all read/write operations failed.
Impact: Services storing shortened link metadata and redirect targets in a single AZ experienced complete data unavailability, even though other AZs remained healthy.
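Lesson learned: Keep critical data in a multi-AZ storage class (S3 Standard stores objects across multiple AZs) and add cross-region replication where a regional outage must be survived. Design storage backends to tolerate AZ failures. Test failure scenarios regularly.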
Scenario 3: Database Connection Pool Exhaustion
What happened: A popular URL shortener experienced a spike in traffic during a major sporting event, causing database connection pool exhaustion and complete service degradation.
Root cause: The connection pool size was configured based on normal traffic patterns. During the traffic spike, all available connections were consumed by read operations, blocking write operations needed for creating new shortened URLs.
Impact: Users could not create new shortened links even though existing redirects continued to work. The service appeared functional but was effectively in a degraded state for several hours.
Lesson learned: Implement connection pool monitoring, use read replicas to offload read traffic, and configure automatic scaling for database connection pools based on real-time demand metrics.
Common Pitfalls / Anti-Patterns
Pitfall 1: Using Sequential IDs as Short Codes
Problem: Sequential IDs (1, 2, 3…) allow URL enumeration - attackers can guess other short URLs.
Solution: Use cryptographically random codes (base62) with minimum 6 characters. Use KGS for guaranteed uniqueness without predictability.
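As a rough illustration of the solution, the sketch below draws codes from Python's `secrets` module. The function names and the `taken` set are hypothetical; in a real service the uniqueness check would be the database's unique index on short_code rather than an in-memory set.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # 62 URL-safe characters

def random_short_code(length: int = 7) -> str:
    """Draw each character independently from a cryptographically secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def allocate_code(taken: set, length: int = 7, max_attempts: int = 5) -> str:
    """Retry on collision; `taken` stands in for the unique index."""
    for _ in range(max_attempts):
        code = random_short_code(length)
        if code not in taken:
            taken.add(code)
            return code
    raise RuntimeError("could not allocate a unique short code")

taken_codes = set()
print(allocate_code(taken_codes))
```

Random codes are unpredictable but not collision-free, which is why the retry loop is there. A KGS moves this work off the request path by generating and de-duplicating codes ahead of time.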
Pitfall 2: Not Handling URL Expiration
Problem: Expired URLs still redirect until cache expires.
Solution: Check expiration on every redirect. Set aggressive cache TTL for URLs with near-term expiration.
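One way to apply both parts of this solution is to cap the cache TTL at the link's remaining lifetime and to compare `expires_at` against the current time on every read. This is a minimal sketch; the function names and the one-hour default are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_TTL_SECONDS = 3600  # illustrative default, not a recommendation

def is_expired(expires_at: Optional[datetime], now: datetime) -> bool:
    """A link with no expires_at never expires."""
    return expires_at is not None and expires_at <= now

def cache_ttl_seconds(expires_at: Optional[datetime], now: datetime,
                      default_ttl: int = DEFAULT_TTL_SECONDS) -> int:
    """Never cache an entry for longer than the link has left to live."""
    if expires_at is None:
        return default_ttl
    remaining = int((expires_at - now).total_seconds())
    return max(0, min(default_ttl, remaining))

now = datetime.now(timezone.utc)
print(cache_ttl_seconds(now + timedelta(minutes=10), now))  # 600
```

A TTL of 0 means the entry should not be cached at all, so the caller can skip the cache write for links that have already expired.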
Pitfall 3: Storing Only Short Code in Cache
Problem: Cache miss requires DB query for every redirect.
Solution: Cache generously. Use “recently accessed” eviction. Pre-populate cache for trending links.
Observability Checklist
Metrics to Capture
- url_redirects_total (counter) - By short_code prefix, status code
- url_shortens_total (counter) - By user_tier, custom vs auto
- redirect_latency_seconds (histogram) - P50, P95, P99
- cache_hit_ratio (gauge) - Cache efficiency
- malicious_url_attempts_total (counter) - Blocked attempts by type
- kgs_available_keys (gauge) - Key generation health
Logs to Emit
{
"timestamp": "2026-03-22T10:30:00Z",
"event": "redirect",
"short_code": "xV2bP9qK",
"status": 302,
"latency_ms": 12,
"cache_hit": true,
"user_ip_hash": "abc123"
}
Alerts to Configure
| Alert | Threshold | Severity |
|---|---|---|
| Redirect latency P99 > 200ms | 200ms | Warning |
| Cache hit ratio < 50% | 50% | Warning |
| KGS keys < 1000 | 1000 | Critical |
| Malicious attempts spike | > 100/min | Warning |
| DB connection pool > 80% | 80% | Warning |
Security Checklist
- TLS 1.3 for all connections
- URL validation and sanitization
- Malicious URL scanning (Google Safe Browsing API)
- Rate limiting per IP and per user
- Custom code length and character validation
- Authentication required for custom short codes
- Audit logging of all URL creations
- GDPR compliance for analytics data
- Regular security audits
Trade-off Analysis
| Design Decision | Option A | Option B | Recommendation |
|---|---|---|---|
| Short Code Generation | Hash-based (MD5/SHA + salt) | Counter-based KGS | KGS for production; hash-based acceptable for simple cases |
| Code Length | 6 chars (56.8B combos) | 7-8 chars (3.5T - 218T) | 7-8 chars; 6 chars cannot hold the 182.5B links estimated over five years |
| Redirect Status | 301 Permanent | 302 Temporary | 302 to avoid SEO consolidation issues |
| Cache Strategy | Cache URL only | Cache full URL object with metadata | Cache full object to avoid second round-trip |
| Custom Codes | Require authentication | Allow anonymous | Require authentication + premium pricing |
| URL Deduplication | Per-user dedup | Allow duplicates | Per-user dedup for efficiency |
| Analytics Tracking | Synchronous write | Async fire-and-forget | Always async to keep redirect fast |
| Database | PostgreSQL | DynamoDB/Cassandra | PostgreSQL for <1B URLs; NoSQL for billions |
| KGS Recovery | Return orphaned codes after timeout | Fall back to hash-based codes | Both strategies needed |
When to Choose Each Option
Hash vs KGS for Code Generation:
- Use hash-based when: simplicity is priority, collision handling is acceptable, moderate scale
- Use KGS when: predictability required, high availability critical, cannot tolerate collision checking latency
PostgreSQL vs NoSQL:
- Use PostgreSQL when: need ACID transactions, moderate scale, team expertise available
- Use DynamoDB/Cassandra when: billions of rows, global distribution, multi-region writes required
Interview Questions
The cleanest solution is a Key Generation Service (KGS) — a separate service that pre-generates short codes and hands them out on demand. Since the codes come from a known set, collisions literally cannot happen. The tradeoff is that KGS becomes a critical dependency; if it goes down, you cannot create new URLs.
Other approaches exist. You could salt the URL and hash it, then check whether that hash has been used before storing. Or you could use a counter — each new URL gets the next number in sequence, encoded in base62. The counter approach is simple but predictable, which is its own problem.
Base64 uses characters like '+', '/', and '=' that have special meaning in URLs. A base64 character in a URL needs encoding, which defeats the purpose of a short URL — you'd end up with something like example.com/abc%2Bdef%3D instead of example.com/abcdef.
Base62 sticks to alphanumeric characters only (A-Z, a-z, 0-9). Every character is URL-safe without any encoding. Six characters give 56.8 billion combinations, which covers many deployments but falls short of the 182.5 billion links estimated earlier; seven characters give 3.5 trillion.
Predictable codes invite enumeration attacks — if I can guess xV2bP9, I can probably guess xV2bP8 and xV2bP7. From there, it's a short hop to scraping the entire URL database.
The fix is randomness. Six characters from base62 gives you 56.8 billion possible codes — too many to guess. The KGS approach generates codes with a cryptographically secure random function before they're needed. If you're hashing URLs instead, you need a secret salt and a strong hash; without the salt, attackers can precompute popular URL hashes.
Never use sequential IDs. They're predictable and they tell attackers exactly how many URLs you've created.
Redirect latency is the whole game here. A cache miss means a database lookup, which can be 10-100x slower.
The practical approach: cache the full URL object, not just the short code. On a cache miss you fetch the whole record; on a hit you return immediately. Write-through on create so new URLs are immediately available. Hot links (the top 1% of URLs by traffic) deserve longer TTLs — sometimes hours instead of minutes. Cold links can expire faster.
One thing teams often get wrong: caching only the redirect URL. If you cache the whole object including metadata, you avoid a second round-trip when you need to check expiration or click counts.
Expiration is checked on every redirect — the handler verifies that expires_at is either null or in the future. If the link has expired, you return 404.
Cached entries need special handling. A link that expires in 10 minutes should not sit in cache for an hour. One approach: store the TTL alongside the cached object and check expiration at read time. Another: use shorter TTLs for URLs with near-term expiration dates.
Background jobs handle the actual cleanup. You don't need to delete expired entries immediately — they're harmless as long as they don't redirect — but you should clean them up eventually to avoid unbounded table growth.
Everything should keep working, just slower. With a circuit breaker in place, the system detects that Redis is unhealthy and stops attempting cache operations. Redirects fall through to the database directly.
This is why the database schema needs proper indexes — it's the fallback for every request when cache is unavailable. Without those indexes, you'd see timeouts under cache failure, which defeats the purpose of having a fallback at all.
Monitoring is critical here. When cache hit ratio drops suddenly, that's your signal that something is wrong before it becomes a full outage.
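The circuit breaker mentioned above can be quite small. The following is a simplified sketch rather than a production implementation: the class name, thresholds, and injectable clock are all assumptions, and libraries exist that add half-open limits, metrics, and thread safety.

```python
import time
from typing import Callable, Optional

class CircuitBreaker:
    """Stops calling a failing dependency until a cool-down has passed."""

    def __init__(self, failure_threshold: int = 5, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True  # closed: cache calls proceed normally
        # open: skip the cache until the cool-down elapses, then allow a trial call
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # (re)open the circuit and restart the cool-down
            self._opened_at = self._clock()
```

In the redirect path, the handler would consult `allow_request()` before touching Redis, call `record_failure()` on a connection error, and go straight to the database while the circuit is open.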
You've got to check at creation time, not at redirect time. By the time a user clicks a malicious link, damage may already be done.
The layers that matter: pattern validation catches obvious abuse (IP addresses instead of domains, suspicious TLDs). Domain reputation databases catch known bad actors. For thoroughness, scan the URL content when possible — Google Safe Browsing API covers a lot of ground. Rate limiting per IP slows down anyone trying to mass-generate malicious links.
You also need a takedown mechanism for links reported after creation. If someone submits a legitimate URL that later becomes malicious, you need to be able to invalidate it quickly.
Custom codes need authentication — otherwise people squat on popular terms. Require login to claim a custom code, validate it meets your format rules (minimum length, allowed characters, not on the reserved list), and charge for it if you're running a business. Premium pricing alone deters most abuse.
The reserved list matters. Things like /admin, /health, /metrics — these should never be available as custom codes regardless of who asks.
PostgreSQL handles most URL shortener workloads fine. The short_code gets a unique index for fast lookups — this is the only query path that matters for redirects. Everything else (creator_id, expires_at) can live on secondary indexes.
At extreme scale — billions of URLs — PostgreSQL starts showing cracks. Cassandra and DynamoDB both scale horizontally without the manual sharding overhead. The query pattern is simple (look up by short_code), which maps well to partitioned NoSQL designs. You lose some query flexibility but gain almost unbounded horizontal scalability.
The redirect has to be fast. Anything that blocks the response — including analytics writes — adds latency.
The standard pattern: fire-and-forget. The redirect handler returns the 302 immediately and spawns an async task. That task sends the click event to a message queue (Kafka, SQS, whatever you prefer). A separate consumer picks up events in batches and updates the analytics store.
This decouples the hot path from the analytics path entirely. Your p99 redirect latency stays low even when analytics load is high.
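The decoupling can be sketched with an in-process queue standing in for the message broker. Everything here is illustrative: `record_click`, `drain_clicks`, and the list used as a sink are assumptions, and a real deployment would publish to Kafka or SQS instead.

```python
import asyncio

def record_click(queue: asyncio.Queue, event: dict) -> bool:
    """Called on the hot path: never blocks, drops the event if the buffer is full."""
    try:
        queue.put_nowait(event)
        return True
    except asyncio.QueueFull:
        return False

async def drain_clicks(queue: asyncio.Queue, sink: list, batch_size: int = 100) -> None:
    """Background consumer: groups queued events into batches and flushes them."""
    while True:
        batch = [await queue.get()]
        while len(batch) < batch_size and not queue.empty():
            batch.append(queue.get_nowait())
        sink.append(batch)  # stand-in for publishing the batch to a broker
        for _ in batch:
            queue.task_done()

async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue(maxsize=1000)
    sink: list = []
    worker = asyncio.create_task(drain_clicks(queue, sink))
    for code in ("abc1234", "abc1234", "xyz9876"):
        record_click(queue, {"short_code": code})
    await queue.join()  # wait until every queued event has been flushed
    worker.cancel()
    return sink

print(asyncio.run(demo()))
```

Dropping events when the buffer is full is a deliberate choice in this sketch: losing a click count is usually preferable to slowing down a redirect.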
Short codes mean shorter URLs, which is the whole point. But each additional character multiplies the available keyspace exponentially.
Six base62 characters give you 56.8 billion combinations. At a million new URLs per day, it would take 56,800 days (about 155 years) to exhaust them, so six characters is plenty at that rate. Seven characters bumps you to 3.5 trillion, which is generous even for much larger workloads.
At the scale estimated earlier the picture changes. 182.5 billion URLs over five years exceeds the 56.8 billion six-character codes outright, and with randomly generated codes, birthday-paradox collisions become frequent long before the keyspace fills up. This is why high-volume production systems use seven or eight characters.
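The birthday-paradox effect can be estimated with the standard approximation P ≈ 1 - exp(-n(n-1) / (2N)), where n is the number of codes issued and N is the keyspace size. The sketch below assumes codes are drawn uniformly at random; it does not apply to counter-based generation, where collisions cannot occur. The function name is illustrative.

```python
import math

def collision_probability(n_codes: int, keyspace: int) -> float:
    """Approximate chance that at least two of n_codes random codes collide."""
    if n_codes > keyspace:
        return 1.0  # pigeonhole: more codes than available slots
    # expm1 keeps precision when the exponent is very close to zero
    return -math.expm1(-n_codes * (n_codes - 1) / (2 * keyspace))

for length in (6, 7, 8):
    p = collision_probability(1_000_000, 62 ** length)
    print(f"{length} chars, 1M random codes: P(collision) = {p:.4f}")
```

The probability of at least one collision rises with the square of the number of codes issued, which is why random generation always needs a uniqueness check regardless of code length.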
Hash-based generation is simple — hash the URL, encode, done. No coordination needed. The problem is predictability and collision handling. Without a salt secret, attackers can precompute hashes for common URLs. With a salt, you still need to handle collisions by appending random data, which complicates the flow.
Counter-based KGS eliminates collisions entirely — codes come from a pre-generated list. It also gives you predictable, uniform codes. The cost is operational complexity: KGS becomes a critical service. If it goes down, you can't create new URLs. You also need to handle KGS recovery after failures — the batch allocation logic must be idempotent.
For high-availability requirements, KGS with a warm standby is worth the complexity. For simpler use cases, hash-based with collision handling is easier to operate.
You have two philosophies here. The first: deduplicate and return the existing short URL. If the same URL is submitted twice, return the same short code. This is efficient, with fewer entries in your database, but every submitter then shares one code, so per-creator statistics, expiration settings, and ownership can no longer be kept separate.
The second approach: treat each submission as unique, even for identical URLs. This preserves the one-to-one mapping and is simpler to reason about, but wastes keyspace on duplicates.
Most production systems do a hybrid: check if the submitting user already has a short code for that URL (deduplicate per user), but allow different users to create different short codes for the same long URL. The cache and database handling stays simple.
PostgreSQL handles millions of rows without complaint if you have the right indexes. The short_code unique index performs well even with hundreds of millions of entries. The problems start when you need to scale writes or distribute geographically.
Read replicas help with redirect read throughput — you can route reads to replicas and writes to primary. But there's a catch: replication lag means occasionally a brand-new URL might not be on a replica yet, so a redirect could 404 even though the URL was just created. Acceptable for most use cases, but something to monitor.
At billions of rows, table partitioning becomes necessary. Hash-partition on short_code so that each redirect lookup touches exactly one partition; partitioning by creation date makes it easy to drop old data but does not help route a lookup by short code. DynamoDB and Cassandra handle this scale natively with automatic partitioning, but you give up PostgreSQL's query flexibility.
Users in different regions expect instant redirects. If your primary database is in us-east-1, a user in Tokyo sees 100ms+ latency just for the database lookup before the redirect even starts.
The usual approach: regional Redis caches that replicate asynchronously. Writes go to the primary region and propagate to regional caches. The trade-off is eventual consistency — a URL created in us-east-1 might not be redirectable from eu-west for a few seconds.
For most URL shorteners, eventual consistency is fine. Users rarely create and immediately test from opposite sides of the world. The p99 latency improvement from regional caching outweighs the brief inconsistency window.
Custom codes bypass the random generation process, which means they can collide with system paths. If someone registers "/health" as a custom code, your redirect handler gets confused — that path is supposed to be your health check endpoint.
The reserved list must be enforced at creation time, not just checked at redirect time. Block /health, /metrics, /docs, /api/*, and any other internal paths. Reserved words such as "admin" and "health" consist only of base62 characters, so an alphabet check alone will not reject them; the reserved list has to be a separate check.
Character validation matters too. A custom code containing SQL injection characters or path traversal sequences ("../../../etc/passwd") needs sanitization even if it passes your base62 filter. Validate strictly: alphanumeric only, minimum 4 characters, maximum 12.
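The validation rules described here (alphanumeric only, 4 to 12 characters, not reserved) fit in a few lines. This is a sketch; the reserved set below is an illustrative subset, and comparing in lowercase is an added assumption so that "Health" cannot shadow "health".

```python
import re

# Illustrative subset; a real list would cover every internal path
RESERVED_CODES = frozenset({"health", "metrics", "docs", "api", "admin"})

# Strict allowlist: ASCII letters and digits only, 4 to 12 characters
CUSTOM_CODE_PATTERN = re.compile(r"[A-Za-z0-9]{4,12}")

def is_valid_custom_code(code: str) -> bool:
    """Return True if the code passes format and reserved-word checks."""
    if CUSTOM_CODE_PATTERN.fullmatch(code) is None:
        return False
    return code.lower() not in RESERVED_CODES

for candidate in ("launch2026", "health", "../../../etc/passwd", "abc"):
    print(candidate, is_valid_custom_code(candidate))
```

Because the pattern is an allowlist matched against the whole string, path traversal sequences and SQL metacharacters are rejected without needing a separate denylist.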
The shortener's job is redirect, not reliability. You redirect to whatever URL the user provided — you're not responsible for that URL being up. But there are legitimate cases where you'd want to detect and handle dead links.
Some systems do proactive validation: when a URL is shortened, scan it to verify it responds without error. This adds creation latency and still doesn't guarantee the URL stays up. Not worth the complexity for most use cases.
For premium tiers, periodic revalidation makes sense. Check every hour whether premium short URLs still respond. If a link goes dead, alert the owner and optionally auto-expire the short code. This is a business feature, not a technical requirement.
KGS allocates codes in batches. Say it allocates codes 1000-2000 to instance A, then crashes. Those codes are "reserved" but not yet assigned to any URL. If you don't handle this, codes 1000-2000 are effectively lost — they exist in the KGS state but no one can claim them.
The standard recovery approach: KGS writes each batch allocation to durable storage (a database, or the store behind your distributed lock) before handing out codes. If it crashes, on restart it reads the last known allocation and resumes from there. Orphaned codes get returned to the available pool after a timeout, provided they were never assigned to a URL.
During recovery, new URL creation can fall back to hash-based codes. This keeps the service running, just with less predictable codes. Monitor how often you're falling back — frequent KGS failures indicate it needs more redundancy.
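The batch allocation and timeout-based recovery described above can be sketched as follows. This is a toy model under stated assumptions: the instance attributes stand in for durable storage, `is_assigned` stands in for a lookup against the URL table, and the class and method names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class KeyGenerationService:
    """Toy KGS; the attributes below represent state kept in durable storage."""

    def __init__(self, batch_size: int = 1000, lease_ttl: float = 60.0):
        self.batch_size = batch_size
        self.lease_ttl = lease_ttl
        self.next_start = 0  # high-water mark of allocated codes
        # lease start -> (end, owner, expires_at)
        self.leases: Dict[int, Tuple[int, str, float]] = {}
        self.free_codes: List[int] = []  # reclaimed, reusable codes

    def allocate(self, owner: str, now: float) -> range:
        start, end = self.next_start, self.next_start + self.batch_size
        # Record the lease and advance the mark BEFORE handing out the range,
        # so a crash right after this point can never reissue the same codes.
        self.next_start = end
        self.leases[start] = (end, owner, now + self.lease_ttl)
        return range(start, end)

    def reclaim_expired(self, now: float, is_assigned: Callable[[int], bool]) -> List[int]:
        """Return unassigned codes from expired leases to the free pool."""
        reclaimed = []
        for start, (end, _owner, expires_at) in list(self.leases.items()):
            if expires_at <= now:
                reclaimed.extend(c for c in range(start, end) if not is_assigned(c))
                del self.leases[start]
        self.free_codes.extend(reclaimed)
        return reclaimed
```

Checking `is_assigned` during reclamation matters: an instance may have used part of its batch before the lease expired, and returning those codes to the pool would produce duplicates.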
Search engines interpret 301 (permanent) as a signal that the destination is the canonical URL, and they consolidate ranking signals onto the long URL. Browsers may also cache a 301 indefinitely, so repeat visitors can skip your server entirely. That reduces load, but you lose click data for those visits, and changing or expiring the mapping later will not reach clients that cached the old redirect.
302 (temporary) tells clients and search engines that the mapping may change. It is not cached unless you send explicit caching headers, so every click reaches your service. Better for tracking, and you can change the destination at any time.
Many URL shorteners choose 302 for these reasons. The trade-off is more redirect traffic on your servers and weaker consolidation of ranking signals onto the destination, which is usually acceptable when analytics and editable links matter more.
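As a minimal sketch of the choice between the two status codes, the helper below builds the status and headers for a redirect response. The function name and the `no-store` policy are assumptions for illustration; a 302 is not cached by default, but an explicit header makes the intent unambiguous.

```python
from typing import Dict, Tuple


def build_redirect(long_url: str, permanent: bool = False) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a redirect to long_url."""
    status = 301 if permanent else 302
    headers = {"Location": long_url}
    if not permanent:
        # Keep every click flowing through the service so analytics
        # and later destination changes keep working.
        headers["Cache-Control"] = "no-store"
    return status, headers
```

For example, `build_redirect("https://example.com/a")` yields a 302 with a `Location` header and a `Cache-Control: no-store` header.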
Redirect QPS estimation: (daily_creates * redirect_ratio) / seconds_per_day. With 100M creates/day and a 100:1 redirect ratio, that's roughly 115,000 QPS on average. Your system needs to handle spikes, so plan for at least 2x the average during viral moments.
Storage estimation: creates_per_day * bytes_per_url * days_retained. At 500 bytes per URL and 5-year retention, 100M creates/day needs about 91 TB. Account for indexes and overhead: real storage needs are 2-3x the raw calculation.
Cache sizing: hot URL popularity roughly follows Zipf's law, so the top 1% or so of URLs can account for nearly all redirects. Size your cache to hold at least the top 10,000 URLs with their metadata, then grow it based on the measured hit ratio. Cache misses for the long tail are acceptable because those URLs are rarely accessed.
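The arithmetic behind these estimates is easy to script, which helps when the inputs change. The function names are illustrative, and the cache figure assumes a cache entry is about the same size as a stored link (500 bytes), which is an assumption rather than a measured number.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


def redirect_qps(creates_per_day: int, redirect_ratio: int) -> float:
    """Average redirects per second."""
    return creates_per_day * redirect_ratio / SECONDS_PER_DAY


def raw_storage_tb(creates_per_day: int, bytes_per_url: int, years: int) -> float:
    """Raw storage in terabytes (10**12 bytes), before indexes and overhead."""
    return creates_per_day * bytes_per_url * DAYS_PER_YEAR * years / 1e12


avg_qps = redirect_qps(100_000_000, 100)
peak_qps = 2 * avg_qps  # minimum headroom for spikes
raw_tb = raw_storage_tb(100_000_000, 500, 5)
cache_mb = 10_000 * 500 / 1e6  # top 10,000 URLs at an assumed 500 bytes each

print(f"average redirect QPS: {avg_qps:,.0f}")  # 115,741
print(f"peak redirect QPS (2x): {peak_qps:,.0f}")  # 231,481
print(f"raw storage: {raw_tb:.2f} TB")  # 91.25 TB
print(f"minimum hot cache: {cache_mb:.1f} MB")  # 5.0 MB
```

The minimum hot cache comes out tiny, which suggests the 10,000-URL floor is cheap to exceed by a wide margin.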
Further Reading
- Caching Strategies — Deep dive into Redis caching patterns
- Load Balancing — Geographic distribution and routing
- Database Scaling — Read replicas and partitioning
- Rate Limiting — Anti-abuse patterns
- Distributed Caching — Multi-node cache architectures
For more system design patterns, see our Caching Strategies guide which covers caching patterns used here. The Load Balancing guide covers geographic distribution.
Conclusion
A URL shortener looks simple on paper. The complexity shows up at scale. The key decisions:
- Short code generation: Counter-based with KGS ensures uniqueness and predictability
- Caching: Aggressive caching with Redis handles redirect traffic
- Database: PostgreSQL with read replicas for availability
- Analytics: Asynchronous click tracking via Kafka
Quick Recap Checklist
Before deploying your URL shortener to production, verify these essentials:
Core Functionality
- Short code generation using KGS or hash-with-salt
- Base62 encoding with 6-8 character output
- URL validation and sanitization
- Custom code reservation for system paths
- Expiration handling on every redirect
Performance
- Redis caching with write-through on create
- Cache TTL differentiated by URL type (hot/cold/custom)
- Database indexes on short_code for fast lookups
- Read replicas for redirect traffic
- Async analytics (fire-and-forget click tracking)
Availability
- Circuit breaker for cache failures
- Database fallback when cache unavailable
- Multi-region deployment with latency-based routing
- KGS warm standby for high availability
- Connection pool monitoring and auto-scaling
Security
- Rate limiting per IP and per user tier
- Malicious URL scanning (Safe Browsing API)
- Authentication required for custom codes
- Reserved path blocking at creation time
- TLS 1.3 for all connections
- Audit logging for URL creations
Observability
- Redirect latency histograms (P50, P95, P99)
- Cache hit ratio monitoring
- KGS available keys gauge
- Alert thresholds configured for all critical metrics