Redis vs Memcached: Choosing an In-Memory Data Store

A comprehensive comparison of Redis and Memcached — data structures, persistence, clustering, Lua scripting, pub/sub, and guidance on when to choose each.

published: reading time: 39 min read author: GeekWorkBench


Introduction

Both sit in front of your database and cache frequently accessed data in memory. Developers often use them interchangeably without understanding the differences. The differences matter: Redis is a data structure server that also happens to support caching. Memcached is a caching engine with a simpler data model. That distinction shapes what you can build with them and how you debug them at 2am.

This is not a “both are good” comparison. I will tell you when each one makes sense.

Core Concepts

Memcached stores opaque strings and nothing but strings. You give it a key, you get back a value. Apart from a few helpers (incr/decr, append/prepend, CAS), that is the whole API.

Redis supports strings, lists, hashes, sets, sorted sets, bitmaps, hyperloglogs, geospatial indexes, and streams. It can act as a cache, a session store, a message broker, a rate limiter, and a real-time analytics engine.

One eviction detail is worth knowing up front: volatile-lru and allkeys-lru use an approximated (sampled) LRU algorithm, not true LRU. Redis picks a small random set of keys and evicts the least recently used among them, which avoids the memory cost of tracking exact access order. volatile-lru samples only keys with a TTL set, while allkeys-lru samples across all keys.

# Memcached: everything is a string
memcached.set("user:123", json.dumps(user_data))
user_data = json.loads(memcached.get("user:123"))

# Redis: native data structures
redis.hset("user:123", mapping=user_data)
user_data = redis.hgetall("user:123")

Performance depends on what you are doing with them.


Redis Data Structures

Strings

Both handle simple string values. Redis just has more ways to manipulate them.

# Memcached
set key "value"
get key

# Redis
set key "value"
get key

# Redis extras
append key " more"     # Append to existing
incr count             # Atomic increment
decr count             # Atomic decrement
setrange key 0 "re"    # Overwrite bytes
getrange key 0 3       # Substring retrieval

Lists

Memcached does not have lists. Redis does.

# Redis lists: ordered, push/pop from either end
redis.lpush("queue:jobs", "job1", "job2", "job3")
redis.rpop("queue:jobs")  # Returns "job1" (oldest)
redis.lrange("queue:jobs", 0, -1)  # Get all

# Common use: recently viewed items, job queues, activity logs
redis.lpush("user:123:views", product_id)
redis.ltrim("user:123:views", 0, 19)  # Keep last 20

Sets and Sorted Sets

Memcached has no sets. Redis has both.

# Redis sets: unique, unordered
redis.sadd("user:123:likes", "product1", "product2", "product3")
redis.smembers("user:123:likes")
redis.sismember("user:123:likes", "product1")  # O(1) check

# Redis sorted sets: scored sets for leaderboards, priorities
redis.zadd("leaderboard", {"player1": 100, "player2": 200, "player3": 150})
redis.zrevrange("leaderboard", 0, 9, withscores=True)  # Top 10
redis.zrevrank("leaderboard", "player2")  # 0-based rank, highest score first (0 here)

Hashes

Memcached has no hashes. Redis does.

# Redis hashes: objects without serialization overhead
redis.hset("user:123", mapping={"name": "Alice", "email": "alice@example.com"})
redis.hget("user:123", "name")  # "Alice"
redis.hgetall("user:123")  # All fields

# vs Memcached requiring JSON serialization
memcached.set("user:123", json.dumps({"name": "Alice", "email": "..."}))

Eviction Policies

Both evict data when memory is full, but the options differ more than the shared "LRU" label suggests.

# Memcached eviction
# Default: when memory is full, evict least recently used items
#          (LRU is tracked per slab class, not globally)
# -M:      disable eviction and return an error on out-of-memory
# There is no choice of LFU, random, or TTL-based policies

memcached -m 1024      # 1024 MB cache, LRU eviction (default)
memcached -m 1024 -M   # 1024 MB cache, errors instead of evicting

# Redis maxmemory policies
# allkeys-lru, allkeys-lfu, allkeys-random
# volatile-lru, volatile-lfu, volatile-random, volatile-ttl
# noeviction

maxmemory 100mb
maxmemory-policy allkeys-lru

Memcached offers LRU eviction or no eviction. Redis lets you choose among eight policies, including LFU (Least Frequently Used) and policies restricted to keys with a TTL, none of which Memcached has.
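Redis's LRU policies are approximate: each eviction looks at a small random sample of keys (the maxmemory-samples setting, 5 by default) and evicts the least recently used key in that sample. The following is a minimal pure-Python sketch of that idea, not Redis's actual implementation. The `last_access` dictionary and the function name are illustrative assumptions, and refinements in the real server, such as its pool of eviction candidates, are not modelled.

```python
import random

def evict_sampled_lru(last_access, sample_size=5, rng=random):
    """Evict one key: sample keys at random, remove the stalest of the sample.

    last_access maps key -> last access timestamp. This bookkeeping is
    hypothetical; Redis stores a compact clock value inside each object.
    """
    keys = list(last_access)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    victim = min(sample, key=lambda k: last_access[k])
    del last_access[victim]
    return victim

# Usage: 1000 keys where a lower timestamp means "touched longer ago"
access = {f"key:{i}": i for i in range(1000)}
rng = random.Random(42)
victims = [evict_sampled_lru(access, sample_size=5, rng=rng) for _ in range(100)]
```

A larger sample moves the behaviour closer to true LRU at the cost of more CPU per eviction; when the sample covers every key, the sketch degenerates to exact LRU.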

Cache Invalidation Strategies

“Cache aside” (lazy loading) is the most common pattern, but there are several strategies with different trade-offs.

Write-Through Cache

Data is written to both cache and database synchronously. Reads are served from cache, with the database as a fallback on a miss.

def set_user(user_id, data):
    # Write to the database first, then the cache, so the cache never
    # holds a value the database rejected
    db.users.update(user_id, data)
    redis.set(f"user:{user_id}", json.dumps(data))
    return data

def get_user(user_id):
    # Cache is always fresh
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    # Fallback only if cache miss
    data = db.users.get(user_id)
    redis.set(f"user:{user_id}", json.dumps(data))
    return data

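Pros: The cache is refreshed on every write, so reads rarely see stale data. Simple read path (read from cache, fall back to the database only on a miss).

Cons: Write latency is higher (two writes). The two writes are not atomic: if the database write succeeds but the cache write fails, the cache keeps serving the old value until it is overwritten or expires, so delete the key or retry when the cache write fails.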

Best for: Workloads where reads must always see current data and the extra write latency is acceptable, such as configuration data and reference data.

Write-Behind Cache (Write-Back)

Data is written to cache only. Database is updated asynchronously.

def set_user(user_id, data):
    # Write to cache only — fast
    redis.set(f"user:{user_id}", json.dumps(data))
    # Async write to database via queue
    queue.enqueue("db:users:upsert", {"user_id": user_id, "data": data})
    return data

Pros: Very fast writes. Reduces database load during write spikes.

Cons: Risk of data loss if cache fails before database is updated. Requires additional infrastructure (write queue, retry logic). Cache can be inconsistent across nodes during propagation.

Best for: Write-heavy workloads where occasional data loss is acceptable (metrics, analytics, leaderboards), session data.

Cache-Aside (Lazy Loading)

The application manages cache explicitly — reads populate cache on miss, writes update database and invalidate cache.

def get_user(user_id):
    # Read from cache first
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    # Cache miss — load from database
    data = db.users.get(user_id)
    # Populate cache for next time
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))
    return data

def set_user(user_id, data):
    # Write to database
    db.users.update(user_id, data)
    # Invalidate cache — do not update it
    redis.delete(f"user:{user_id}")
    return data

Pros: Simple. Cache never has stale writes (invalidated on update). Read-heavy workloads naturally populate cache. No write amplification.

Cons: Cache miss penalty — first read after invalidation or startup is slow. “Thundering herd” problem on popular keys.

Best for: Read-heavy workloads, most general-purpose caching, when database is source of truth.

TTL and Invalidation Details

TTL Selection Criteria

Choosing TTL is a trade-off between staleness and cache efficiency.

| Data Type | Recommended TTL | Rationale |
| --- | --- | --- |
| User sessions | 24-48 hours | Sessions have natural expiry; 48h covers timezone gaps |
| Configuration data | 5-15 minutes | Changes need to propagate; short enough to recover |
| API responses (public) | 1-5 minutes | Fresh data important; short TTL limits staleness |
| Analytics / aggregates | 15-60 minutes | Tolerates some staleness; longer = better hit rate |
| Product catalog | 1-24 hours | Updates are infrequent; long TTL = better hit rate |
| Leaderboards | 30-300 seconds | Needs near-real-time accuracy; short TTL required |

Quick heuristic: Pick TTL based on how stale your data is allowed to be. If you cannot answer “how stale can this be?”, set it to 60 seconds or less.

Invalidation Patterns

Delete vs Expire:

# Delete: immediate removal
redis.delete("user:123")

# Expire: time-based removal
redis.setex("temp:data", 300, value)  # Auto-removes in 5 min

# Use expire for: temporary data, cached computations
# Use delete for: data that changed, explicit updates

Invalidation on update vs refresh on read:

# Option A: Invalidate on write (cache-aside)
def update_user(user_id, data):
    db.users.update(user_id, data)
    redis.delete(f"user:{user_id}")  # Next read fetches fresh

# Option B: Refresh on write (write-through variant)
def update_user(user_id, data):
    db.users.update(user_id, data)
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))  # Write new value

# Option A (invalidate) is preferred because:
# - Avoids write amplification (only invalidates, does not rewrite)
# - Simpler: no need to serialize and store on every write
# - Cache stays consistent if write fails (delete does not happen)

Avoiding the Thundering Herd

When a popular cache key expires, many requests simultaneously hit the database.

# BAD: Many requests see cache miss, all hit database
def get_product(product_id):
    cached = redis.get(f"product:{product_id}")
    if not cached:
        data = db.products.get(product_id)  # ALL requests hit DB
        redis.setex(f"product:{product_id}", 300, json.dumps(data))
    return json.loads(cached) if cached else data

# GOOD: Single request refreshes, others wait
import threading
import time

def get_product_safe(product_id, lock_ttl=10, max_wait=5.0):
    cache_key = f"product:{product_id}"
    lock_key = f"lock:product:{product_id}"
    deadline = time.monotonic() + max_wait  # Bound the wait instead of recursing

    while True:
        cached = redis.get(cache_key)
        if cached:
            return json.loads(cached)

        # Try to acquire the lock (SET with NX and EX is atomic)
        if redis.set(lock_key, "1", nx=True, ex=lock_ttl):
            try:
                # We got the lock: refresh from database
                data = db.products.get(product_id)
                redis.setex(cache_key, 300, json.dumps(data))
                return data
            finally:
                redis.delete(lock_key)  # Release even if the DB call fails

        if time.monotonic() >= deadline:
            # Waited long enough: fall back to the database directly
            return db.products.get(product_id)
        # Another request is refreshing: wait briefly, then re-check the cache
        time.sleep(0.1)

Alternative: Probabilistic early expiration (XFetch):

import math
import random

def get_with_xfetch(key, beta=1.0):
    """XFetch-style probabilistic early expiration to prevent thundering herd"""
    value = redis.get(key)
    if value:
        # Check if we should refresh early (probabilistic)
        ttl = redis.ttl(key)
        if ttl > 0:
            # Refresh early with probability exp(-ttl / beta): negligible while
            # the TTL is long, approaching 1 as expiry nears. beta (in seconds)
            # should be on the order of the recompute time; larger = earlier.
            if random.random() < math.exp(-ttl / beta):
                # Background refresh (in production, use separate thread/queue)
                return value, True  # "stale" flag to caller
    return value, False

# Usage: serve stale data while refreshing in background
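The check can also be written in the form XFetch is usually stated in, where the time the last recomputation took (delta) scales how early a refresh may start. This is a sketch under assumptions: the function name is illustrative, and the caller is assumed to store `expiry_ts` and `delta` alongside the cached value, which the code above does not do.

```python
import math
import random
import time

def should_refresh_early(expiry_ts, delta, beta=1.0, now=None, rng=random):
    """Return True if this request should recompute the value before expiry.

    expiry_ts: absolute expiry time of the cached value (seconds)
    delta:     how long the last recomputation took (seconds)
    beta:      tuning knob; values above 1 refresh earlier, below 1 later
    """
    if now is None:
        now = time.time()
    u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is always defined
    # -log(u) is a non-negative random gap; scaling it by delta * beta means
    # slow-to-compute values start refreshing further ahead of expiry.
    return now - delta * beta * math.log(u) >= expiry_ts
```

Because each request draws its own random gap, only a few requests decide to refresh as expiry approaches, instead of every request missing at the same instant.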

Cache Warming and Cold Starts

When the Cache Starts Cold

When a cache starts empty (restart, deployment, failure), every request hits the database.

# BAD: Cold cache causes database overload at startup
# All 10,000 users hit DB simultaneously after Redis restart

# GOOD: Pre-warm cache before taking traffic
def warm_cache():
    """Run at startup before accepting traffic"""
    popular_keys = db.products.get_top_100()  # Identify hot data
    for product in popular_keys:
        redis.setex(f"product:{product.id}", 3600, json.dumps(product))
    # Now safe to take traffic

# Better: Progressive warming with rate limiting
def warm_cache_progressive():
    keys_to_warm = get_keys_by_priority()  # Sort by access frequency
    for i, key in enumerate(keys_to_warm):
        data = db.fetch(key)
        redis.setex(f"cache:{key}", get_ttl_for(key), data)
        # Rate limit: pause 1s after every 1000 keys to avoid overwhelming DB
        if (i + 1) % 1000 == 0:
            time.sleep(1)

Keeping Frequently-Used Data Hot

Monitor cache temperature — how often each key is accessed.

# Track key access frequency
def access_key(key):
    # Increment access counter in a sorted set (atomic) so keys can be ranked
    redis.zincrby("key:access", 1, key)
    return redis.get(key)

# Analyze access patterns weekly
def analyze_access():
    # Find keys not accessed in 7 days: low priority for warming
    # Find top 1000 accessed keys: priority for staying cached
    hot_keys = redis.zrevrange("key:access", 0, 999, withscores=True)

    # Extend the TTL on hot keys that are close to expiring
    for key, score in hot_keys:
        current_ttl = redis.ttl(key)
        if 0 < current_ttl < 3600:  # Has a TTL of less than 1 hour
            redis.expire(key, 86400)  # Extend to 24 hours

Warming Patterns in Practice

# Pattern 1: Scheduled pre-warming before high-traffic events
def warm_for_black_friday():
    """Run 30 minutes before expected traffic spike"""
    # Pre-compute and cache popular product pages
    top_products = db.get_products_by_category("popular", limit=500)
    for product in top_products:
        redis.setex(f"product:{product.id}", 7200, compute_product_page(product))
    # Pre-warm user sessions that will be active
    active_user_ids = db.get_users_logged_in_recently(limit=10000)
    for user_id in active_user_ids:
        redis.setex(f"session:{user_id}", 172800, load_user_session(user_id))

# Pattern 2: Proactive caching on database write
def write_with_proactive_cache(user_id, data):
    # Write to database
    db.users.update(user_id, data)
    # Proactively cache the result
    redis.setex(f"user:{user_id}", 86400, json.dumps(data))
    # Also cache related data
    redis.setex(f"user:{user_id}:profile", 86400, json.dumps(data["profile"]))

# Pattern 3: Background refresh for critical keys
def start_background_refresh(key, compute_fn, ttl=300):
    """Refresh key in background before expiry"""
    def refresh():
        while True:
            value = compute_fn()
            redis.setex(key, ttl, value)
            time.sleep(ttl * 0.8)  # Refresh at 80% of TTL
    thread = threading.Thread(target=refresh, daemon=True)
    thread.start()

Persistence

Memcached: Pure Memory

Memcached is pure memory. It never touches disk. When it restarts, everything is gone.

# Memcached has no persistence options
# Restart = empty cache

This sounds like a drawback, but for pure caching it is fine. Your source of truth is the database anyway.

Redis: Optional Persistence

Redis can persist to disk, so data can survive restarts.

# RDB snapshots: point-in-time dumps
save 900 1    # Save if 1 key changed in 900 seconds
save 300 10   # Save if 10 keys changed in 300 seconds
save 60 10000 # Save if 10000 keys changed in 60 seconds

# AOF (Append Only File): every write logged
appendonly yes
appendfsync everysec  # fsync every second (balance of speed/safety)

# Or no persistence at all (pure cache mode)
save ""

Redis persistence is configurable. You can turn it off for pure caching or enable it for durability.

Performance and Clustering

Performance

Raw performance depends on your workload. Here is a general comparison:

| Operation | Memcached | Redis |
| --- | --- | --- |
| GET/SET (simple) | Very fast | Fast |
| MGET/MSET (batch) | Faster | Slower (per-key overhead) |
| INCR (atomic counter) | Fast | Very fast |
| Sets/Lists/Hashes | Not supported | Depends on operation |
| Memory efficiency | Better (simple values) | Depends on data structures |

For simple string caching, Memcached often uses less memory per key. For complex data structures, Redis’s overhead is usually worth it.

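Redis executes commands on a single thread: one command at a time across all connections, however many clients are connected (Redis 6+ can offload network I/O to extra threads, but command execution stays serial). Memcached is multi-threaded. A single Redis instance can still saturate network bandwidth on simple commands, but Memcached scales better across cores for raw throughput.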

# Redis pipelining: batch commands to reduce round trips
pipe = redis.pipeline()
for key in keys:
    pipe.get(key)
results = pipe.execute()  # One round trip for all

Clustering and Distribution

Memcached

Memcached has no native clustering. You shard across instances manually using consistent hashing.

import bisect
import hashlib

class ConsistentHash:
    def __init__(self, servers):
        self.servers = servers
        self.ring = {}
        self.sorted_keys = []

        # Place 150 virtual nodes per server on the ring
        for server in servers:
            for i in range(150):
                key = f"{server}:{i}"
                hash_key = int(hashlib.md5(key.encode()).hexdigest(), 16)
                self.ring[hash_key] = server
                self.sorted_keys.append(hash_key)

        self.sorted_keys.sort()

    def get_server(self, key):
        hash_key = int(hashlib.md5(key.encode()).hexdigest(), 16)
        # Binary search for the first ring point at or after the key's hash
        idx = bisect.bisect_left(self.sorted_keys, hash_key)
        if idx == len(self.sorted_keys):
            idx = 0  # Wrap around to the start of the ring
        return self.ring[self.sorted_keys[idx]]

It works. You are just managing the sharding yourself.
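What consistent hashing buys you is limited remapping when the server list changes. The sketch below is a self-contained illustration rather than production code: the server names and key pattern are made up, and the helpers re-implement the ring in function form. It counts how many keys change servers when a fifth node joins a four-node ring.

```python
import bisect
import hashlib

def _hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

def build_ring(servers, vnodes=150):
    """Return (sorted ring points, owning server for each point)."""
    pairs = sorted((_hash(f"{s}:{i}"), s) for s in servers for i in range(vnodes))
    return [p for p, _ in pairs], [s for _, s in pairs]

def lookup(ring, key):
    points, owners = ring
    # First point at or after the key's hash, wrapping at the end of the ring
    idx = bisect.bisect_left(points, _hash(key)) % len(points)
    return owners[idx]

servers = [f"cache{i}:11211" for i in range(4)]  # hypothetical hosts
keys = [f"user:{i}" for i in range(10_000)]

before = build_ring(servers)
after = build_ring(servers + ["cache4:11211"])

moved = [k for k in keys if lookup(before, k) != lookup(after, k)]
moved_fraction = len(moved) / len(keys)
```

With five servers after the change, `moved_fraction` should land in the neighbourhood of one fifth, and every moved key now belongs to the new server. A naive `hash(key) % num_servers` scheme would instead remap most keys.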

Redis

Redis has built-in clustering with automatic sharding.

# Redis Cluster configuration
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 15000

# Automatic sharding, replication, and failover
# Your application sees a single logical database

Redis Cluster partitions keys across nodes automatically. It also supports replication for read scaling and failover.

Capacity Estimation: Memory-per-Key and Cluster Slot Planning

Memory per key differs significantly between Redis and Memcached.

Redis memory breakdown per key (string value):

| Component | Size |
| --- | --- |
| Key pointer | ~56 bytes (SDS allocator) |
| Value storage | Actual value size |
| Redis object overhead | ~16 bytes |
| Dictionary entry (if in hash) | ~32 bytes |
| Total minimum per key | ~72 bytes + value |

Memcached memory breakdown per key (string value):

| Component | Size |
| --- | --- |
| Key | Key length |
| Value | Actual value size |
| flags byte | 1 byte |
| CAS token (optional) | 8 bytes |
| Expiry time | 4 bytes |
| Overhead per item | ~25 bytes |
| Total minimum per item | ~25 bytes + key + value |

For a cache with 1 million keys, each storing a 200-byte value:

  • Redis string: ~72MB overhead + 200MB data = ~272MB total
  • Memcached: ~25MB overhead + key_space + 200MB data = ~225MB + key_space

Memcached wins on simple string workloads by 10-20% memory efficiency. Redis pays the overhead for richer data structures.
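The estimate is simple enough to script for your own key counts and value sizes. This sketch plugs in the approximate per-key overhead figures from the tables above; the 10-byte average key length is a made-up assumption for illustration, and real usage also depends on allocator and slab behaviour that this ignores.

```python
def estimate_bytes(num_keys, value_bytes, overhead_bytes, key_bytes=0):
    """Rough footprint: every key pays overhead + key + value."""
    return num_keys * (overhead_bytes + key_bytes + value_bytes)

NUM_KEYS = 1_000_000
VALUE_BYTES = 200

# ~72 bytes per key for Redis (the table's figure already covers the key)
redis_total = estimate_bytes(NUM_KEYS, VALUE_BYTES, overhead_bytes=72)

# ~25 bytes per item for Memcached, plus the key itself (assumed 10 bytes)
memcached_total = estimate_bytes(NUM_KEYS, VALUE_BYTES, overhead_bytes=25, key_bytes=10)

saving = 1 - memcached_total / redis_total
print(f"Redis ~{redis_total / 1e6:.0f} MB, Memcached ~{memcached_total / 1e6:.0f} MB, "
      f"saving {saving:.0%}")
```

Treat the result as a first-order sizing aid; measuring a representative sample of real keys is the only reliable check.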

Redis Cluster slot planning: 16,384 slots are divided across N master nodes. For a 6-node cluster (3 masters + 3 replicas), each master owns ~5,461 slots. Slot ownership determines which node stores which keys. The formula: slot = CRC16(key) % 16384. When planning capacity, ensure each master has headroom: if one master owns 5,461 slots and your average key is 1KB with 1,000 keys per slot, that node holds roughly 5.5GB of data (5,461 × 1,000 × 1KB). Plan for 2x headroom.
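The slot formula can be reproduced offline, which helps when checking how a key naming scheme spreads across masters. This sketch relies on Python's `binascii.crc_hqx` with an initial value of 0, which computes the XMODEM variant of CRC16 that Redis Cluster uses. The hash-tag handling (hashing only the text inside the first `{...}`) comes from the Redis Cluster specification rather than from this article, and the function name is illustrative.

```python
import binascii

NUM_SLOTS = 16384

def key_slot(key: str) -> int:
    """Compute the Redis Cluster hash slot for a key."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # A non-empty {tag} means only the tag is hashed
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % NUM_SLOTS

# Keys sharing a hash tag land in the same slot, so multi-key
# commands on them can run on a single node.
same_slot = key_slot("{user:1}:profile") == key_slot("{user:1}:cart")
```

On a live cluster, CLUSTER KEYSLOT returns the same number and is the authoritative check.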

Memcached cluster sizing: No slots; consistent hashing distributes keys. Target 150-200 virtual nodes per physical node for even distribution. With V virtual nodes per physical node, the coefficient of variation (CV) of per-node load is roughly 1/√V and is largely independent of the node count N. Keeping CV below 0.3 needs only V ≥ 12; at V = 150 the CV is about 0.08. Fewer virtual nodes means higher variance in distribution.
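The effect of virtual nodes on balance can be measured directly rather than taken on faith. The sketch below is an illustration under assumptions: node names are synthetic, and MD5 is used only because the consistent hashing example earlier uses it. It computes the share of the hash ring each node owns and reports the coefficient of variation of those shares.

```python
import hashlib
import statistics

def share_cv(num_nodes, vnodes):
    """CV of the fraction of the hash ring owned by each node."""
    space = 2 ** 128  # size of the MD5 output range
    points = sorted(
        (int(hashlib.md5(f"node{n}:{v}".encode()).hexdigest(), 16), n)
        for n in range(num_nodes)
        for v in range(vnodes)
    )
    owned = [0] * num_nodes
    prev = points[-1][0] - space  # makes the first arc wrap around the ring
    for point, node in points:
        owned[node] += point - prev  # the arc ending at a point belongs to its node
        prev = point
    shares = [o / space for o in owned]
    return statistics.pstdev(shares) / statistics.mean(shares)

for v in (1, 10, 50, 150):
    print(f"V={v:>3}: CV={share_cv(8, v):.3f}")
```

The printed CV should fall as V grows. The exact figures depend on the hash function and node names, so read them as indicative.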

Comparative Analysis Tables

Cache Invalidation Strategy Comparison

| Strategy | Write Latency | Read Consistency | Data Loss Risk | Complexity | Best For |
| --- | --- | --- | --- | --- | --- |
| Write-Through | High (sync) | Always fresh | None | Low | Write-heavy, consistency-critical |
| Write-Behind | Low (async) | Eventually consistent | High | High | Write spikes, high-throughput |
| Cache-Aside | Low (1 write) | Strong (on invalidation) | Low | Medium | Read-heavy, general purpose |

Redis vs Memcached: When to Use Which

| Factor | Redis | Memcached | Winner |
| --- | --- | --- | --- |
| Data structures | Strings, Lists, Sets, Hashes, Sorted Sets | Strings only | Redis |
| Memory efficiency | Higher per-key overhead (~72 bytes + value) | Lower per-key overhead (~25 bytes + value) | Memcached |
| Persistence | RDB snapshots, AOF logs | None (pure memory) | Redis |
| Clustering | Built-in cluster with slots | Client-side consistent hashing | Redis |
| Threading model | Single-threaded command execution (no locks) | Multi-threaded (fine-grained locking) | Memcached (throughput), Redis (consistency) |
| Atomic operations | INCR, SETNX, Lua scripts | incr/decr, add, CAS tokens | Redis |
| Pub/Sub | Native support | Not supported | Redis |
| Latency | Sub-millisecond, predictable | Sub-millisecond, predictable | Tie |
| Operational complexity | Higher (config, persistence) | Lower (stateless) | Memcached |
| Production maturity | Very mature at scale | Mature | Tie |

Eviction Policy Comparison

| Policy | Redis Support | Memcached Support | Behavior |
| --- | --- | --- | --- |
| LRU (Least Recently Used) | allkeys-lru, volatile-lru | Built-in default (per slab class) | Evict least recently accessed |
| LFU (Least Frequently Used) | allkeys-lfu, volatile-lfu | Not supported | Evict least frequently accessed |
| TTL | volatile-ttl | Not supported | Evict shortest TTL first |
| Random | allkeys-random, volatile-random | Not supported | Random eviction |
| No eviction | noeviction | -M flag | Return error on OOM |

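Redis LFU advantage: LFU tracks frequency, not just recency. Under LRU, a one-off burst of reads (a batch job touching many cold keys, for example) can push genuinely hot keys out of the cache. LFU keeps the hot keys because their access counts are higher, and its counters decay over time so that data which was hot for a period and then went cold eventually becomes evictable. Memcached does not have this capability.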

Connection Management Trade-offs

| Aspect | Single Connection | Connection Pool | Presized Pool |
| --- | --- | --- | --- |
| Setup cost | High (connect latency) | Medium (pool creation) | Low |
| Concurrent requests | Poor (blocks) | Good | Best |
| Resource usage | Low (1 socket) | Medium (N sockets) | Medium |
| Complexity | Simple | Moderate | Simple |
| Best for | Scripts, short-lived | Web applications | High-throughput |

When to Use / When Not to Use

Memcached makes sense for simple string caching — HTML fragments, API responses, session data — where maximum memory efficiency matters and you do not need complex data structures. It scales horizontally via consistent hashing, and the operational surface is small. Use it for database query results that fit naturally in key-value form, and for things that do not change often and benefit from sub-millisecond access.

Redis makes sense when you need lists, sets, sorted sets, or hashes; when you want optional persistence; when you need atomic counters for rate limiting or distributed locks; when you need pub/sub for real-time features or chat; when you want built-in clustering; or when you are building leaderboards, job queues, or caching with complex data access patterns. Lua scripting adds atomic multi-step operations without race conditions.
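As an illustration of the atomic-counter use case, here is a minimal fixed-window rate limiter. It is a sketch under assumptions: `client` is any object exposing redis-py style `incr(key)` and `expire(key, seconds)` methods, and the key naming scheme and limits are made up for the example.

```python
import time

def allow_request(client, user_id, limit=100, window_seconds=60, now=None):
    """Fixed-window limiter: at most `limit` requests per window per user."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)        # index of the current window
    key = f"ratelimit:{user_id}:{window}"      # hypothetical key scheme
    count = client.incr(key)                   # INCR is atomic in Redis
    if count == 1:
        # First hit in this window: let the counter clean itself up later
        client.expire(key, window_seconds * 2)
    return count <= limit
```

INCR followed by EXPIRE is two round trips, so a crash between them would leave a counter without a TTL. Wrapping both steps in a Lua script or a MULTI/EXEC transaction closes that gap. A fixed window also allows short bursts at window boundaries, which a sliding-window variant avoids.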


A Practical Decision Framework

Do you need anything beyond simple string key-value?
  YES -> Redis
  NO  -> Does memory efficiency matter more than features?
          YES -> Memcached
          NO  -> Redis (for easier operations and clustering)

If you are not sure, start with Redis. The extra memory usage is negligible for most workloads. If you later find memory is tight and profiling shows Memcached is meaningfully better, switch.


Production Failure Scenarios and Trade-off Analysis

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Redis/Memcached OOM | Cache returns errors; application falls back to database | Monitor used_memory/maxmemory ratio; set alerts at 70% threshold |
| Redis fork for RDB save | Brief blocking during fork; memory use can grow toward 2x during copy-on-write under heavy writes | Schedule RDB saves during low-traffic periods; use AOF instead for persistence (AOF rewrites also fork, so keep memory headroom) |
| Memcached restart | All data lost immediately (no persistence) | Design for cold cache; implement application-level cache warming |
| Redis replica lag | Reads from replica may return stale data | Monitor replication offsets in INFO replication (master_repl_offset against each replica's offset and lag); read from primary for consistency-critical data |
| Connection pool exhaustion | Requests time out waiting for a connection | Size connection pool appropriately; implement request queuing with timeout |
| Single-threaded Redis blocking | Long commands block all other commands | Avoid KEYS and SMEMBERS on large sets; use SCAN/SSCAN and pipeline/batch operations |
| Memcached multi-thread contention | High CPU under heavy load | Scale horizontally with consistent hashing; consider Redis for complex workloads |

Detailed Failure Scenarios

Case 1: Redis OOM During Peak Traffic

What happened: A Redis instance reached maxmemory during a flash sale. Eviction policy was noeviction. Redis started returning errors instead of serving requests.

Root cause: The maxmemory-policy was set to noeviction (return error on OOM) instead of allkeys-lru. Additionally, the application was not handling cache errors gracefully — it failed fast instead of falling back to the database.

Impact: 12% of requests failed during a 45-minute window. The database was under-provisioned for the fallback load and also started timing out.

Lesson learned: Always use an eviction policy that allows Redis to keep serving. Implement circuit breakers so the application falls back to the database gracefully when cache errors spike. Monitor evicted_keys and used_memory metrics.
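The fallback half of that lesson can be sketched as a small circuit breaker around cache reads. Everything here is illustrative: `cache_get` and `db_get` stand in for your own cache and database accessors, and the thresholds are arbitrary defaults, not recommendations.

```python
import time

class CacheCircuitBreaker:
    """Skip the cache for a while after repeated cache errors."""

    def __init__(self, cache_get, db_get, max_failures=5,
                 reset_after=30.0, clock=time.monotonic):
        self.cache_get = cache_get    # callable(key) -> value or None
        self.db_get = db_get          # callable(key) -> value
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None         # None means the circuit is closed

    def get(self, key):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return self.db_get(key)          # open: bypass the cache
            # Half-open: allow one trial call; a single failure re-opens
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            value = self.cache_get(key)
        except Exception:                        # any cache error counts
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return self.db_get(key)
        self.failures = 0
        return value if value is not None else self.db_get(key)
```

The breaker only protects the application from a failing cache. As the incident shows, the database still has to be provisioned, or rate limited, for the fallback load.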

Case 2: Memcached Restart Storm

What happened: A Memcached node was restarted for a configuration update. Within 90 seconds, the database was overwhelmed by cold-cache requests from all application servers simultaneously.

Root cause: No cache warming strategy. All 50 application instances found the restarted node empty and hit the database for the same popular keys simultaneously. The database had no protection against this concurrent access pattern.

Impact: Average response time jumped from 15ms to 8,400ms. Database CPU hit 100%. The restart took 3 minutes longer than expected because the database was too overloaded to respond to health checks.

Lesson learned: Implement cache warming before taking a cache node out of rotation. Use consistent hashing with virtual nodes so individual key popularity does not spike on single nodes after restart. Consider using a local L1 cache (in-memory LRU) in front of Memcached to absorb cold-start load.

Case 3: Redis Stalled by a Keyspace Scan During Peak Hours

What happened: A developer ran redis-cli --bigkeys on a production Redis instance during peak hours to find memory-heavy keys.

Root cause: The --bigkeys flag walks the entire keyspace with SCAN and issues a size query for every key it finds. On a 50GB Redis instance with millions of keys, that extra command volume consumed 15 seconds of CPU on the single command thread and starved all other commands during that window.

Impact: P99 latency spiked from 5ms to 12,000ms. The application saw 200+ connection timeouts. The on-call engineer spent 45 minutes diagnosing why Redis was suddenly unresponsive.

Lesson learned: Do not run keyspace-wide introspection (--bigkeys, MEMORY USAGE across many keys, KEYS *) against a production primary during peak hours. Run it against a replica or in a maintenance window, throttle it (redis-cli accepts -i to sleep between SCAN batches), and never use KEYS *. For a coarse memory overview, use MEMORY STATS and INFO memory instead.

Common Pitfalls and Anti-Patterns

1. Using KEYS Command in Production

The KEYS command scans all keys and blocks Redis. Never use it in production.

# BAD: KEYS blocks Redis for seconds
all_keys = redis.keys("user:*")

# GOOD: Use SCAN for production
cursor = 0
while True:
    cursor, keys = redis.scan(cursor, match="user:*", count=100)
    process(keys)
    if cursor == 0:
        break

2. Storing Large Values Without Compression

Large values consume memory disproportionately and slow down operations.

# BAD: Storing large uncompressed data
redis.set("page:123", large_html_content)  # 500KB+ per page

# GOOD: Compress large values
import zlib
compressed = zlib.compress(large_html_content.encode())
redis.setex("page:123", 3600, compressed)
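The read path has to undo the compression, and small values are usually not worth compressing at all. One way to handle both is a one-byte marker in front of the payload. This is a sketch: the threshold and marker values are arbitrary assumptions, and the resulting bytes would be passed to `setex` and read back with `get` as in the example above.

```python
import zlib

COMPRESS_THRESHOLD = 1024        # bytes; hypothetical cut-off
RAW, ZLIB = b"\x00", b"\x01"     # one-byte markers prepended to the payload

def pack(text: str) -> bytes:
    """Encode text for the cache, compressing only when it is large."""
    raw = text.encode("utf-8")
    if len(raw) < COMPRESS_THRESHOLD:
        return RAW + raw
    return ZLIB + zlib.compress(raw)

def unpack(blob: bytes) -> str:
    """Reverse pack(): inspect the marker, decompress if needed."""
    marker, payload = blob[:1], blob[1:]
    if marker == ZLIB:
        payload = zlib.decompress(payload)
    return payload.decode("utf-8")

page = "<p>hello</p>" * 10_000
stored = pack(page)              # what would be written to the cache
restored = unpack(stored)        # what a reader gets back
```

The marker keeps old and new entries readable if the threshold changes later, at the cost of one byte per value.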

3. Not Using Connection Pooling

Each operation creating a new connection adds overhead.

# BAD: New connection each time
def get_user(user_id):
    r = redis.Redis(host='localhost', port=6379)  # Connection every call
    return r.get(f"user:{user_id}")

# GOOD: Reuse connection
pool = redis.ConnectionPool(host='localhost', port=6379, max_connections=50)

def get_user(user_id):
    r = redis.Redis(connection_pool=pool)
    return r.get(f"user:{user_id}")

4. Ignoring Memcached Persistence Limitations

Memcached has no persistence. Data is lost on restart.

# BAD: Assuming Memcached persists data
memcached.set("session:123", session_data)
# ... server restarts ...
session = memcached.get("session:123")  # None - data gone

# GOOD: Design for cold start
session = memcached.get("session:123")
if not session:
    session = load_from_database()  # Always have fallback
    memcached.set("session:123", session, time=3600)

5. Using Redis Single Instance for Everything

Redis's single-threaded nature means CPU-bound operations block everything.

# BAD: CPU-heavy operation in Redis
# This blocks all other commands
redis.sort("large-set")  # O(N log N) - blocks Redis

# GOOD: Move CPU work to application
data = list(redis.sscan_iter("large-set"))  # Read the set incrementally
sorted_data = sorted(data)  # Application handles sorting

Quick Recap

Redis offers data structures (lists, sets, hashes, sorted sets) that Memcached cannot match. Memcached is more memory-efficient for simple strings. Redis persistence (RDB/AOF) can survive restarts; Memcached starts empty. Redis Cluster provides automatic sharding; Memcached requires client-side sharding. Both evict by LRU when memory is full; Redis adds LFU, TTL-based, and random policies. Redis's single-threaded command execution is a feature, since individual commands never race, but CPU-heavy operations block everything.

Best Practices Summary

Redis: use connection pooling (never a new connection per request); set maxmemory and an eviction policy like allkeys-lru for most caching workloads; never run KEYS, SMEMBERS, or SORT on large sets; use pipelining for batch operations; enable slow log monitoring at 10ms threshold; use hashes for objects instead of JSON strings; set reasonable TTLs; rename dangerous commands in production (rename-command FLUSHDB ""); monitor memory fragmentation (mem_fragmentation_ratio > 1.5 indicates problems); use Lua scripts for atomic multi-step operations.

Memcached: use consistent hashing for sharding with 150-200 virtual nodes per physical node; use consistent key naming with service prefixes (users:123, sessions:abc); store serialized data efficiently with MessagePack or Protocol Buffers instead of JSON for 20-30% smaller payloads; tune slab sizing to your values (the growth factor -f and minimum chunk size -n control chunk sizes, while the 1MB default is the maximum item size, -I); monitor evictions, since high rates indicate the cache is too small or TTLs are misconfigured; keep UDP disabled (the default since 1.5.6) unless you have a specific, firewalled need for it; use multi-get for batch operations.

Observability Checklist

Security Checklist

  • Enable authentication — Redis 6+ supports ACLs. Memcached supports SASL authentication. Never run without auth in production.
  • Bind to internal IPs only: bind 127.0.0.1 or a specific private interface address (for example bind 10.0.0.5; bind takes addresses, not CIDR ranges) to prevent unauthorized access. No public IP exposure.
  • Encrypt in transit — Use TLS for Redis and Memcached if crossing network boundaries. Redis 6+ has native TLS support.
  • Limit commands: rename or disable dangerous commands, e.g. rename-command FLUSHDB "", rename-command CONFIG "", rename-command KEYS "". On Redis 6+, ACL rules can restrict commands per user instead.
  • Set maxmemory — Prevent cache from consuming all available RAM and causing system instability.
  • Use firewall rules — Restrict access to cache ports (6379 for Redis, 11211 for Memcached) to application servers only.

Metrics to Track

Redis:

# Core metrics via Redis INFO
INFO memory  # used_memory, maxmemory, mem_fragmentation_ratio
INFO stats   # total_commands_processed, keyspace_hits, keyspace_misses
INFO replication  # master_link_status, master_last_io_seconds_ago, master_repl_offset
INFO clients  # connected_clients, blocked_clients

# Calculate hit rate
# hit_rate = keyspace_hits / (keyspace_hits + keyspace_misses)

Memcached:

# Stats command
stats
# Items: curr_items, total_items, evictions
# Memory: bytes, limit_maxbytes
# Hit rate: get_hits, get_misses

# Calculate hit rate
# hit_rate = get_hits / (get_hits + get_misses)
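Both formulas fit in a few lines of code. The sketch assumes the stats have already been fetched into dictionaries, for example from redis-py's `info("stats")` or from a Memcached client's stats call. The function names are illustrative, and Memcached values are converted with `int()` because some clients return them as strings.

```python
def hit_rate(hits, misses):
    """Fraction of lookups served from cache; 0.0 when there is no traffic."""
    total = hits + misses
    return hits / total if total else 0.0

def redis_hit_rate(info_stats):
    """info_stats: dict shaped like the output of INFO stats."""
    return hit_rate(info_stats["keyspace_hits"], info_stats["keyspace_misses"])

def memcached_hit_rate(stats):
    """stats: dict shaped like the output of the Memcached stats command."""
    return hit_rate(int(stats["get_hits"]), int(stats["get_misses"]))

# Example with made-up counters
example = redis_hit_rate({"keyspace_hits": 900, "keyspace_misses": 100})
```

These counters are cumulative since server start, so for alerting it is more useful to compute the rate over the difference between two samples than over the lifetime totals.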

Logs to Capture

import structlog
import time

logger = structlog.get_logger()

class CacheMetrics:
    def __init__(self, redis_client):
        self.redis = redis_client
        self.start_time = time.time()

    def track_operation(self, operation, key, latency_ms, hit=True):
        # latency_ms is measured by the caller around the cache call
        logger.info("cache_operation",
            operation=operation,
            key=key,
            cache_hit=hit,
            latency_ms=latency_ms
        )

    def log_memory_pressure(self):
        info = self.redis.info('memory')
        used = info['used_memory']
        maxmem = info['maxmemory']

        if maxmem > 0 and used / maxmem > 0.8:
            logger.warning("cache_memory_critical",
                used_mb=used / 1024 / 1024,
                max_mb=maxmem / 1024 / 1024,
                fragmentation=info.get('mem_fragmentation_ratio', 1))

Interview Questions

1. Redis uses more memory per key than Memcached for simple strings. How would you optimize a Redis deployment for a memory-constrained environment?

Memcached wins on raw memory efficiency for simple strings because it has minimal per-key overhead. For Redis in memory-constrained environments, group related fields into hashes instead of creating one key per field: HSET user:123 name Alice email alice@example.com stores all fields under one Redis key with shared overhead. Small hashes and lists use a compact encoding (ziplist, replaced by listpack in Redis 7) as long as they stay under the configured size thresholds, so keep them small. Set maxmemory conservatively and enable maxmemory-policy allkeys-lru. Use the MEMORY USAGE command to identify large keys. For pure string caching where memory is critical, Memcached remains the pragmatic choice.

2. Your Redis instance shows high CPU usage despite moderate request rates. What is likely happening?

Redis executes commands on a single thread, so one long-running command blocks everything behind it. SLOWLOG GET 10 returns the ten most recent entries from the slow log, which records commands slower than slowlog-log-slower-than (10 ms by default). Common culprits: SORT on large sets, KEYS pattern scans (never use in production), SMEMBERS on large sets, ZRANGEBYSCORE on large sorted sets without LIMIT, or FLUSHDB during peak traffic. For complex operations on large datasets, move the work to the application side: fetch the raw data and process it there. Also check fork cost if you use RDB persistence or AOF rewrites. fork() blocks the main thread while page tables are copied, so the latency spike grows with dataset size; latest_fork_usec in INFO stats shows how long the last fork took.

3. What are the trade-offs between Redis RDB snapshots and AOF persistence for a caching workload?

RDB snapshots are point-in-time dumps: compact and fast to restore, but you lose everything written since the last snapshot if the instance crashes. AOF logs every write operation: better durability and configurable fsync intervals, but larger files, typically slower restores, and some write overhead. For a pure cache where the database is the source of truth, RDB (or no persistence at all) is usually sufficient. If Redis restarts with an empty cache, the application repopulates it from the database. Enable AOF only when you need durability guarantees for cached data, or when coming back with a warm cache matters more than the storage overhead. The appendfsync everysec setting is a good balance: roughly one second of writes at risk in the worst case, and much faster than always.

4. Memcached is multi-threaded but you observe high CPU and low throughput. What is happening?

Lock contention is the usual suspect. Early Memcached versions serialized most operations behind a global cache lock; newer releases use finer-grained locking, but threads can still contend when a handful of hot keys absorb most of the traffic or when the thread count (-t) exceeds the available cores. High CPU with low throughput is the signature. Workarounds: match -t to the core count, partition your keys across multiple Memcached instances to reduce per-instance contention, pool and reuse client connections instead of opening one per request, or switch to Redis, where single-threaded command execution avoids lock contention for most workloads. Start with the stats command (threads, conn_yields, curr_connections) and confirm with an OS-level profiler, because standard stats output has no lock_ratio or wait_ratio counter.

5. How would you design a rate limiter using Redis? What are the trade-offs of different approaches (token bucket vs sliding window vs fixed window)?

Fixed window is the simplest approach: INCR a counter keyed as rate_limit:{client}:{window}, where window is the timestamp rounded down to the interval, and EXPIRE it with a TTL equal to the window length on the first hit. If the count exceeds the limit, reject the request. Sliding window uses a sorted set with timestamps as scores: ZREMRANGEBYSCORE drops entries older than the window, ZCARD counts the rest, and ZADD records the new request. Token bucket stores a token count and a last-refill timestamp per client (a hash works well) and refills tokens based on elapsed time on each request. Trade-offs: fixed window is cheap but allows up to twice the limit around a window boundary; sliding window is the most accurate but the most expensive in memory and commands; token bucket allows controlled bursts up to the bucket size. For distributed rate limiting, atomicity is essential, so wrap the check-and-update in a Lua script.

6. A Redis replica falls behind the master by 30 seconds during peak traffic. What are the risks and how do you mitigate them?

A 30-second replica lag means any read from the replica may return data that is up to 30 seconds stale. For many use cases this is acceptable; for leaderboards, like counts, or session data it causes visible inconsistency: outdated counts, missing likes, or stale leaderboard positions. Failing over to a lagging replica also loses the writes it has not received yet. Mitigation: monitor the gap between master_repl_offset on the primary and each replica's acknowledged offset (the slaveN lines in INFO replication report offset and lag), plus master_last_io_seconds_ago and master_link_down_since_seconds on the replica. If lag is caused by network or primary load, fix the root cause first. For read-heavy workloads that can tolerate some staleness, apply a lag threshold in your application and read from the primary when lag exceeds your SLA. For consistency-critical reads (like financial data), always read from the primary. Consider read timeouts on replicas: for those reads, failing fast is better than serving stale data.

7. What is the thundering herd problem and how does it affect both Redis and Memcached? How would you prevent it?

The thundering herd problem occurs when a popular cache key expires or a cache comes up empty, and thousands of requests simultaneously try to rebuild the same key from the database. Both Redis and Memcached are exposed to it because they are typically used as shared caches. Prevention strategies: (1) probabilistic early expiration (XFetch), where each reader may refresh a key slightly before it expires, with a probability that rises as expiry approaches and as recomputation gets more expensive; (2) distributed locks, so only one request refreshes the cache while others wait and retry; (3) cache warming, which populates the cache before expected traffic spikes; (4) request coalescing, where concurrent requests for the same key share one database query and its result. For Memcached specifically, a local in-process L1 cache in front of it absorbs most thundering herd patterns because hot keys stay in process memory.
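Probabilistic early expiration is easier to reason about with the decision rule written out. The sketch below follows the XFetch rule (refresh when now - delta * beta * ln(rand) >= expiry, where delta is the time the last recomputation took). The function and parameter names are hypothetical, and storing delta and the expiry alongside the cached value is left to the caller.

```python
import math
import random


def should_refresh_early(now: float, expiry: float, recompute_s: float,
                         beta: float = 1.0, rng=random.random) -> bool:
    """Return True if this reader should rebuild the value before it expires.

    recompute_s: how long the last recomputation took (delta in XFetch).
    beta: values above 1.0 favor earlier refreshes, below 1.0 later ones.
    """
    # 1.0 - rng() lies in (0, 1], which avoids math.log(0).
    r = 1.0 - rng()
    # -log(r) is >= 0, so the "virtual now" is pushed forward by a random
    # amount that scales with how expensive the value is to rebuild.
    return now - recompute_s * beta * math.log(r) >= expiry
```

Far from expiry almost no reader refreshes; close to expiry a few do, and once the key has actually expired every reader returns True, so the rule degrades to ordinary cache-aside behavior.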

8. Compare Redis Cluster hashing with Memcached consistent hashing. What are the trade-offs?

Redis Cluster uses hash slots: 16,384 slots, with each key mapped to CRC16(key) % 16384. Each master node owns a subset of slots. When you add or remove nodes, you move slots between nodes with redis-cli --cluster reshard or rebalance; the migration runs online, and clients follow MOVED and ASK redirections while keys are in flight. Memcached itself has no clustering. Its clients implement consistent hashing, usually with virtual nodes (typically 150-200 per physical node). When you add or remove a node, only about K/N keys are remapped, where K is total keys and N is nodes, which is similar in cost to moving a proportional share of slots. Key differences: Redis Cluster needs at least 3 master nodes and has more moving parts, but it keeps data during resharding and provides replication and failover. With Memcached, remapped keys simply become cache misses, and there are no special nodes, just hash ring membership in the client. Use Redis for complex workloads needing replication, multiple data types, and built-in HA. Use Memcached when you need simple, stateless sharding and can manage failover at the application layer.
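The slot calculation is small enough to write out. This sketch assumes Python's binascii.crc_hqx with an initial value of 0, which computes the CRC16 variant (XMODEM) that Redis Cluster uses; key_hash_slot is a hypothetical helper name. It also handles hash tags, where only the text between the first { and the following } is hashed so related keys land in the same slot.

```python
import binascii


def key_hash_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot (0..16383) for a key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; "foo{}bar" is hashed as a whole.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % 16384


if __name__ == "__main__":
    for k in (b"user:123", b"{user:123}:profile", b"{user:123}:sessions"):
        print(k.decode(), key_hash_slot(k))
```

The three example keys all map to the same slot, which is what makes multi-key commands on them legal in cluster mode. On a live cluster, CLUSTER KEYSLOT returns the same value and is the authoritative check.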

9. How do you choose between Redis Strings and Redis Hashes for storing object data? When does each perform better?

Strings store serialized objects (JSON, msgpack). A single string key holds the entire object. Hashes store field-value pairs directly under one key, and each field can be read or written on its own. Choose Strings when the entire object is always read or written as a whole, when you store pre-serialized data from external systems, or when the value is a single counter or blob that benefits from string commands like INCR or APPEND. Choose Hashes when you frequently read or write individual fields (partial access patterns), when you want to avoid serialization and deserialization overhead, or when you need atomic per-field updates such as HINCRBY. Note that TTLs apply to the whole key; per-field expiration only arrived in Redis 7.4. Memory: small hashes use a compact encoding and are usually cheaper than one key per field. Once a hash exceeds the compact-encoding thresholds it converts to a real hash table with per-field overhead, and can use more memory than the same object stored as JSON in a string. Benchmark your specific access patterns. Rule of thumb: if you access < 50% of fields at a time, hashes usually win.

10. Your application uses both Redis and Memcached. When would you use each? Design a caching architecture that uses both effectively.

A common pattern is L1 (local in-memory) + L2 (Memcached) + L3 (Redis) + Database. Memcached handles simple string caching for page fragments, rendered HTML, and API responses that benefit from its memory efficiency (e.g., product:123 → HTML fragment). Redis handles anything requiring data structures (sorted sets for leaderboards, lists for queues, hashes for user objects, sets for "users who liked this post"), session storage with TTL, distributed locks (SET with NX and an expiry rather than bare SETNX, so a crashed holder cannot keep the lock forever), rate limiting counters, and pub/sub for real-time features. The architectural principle: Memcached is a dumb, fast cache for immutable or rarely-changed data. Redis is a data store that happens to cache well. Start with Redis for everything; add Memcached only when memory is demonstrably constrained and profiling shows Memcached is meaningfully more efficient for specific workloads.

11. When would you choose Memcached over Redis?

Choose Memcached when you need pure string key-value caching and memory efficiency is critical. If your data fits naturally as key-value pairs, you need nothing beyond get, set, and simple incr/decr counters, you can live without data structures and persistence, and your team prefers operational simplicity, Memcached wins. It is also the right choice when horizontal scaling via client-side consistent hashing is acceptable and you do not need built-in clustering. For everything beyond simple string caching (sorted sets, hashes, pub/sub, Lua scripting, or persistence), use Redis.

12. How does Redis handle cache stampede prevention?

Redis has no built-in stampede (thundering herd) protection, but it provides the primitives to build it. A lock taken with SET using the NX and PX options ensures only one request refreshes a hot key while others wait or serve the stale value. Probabilistic early expiration (XFetch) has readers refresh a key shortly before it expires, with a probability that rises as expiry approaches, so a hot key is rarely rebuilt by everyone at once. Application-level patterns like merging concurrent requests for the same key into a single database query also help. Memcached also lacks built-in stampede prevention; its add command, which succeeds only if the key does not exist, supports the same lock pattern, and a local in-process L1 cache in front of Memcached absorbs hot key access spikes.

13. What are the trade-offs between write-through and write-behind caching?

Write-through writes synchronously to both cache and database — strong consistency, simple reads, but higher write latency and potential write amplification. Write-behind writes to cache only and async flushes to database — fast writes, handles spikes, but risks data loss if cache fails before database write and requires additional infrastructure (write queue, retry logic). Cache-aside (lazy loading) is the most common pattern: writes go directly to database, cache is invalidated on write; reads populate cache on miss. Best for read-heavy workloads where the database is the source of truth.
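The difference between the two write paths can be sketched with plain dicts standing in for the cache and the database. This is a toy model under that assumption; the class names are hypothetical, and a real write-behind implementation would need a durable queue and retry logic.

```python
from collections import deque


class WriteThroughCache:
    """Every write hits the database and the cache before returning."""

    def __init__(self, db: dict):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        self.db[key] = value       # synchronous database write (the slow part)
        self.cache[key] = value    # cache is updated in the same call


class WriteBehindCache:
    """Writes return after updating the cache; the database catches up later."""

    def __init__(self, db: dict):
        self.db = db
        self.cache = {}
        self.pending = deque()     # in-memory queue: lost if the process dies

    def write(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))

    def flush(self):
        # In production this would run in a background worker with retries.
        while self.pending:
            key, value = self.pending.popleft()
            self.db[key] = value
```

Between write() and flush() the write-behind database is behind the cache, which is exactly the window in which a cache failure loses data.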

14. How do you estimate cache capacity for a given workload?

Estimate by measuring your working set: the number of unique keys accessed within a typical traffic window multiplied by the average entry size (key + value + per-entry overhead). The ~72 bytes per key for Redis and ~25 bytes per item for Memcached are rough overhead figures; actual overhead varies by version, encoding, and allocator, so measure on your own data. Target a cache size where your working set fits with 20-30% headroom for traffic spikes. Monitor eviction rates: evictions > 1% of requests indicate the cache is too small or TTLs are misconfigured. Use MEMORY USAGE in Redis to identify large keys. For Memcached, stats shows curr_items and evictions. Size for peak + 20%, not average load.
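A worked example helps sanity-check the arithmetic. The numbers below are made up for illustration (2 million keys, 40-byte keys, 600-byte values), and the 72-byte overhead is only the rough Redis figure quoted above; estimate_cache_bytes is a hypothetical helper.

```python
import math


def estimate_cache_bytes(unique_keys: int, avg_key_bytes: int, avg_value_bytes: int,
                         per_entry_overhead_bytes: int, headroom: float = 0.25) -> int:
    """Working set size plus headroom, in bytes."""
    entry_bytes = avg_key_bytes + avg_value_bytes + per_entry_overhead_bytes
    return math.ceil(unique_keys * entry_bytes * (1 + headroom))


if __name__ == "__main__":
    needed = estimate_cache_bytes(
        unique_keys=2_000_000,
        avg_key_bytes=40,
        avg_value_bytes=600,
        per_entry_overhead_bytes=72,   # rough figure, measure your own
        headroom=0.25,
    )
    print(f"{needed / 1024 ** 3:.2f} GiB")
```

Here each entry costs 712 bytes, the working set is about 1.42 GB, and with 25% headroom the target is about 1.78 GB. Treat the result as a starting point and correct it against MEMORY USAGE samples and observed eviction rates.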

15. What monitoring metrics matter most for a Redis or Memcached deployment?

For Redis: used_memory/maxmemory ratio (alert at 70%), keyspace_hits/keyspace_misses for hit rate, evicted_keys and expired_keys for eviction pressure, master_repl_offset versus each replica's offset for replica lag, mem_fragmentation_ratio for memory fragmentation, the slow log for commands > 10ms, and connected_clients for connection pool pressure. For Memcached: get_hits/get_misses for hit rate, curr_items and evictions for cache pressure, bytes/limit_maxbytes for memory usage, and curr_connections and conn_yields for connection pressure.

16. How does consistent hashing help with cache sharding?

Consistent hashing distributes keys across cache nodes so that adding or removing a node remaps only about K/N keys (where K is total keys and N is nodes), minimizing cache misses during topology changes. Memcached clients use consistent hashing with virtual nodes (150-200 per physical node) for this purpose. Redis Cluster instead uses hash slots (16,384 total, calculated as CRC16(key) % 16384) and migrates slots when nodes are added or removed. Both approaches limit the blast radius of node additions or failures. Redis Cluster tracks slot ownership on the server side and ships tooling to move slots, while Memcached relies on the client library to implement the hash ring.
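A minimal sketch of a client-side hash ring with virtual nodes shows why only about K/N keys move. It is not the ketama algorithm that real Memcached clients ship: the HashRing class, the SHA-256 hash choice, and the node names are assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes=(), vnodes: int = 150):
        self.vnodes = vnodes
        self._points = []    # sorted positions on the ring
        self._owner = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        # First 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def get_node(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing(["cache-a", "cache-b", "cache-c"])
    keys = [f"user:{i}" for i in range(10_000)]
    before = {k: ring.get_node(k) for k in keys}
    ring.add_node("cache-d")
    moved = sum(1 for k in keys if ring.get_node(k) != before[k])
    print(f"{moved / len(keys):.1%} of keys moved")
```

Going from three nodes to four moves roughly a quarter of the keys, and every moved key lands on the new node. With naive hash(key) % N sharding, most keys would change nodes instead.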

17. What are the failure modes of a distributed cache and how do you mitigate them?

Key failure modes and mitigations:

  • OOM (cache returns errors, application falls back to DB): set maxmemory-policy allkeys-lru and alert at 70% memory.
  • Cold start (cache restarts empty, DB overwhelmed): warm the cache before putting nodes back into rotation.
  • Connection pool exhaustion (requests time out): size the pool appropriately and implement request queuing with a timeout.
  • Replica lag (stale reads): monitor the offset gap between primary and replicas, and read from the primary for consistency-critical data.
  • Redis fork blocking (RDB saves and AOF rewrites pause command processing while the process forks): schedule RDB saves during low-traffic windows and use AOF with appendfsync everysec.
  • Single-threaded blocking (long commands block all others): avoid KEYS and SMEMBERS on large sets in production.
  • Memcached lock contention (high CPU, low throughput under load): partition across more instances and reuse pooled connections.

18. How does Redis LFU eviction policy work differently from LRU, and what are the specific use cases where LFU outperforms LRU?

LRU (Least Recently Used) evicts based on access recency: the key that has gone longest without an access is removed first. LFU (Least Frequently Used) evicts based on access frequency: the least frequently accessed key is removed first. The difference matters when one-off accesses mix with a stable hot set. With LRU, a single recent access (from a scan or a batch job, for example) keeps a key alive even if it is otherwise never touched, and can push out genuinely hot keys. With LFU, frequency counters decay over time, so a key accessed 1,000 times last week but not this week will eventually lose to a key accessed 10 times this week. Use LFU when: your working set changes gradually (popular items stay popular), you want to preserve frequently-accessed data during traffic spikes, or you need to prevent cold data from being retained by one-time access events. Redis exposes LFU through the allkeys-lfu and volatile-lfu policies and tunes it with lfu-log-factor (how quickly the counter saturates) and lfu-decay-time (minutes between counter decrements); new keys start at a small non-zero counter so they are not evicted immediately. Memcached does not support LFU.
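Redis does not keep an exact hit count per key. It keeps an 8-bit counter that is incremented probabilistically, so the counter grows roughly logarithmically with the number of accesses. The sketch below mirrors that increment rule as I understand it from the Redis source (initial value 5, default lfu-log-factor of 10); decay is left out, and the Python names are hypothetical.

```python
import random

LFU_INIT_VAL = 5  # counter value given to a newly created key


def lfu_log_incr(counter: int, log_factor: int = 10, rng=random.random) -> int:
    """Probabilistically increment an 8-bit LFU counter on access."""
    if counter >= 255:
        return 255  # saturated
    baseval = max(counter - LFU_INIT_VAL, 0)
    # The higher the counter, the less likely another increment becomes.
    p = 1.0 / (baseval * log_factor + 1)
    return counter + 1 if rng() < p else counter


def simulate_hits(hits: int, log_factor: int = 10, seed: int = 0) -> int:
    """Counter value after `hits` accesses to a fresh key (no decay)."""
    rng = random.Random(seed)
    counter = LFU_INIT_VAL
    for _ in range(hits):
        counter = lfu_log_incr(counter, log_factor, rng.random)
    return counter


if __name__ == "__main__":
    for hits in (100, 1_000, 100_000):
        print(hits, simulate_hits(hits))
```

Running it shows the counter climbing quickly at first and then flattening, which is why a larger lfu-log-factor is needed to distinguish keys with very high access counts.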

19. Describe the trade-offs of using Redis Pipeline versus MGET/MSET versus Lua scripts for batch operations.

A Redis pipeline batches multiple commands into a single network round trip: the client sends N commands, Redis processes them sequentially, and the client receives N responses. There is no atomicity guarantee (commands from other clients can interleave). It is best for improving throughput on bulk reads and writes when each command is independent. MGET and MSET are native batch commands: MGET key1 key2 key3 retrieves multiple keys in one command, which saves per-command parsing and dispatch compared with pipelining individual GETs, and the whole command executes atomically. In Redis Cluster, all keys in one MGET or MSET must hash to the same slot. Lua scripts are atomic: Redis executes the entire script without interleaving other commands, making them safe for read-check-write patterns. Scripts are compiled once and cached (invoke them with EVALSHA), cannot use blocking commands, and block every other client while they run, so keep them short. Trade-off summary: pipeline for throughput on independent ops, MGET/MSET for native batch efficiency, Lua for atomic multi-step operations.

20. Your team is considering moving from Memcached to Redis. What is your decision framework and what risks do you identify during migration?

Decision framework: start with Redis for new projects. If the team has operational experience with Memcached and the workload is purely simple string key-value, stay with Memcached. If you need data structures (sets, sorted sets, hashes, lists), persistence, pub/sub, or built-in clustering, move to Redis. Migration risks: inconsistent reads during the transition if both caches serve traffic and their contents diverge; increased operational complexity, since Redis requires monitoring for fork times, AOF/RDB trade-offs, and memory fragmentation; different connection pool sizing, because Memcached's multi-threaded model handles concurrency differently than Redis's single-threaded model; and application code changes, since replacing memcached.get/set with redis.hgetall or redis.lrange is not a drop-in replacement. Mitigation: write to both caches during the transition so they stay in sync, use feature flags to shift read traffic gradually, test thoroughly before cutting over, and plan for 2x operational monitoring during the transition period.

Conclusion

Memcached is simpler and more memory-efficient for pure string caching. Redis is more capable. For basic caching, they are comparable. But Redis’s data structures unlock patterns that would be painful or impossible with Memcached.

I default to Redis for new projects. The operational simplicity of having one system for caching, sessions, pub/sub, and rate limiting usually beats the memory efficiency gains of Memcached.

That said, if you are caching primarily string data and memory is tight, Memcached still earns its place.
