How caching improves system performance
· Category: System Design
Short answer
Caching stores copies of frequently accessed data in fast storage, reducing latency and backend load.
Steps
- Identify read-heavy data with low update frequency as caching candidates.
- Choose a caching pattern: cache-aside, read-through, write-through, or write-behind.
- Set TTL values that balance freshness with hit rates.
- Implement cache invalidation on data updates to prevent stale responses.
- Monitor hit rates, eviction rates, and memory usage.
Tips
- Use Least Recently Used eviction when memory is constrained.
- Apply consistent hashing to distribute cache keys across a cluster.
- Warm caches after restarts to avoid cold-start latency spikes.
- Compress large cached objects to maximize memory efficiency.
Common issues
- Cache stampede when many requests simultaneously miss and hit the backend.
- Stale data served due to missing or delayed invalidation.
- Thundering herd on expiration without probabilistic early refresh.
- Memory exhaustion from unbounded cache growth.
Example
# Consistent hashing for service discovery
import hashlib
def get_node(key, nodes):
hash_val = int(hashlib.md5(key.encode()).hexdigest(), 16)
return nodes[hash_val % len(nodes)]
node = get_node('user-123', ['node-a', 'node-b', 'node-c'])
This snippet implements consistent hashing to distribute keys across nodes, a foundational technique in scalable distributed systems.