How to replicate data across regions
· Category: System Design
Short answer
Cross-region replication provides disaster recovery, low-latency reads, and compliance by copying data to geographically separated locations.
Steps
- Choose synchronous replication for strong consistency at the cost of write latency.
- Choose asynchronous replication for higher performance with temporary inconsistency.
- Deploy read replicas in target regions to serve local traffic.
- Implement failover procedures with automatic promotion of standby replicas.
- Monitor replication lag and alert when it exceeds acceptable thresholds.
Tips
- Use quorum writes to balance consistency and latency in wide-area networks.
- Encrypt replication traffic to protect data in transit between regions.
- Test failover regularly to ensure recovery procedures work.
- Comply with data residency laws by controlling which regions store specific data.
Common issues
- Replication lag causing stale reads in remote regions.
- Split-brain scenarios where two regions accept conflicting writes.
- High network costs for replicating large volumes of data continuously.
- Schema changes breaking replication streams.
Example
# Consistent hashing for service discovery
import hashlib
def get_node(key, nodes):
hash_val = int(hashlib.md5(key.encode()).hexdigest(), 16)
return nodes[hash_val % len(nodes)]
node = get_node('user-123', ['node-a', 'node-b', 'node-c'])
This snippet implements consistent hashing to distribute keys across nodes, a foundational technique in scalable distributed systems.