How the circuit breaker pattern prevents failures
· Category: System Design
Short answer
The circuit breaker pattern stops repeated calls to a failing service, giving it time to recover and preventing resource exhaustion in callers.
Steps
- Monitor calls to a dependency for failures and latency.
- Open the circuit when failures exceed a configured threshold.
- Return a fallback response or fail fast while the circuit is open.
- After a timeout, allow a limited number of test requests through.
- Close the circuit if test requests succeed, resuming normal traffic.
Tips
- Combine circuit breakers with retries and bulkheads for defense in depth.
- Log state transitions to analyze failure patterns.
- Choose thresholds based on historical latency distributions.
- Expose circuit breaker metrics in health dashboards.
Common issues
- Thresholds set too high allowing cascading failures.
- Thresholds set too low causing unnecessary service rejection.
- Missing fallback logic leading to user-facing errors.
- Not adapting thresholds as system behavior evolves.
Example
from circuitbreaker import circuit
@circuit(failure_threshold=5, recovery_timeout=60)
def call_payment_gateway():
return requests.post('https://payments.example.com/charge')
This decorator wraps an external call with a circuit breaker that opens after five failures and tries again after sixty seconds.
Additional context
Applying these principles consistently across projects leads to more maintainable systems, clearer team communication, and better outcomes for end users. Regular review and refinement of practices ensure continuous improvement.