What is denormalization and when to use it
Category: SQL & Databases
Short answer
Denormalization is the deliberate introduction of redundant data into a normalized schema to reduce joins and improve read performance.
How it works
Instead of storing customer_name only in the customers table, you might also store a copy in the orders table so that order reports can read it without a join. This trades extra storage and more complex writes for faster reads.
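As a minimal sketch of this idea, the following uses Python's built-in sqlite3 module with an in-memory database. The columns customer_id, order_id, and total, and the sample rows, are assumptions made for illustration; only customer_name, customers, and orders come from the text above. It shows that the report can be produced either with a join or from the redundant copy alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customers(customer_id),
    customer_name TEXT NOT NULL,  -- redundant copy (the denormalized column)
    total         REAL NOT NULL
);
-- Hypothetical sample data
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 'Ada', 25.0), (11, 2, 'Grace', 40.0);
""")

# Normalized read: the name is looked up through a join.
normalized = conn.execute("""
    SELECT o.order_id, c.customer_name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

# Denormalized read: the same report from the orders table alone.
denormalized = conn.execute("""
    SELECT order_id, customer_name, total
    FROM orders
    ORDER BY order_id
""").fetchall()

print(normalized == denormalized)
```

Both queries return the same rows only while the copy in orders matches customers, which is why the write path needs a synchronization strategy.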
When to use it
- Read-heavy workloads where join costs are unacceptable.
- Reporting and analytics databases where data changes infrequently.
- Cases where caching or materialized views are not sufficient.
Tips
- Maintain denormalized data with triggers, application logic, or batch ETL processes.
- Document the redundancy and the synchronization strategy clearly.
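One way the trigger option might look is sketched below with Python's sqlite3 module and SQLite trigger syntax. The trigger name sync_customer_name, the key columns, and the sample data are hypothetical; other databases use different trigger syntax, so treat this as an illustration and not a drop-in solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customers(customer_id),
    customer_name TEXT NOT NULL  -- redundant copy kept in sync by the trigger
);

-- Hypothetical trigger: push a renamed customer to every copy in orders.
CREATE TRIGGER sync_customer_name
AFTER UPDATE OF customer_name ON customers
FOR EACH ROW
BEGIN
    UPDATE orders
    SET customer_name = NEW.customer_name
    WHERE customer_id = NEW.customer_id;
END;

-- Hypothetical sample data
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 'Ada'), (11, 1, 'Ada'), (12, 2, 'Grace');
""")

# A single write to the source table now also updates the copies.
conn.execute(
    "UPDATE customers SET customer_name = ? WHERE customer_id = ?",
    ("Ada Lovelace", 1),
)

synced_names = [
    row[0]
    for row in conn.execute(
        "SELECT customer_name FROM orders WHERE customer_id = 1 ORDER BY order_id"
    )
]
print(synced_names)
```

A trigger keeps the copies consistent within the same transaction as the original write, at the cost of making every rename touch more rows. Application logic or batch ETL shifts that cost elsewhere but allows a window in which the copies are stale.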
Common issues
- Updates become more complex because multiple copies must stay synchronized.
- Inconsistent denormalized data causes misleading reports if synchronization fails.
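A periodic consistency check can surface this kind of drift before it reaches a report. The sketch below, again using Python's sqlite3 module, assumes the same illustrative customers and orders tables; the helper find_stale_orders is a hypothetical name, and the example simulates a failed synchronization by renaming a customer without updating the copy.

```python
import sqlite3


def find_stale_orders(conn):
    """Return (order_id, copied_name, current_name) for rows that have drifted."""
    return conn.execute("""
        SELECT o.order_id, o.customer_name, c.customer_name
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        WHERE o.customer_name <> c.customer_name
        ORDER BY o.order_id
    """).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customers(customer_id),
    customer_name TEXT NOT NULL  -- redundant copy with no automatic sync here
);
-- Hypothetical sample data
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 'Ada'), (11, 2, 'Grace');
""")

# Simulate a failed synchronization: the source changes, the copy does not.
conn.execute("UPDATE customers SET customer_name = 'Ada Lovelace' WHERE customer_id = 1")

stale = find_stale_orders(conn)
print(stale)
```

The check itself needs the join that denormalization was meant to avoid, so it is better suited to a scheduled job than to the read path.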