· SQL Guroo · SQL Tutorial  · 2 min read

Data Normalization and Denormalization in SQL - When and Why?

Learn about database normalization and denormalization with simple examples.

Learn about database normalization and denormalization with simple examples.

Introduction

Normalization organizes data to reduce redundancy, while denormalization adds redundancy for performance. Think of normalization as organizing items into specific drawers, and denormalization as keeping copies where you frequently need them.

Why It Matters?

Understanding normalization helps:

  • Reduce data redundancy
  • Maintain data integrity
  • Optimize storage
  • Improve or speed up queries

Main Content: Understanding Normalization

Let’s look at examples of normalized and denormalized structures:

Example 1: Unnormalized Data

-- Unnormalized table (bad design)
CREATE TABLE orders_unnormalized (
    order_id INT,
    customer_name VARCHAR(50),
    customer_email VARCHAR(50),
    customer_address VARCHAR(100),
    product_name VARCHAR(50),
    product_category VARCHAR(50),
    product_price DECIMAL(10,2),
    order_date DATE,
    quantity INT
);

-- Sample data
INSERT INTO orders_unnormalized VALUES
(1, 'Alice Smith', 'alice@email.com', '123 Main St', 'Laptop', 'Electronics', 1200.00, '2024-01-15', 1),
(2, 'Alice Smith', 'alice@email.com', '123 Main St', 'Mouse', 'Electronics', 25.00, '2024-01-15', 2),
(3, 'Bob Jones', 'bob@email.com', '456 Oak St', 'Laptop', 'Electronics', 1200.00, '2024-01-16', 1);

Example 2: Normalized Structure

-- Normalized tables (good design)
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(50),
    customer_email VARCHAR(50),
    customer_address VARCHAR(100)
);

CREATE TABLE products (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(50),
    product_category VARCHAR(50),
    product_price DECIMAL(10,2)
);

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

CREATE TABLE order_items (
    order_id INT,
    product_id INT,
    quantity INT,
    FOREIGN KEY (order_id) REFERENCES orders(order_id),
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);

Example 3: Querying Normalized vs Denormalized

Normalized Query (multiple joins):

SELECT 
    o.order_id,
    c.customer_name,
    p.product_name,
    oi.quantity,
    p.product_price,
    (oi.quantity * p.product_price) as total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id;

Denormalized Query (simpler but redundant data):

SELECT 
    order_id,
    customer_name,
    product_name,
    quantity,
    product_price,
    (quantity * product_price) as total_amount
FROM orders_unnormalized;

Benefits and Tradeoffs

Normalization Benefits:

  1. Data Integrity

    • No redundant data
    • Easier updates
    • Consistent information
    • Reduced anomalies
  2. Storage Efficiency

    • Less duplicate data
    • Optimized storage
    • Better data organization
    • Easier maintenance
  3. Update Efficiency

    • Single point of update
    • No update anomalies
    • Consistent changes
    • Better data quality

Denormalization Benefits:

  1. Query Performance

    • Fewer joins needed
    • Faster read operations
    • Simpler queries
    • Better response time
  2. Reporting Efficiency

    • Direct access to data
    • Simpler report queries
    • Faster aggregations
    • Better user experience

Best Practices

  1. When to Normalize

    • OLTP systems
    • Data entry applications
    • When data consistency is critical
    • When storage is a concern
  2. When to Denormalize

    • Data warehouses
    • Reporting systems
    • Read-heavy applications
    • When performance is critical
  3. Design Guidelines

    • Start with normalized design
    • Denormalize strategically
    • Document design decisions
    • Consider data volume
  4. Common Approaches

    • Hybrid designs
    • Materialized views
    • Cached denormalized data
    • Regular synchronization
  5. Things to Remember

    • Balance needs carefully
    • Consider maintenance costs
    • Plan for growth
    • Monitor performance

Remember:

  • Normalize for consistency
  • Denormalize for performance
  • Balance based on needs
  • Document your choices

Common Mistakes to Avoid:

  1. Over-normalization
  2. Unnecessary denormalization
  3. Ignoring maintenance costs
  4. Not planning for scale

Happy database designing! 🚀

Back to Blog

Related Posts

View All Posts »