mysql database official logo

Managing Hierarchical Data in MySQL with Recursive CTEs

Handling Nested Data Without the Headache

Storing hierarchical data is a common challenge in database design. Whether you are building a multi-level navigation menu, a comment thread, or an organizational chart, you need a way to represent parent-child relationships. Traditionally, MySQL developers relied on the Adjacency List model, but querying deep trees often required complex self-joins or inefficient application-side processing.

Since MySQL 8.0, Recursive Common Table Expressions (CTEs) have changed the game. They allow you to traverse deep hierarchies with a single, readable query. In this guide, we will look at how to implement and query hierarchical structures efficiently.

The Setup: The Adjacency List

The simplest way to store a tree is the Adjacency List, where each row contains a reference to its parent. Let's create a table for a product category system:

CREATE TABLE categories (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    parent_id INT,
    FOREIGN KEY (parent_id) REFERENCES categories(id)
);

INSERT INTO categories VALUES 
(1, 'Electronics', NULL),
(2, 'Laptops', 1),
(3, 'Gaming Laptops', 2),
(4, 'Workstation Laptops', 2),
(5, 'Smartphones', 1),
(6, 'Android', 5);

Querying the Tree with Recursive CTEs

If you wanted to find the full path for 'Gaming Laptops', you would previously need to know the depth of the tree beforehand. With a Recursive CTE, MySQL handles the recursion for you. A recursive CTE consists of two parts: the anchor member (the starting point) and the recursive member (the logic that finds the next level).

Here is how to generate a full breadcrumb path for every category:

WITH RECURSIVE category_path (id, name, path) AS (
    -- Anchor member: Start with top-level nodes
    SELECT id, name, CAST(name AS CHAR(255))
    FROM categories
    WHERE parent_id IS NULL
    
    UNION ALL
    
    -- Recursive member: Join the CTE back to the table
    SELECT c.id, c.name, CONCAT(cp.path, ' > ', c.name)
    FROM category_path cp
    JOIN categories c ON cp.id = c.parent_id
)
SELECT * FROM category_path ORDER BY path;

Why This Matters

Using Recursive CTEs offers several advantages over older methods like Nested Sets or Materialized Paths:

  • Data Integrity: You only need a simple parent_id column, making inserts and moves very easy.
  • Readability: The logic of the traversal is contained within the SQL query rather than scattered across application code.
  • Performance: MySQL optimizes these recursions, making them significantly faster than multiple individual queries for each level.

Calculating Depth

Sometimes you need to know how deep an item is in the hierarchy (e.g., for styling indentation). You can easily add a level counter to your CTE:

WITH RECURSIVE category_depth AS (
    SELECT id, name, 0 AS level
    FROM categories
    WHERE parent_id IS NULL
    
    UNION ALL
    
    SELECT c.id, c.name, cd.level + 1
    FROM category_depth cd
    JOIN categories c ON cd.id = c.parent_id
)
SELECT * FROM category_depth;

This approach allows you to scale your hierarchy to any depth without changing your database schema or writing complex loops in your backend language. It is a robust solution for any developer working with relational data that isn't strictly flat.