Recursive SQL Expression Visually Explained 

Recursive SQL expressions allow you to reference previous results in a hierarchical data set in order to extract information. Here’s how it works.  

Written by Denis Lukichev
Published on Sep. 14, 2022
Image: Shutterstock / Built In
Image: Shutterstock / Built In
Brand Studio Logo

Recursion in SQL? But why? Oh, there are many uses for that. It’s common to store hierarchical data in SQL and recursive queries are a convenient way to extract information from such graphs. Some common examples of hierarchical data you might encounter include organizational structure, application menu structure, a set of tasks with sub-tasks in the project, links between web pages and a breakdown of an equipment module into parts and sub-parts. 

What Is a Recursive SQL Common Table Expression (CTE)?

A recursive SQL common table expression (CTE) is a query that continuously references a previous result until it returns an empty result. It’s best used as a convenient way to extract information from hierarchical data. It’s achieved using a CTE, which in SQL is known as a “with” statement. This allows you to name the result and reference it within other queries later.    

The post will not go into the details of those many use cases, rather it will examine two hypothetical examples designed to help you understand the concept — the simplest possible case of recursion in numbers and querying data from the family tree.

 

Recursive SQL: What Does It Mean and How Does It Work?

Let's think about queries as a function in the sense that a function takes an input and produces an output. Queries operate on relations, or one could say tables. We will denote those as Rn. A query takes three relations — R1, R2 and R3 — and produces an output R. Simple enough.

recursive sql image representing query between R1, R2 and R3
A visual representation of how a recursive query works in SQL. | Image: Denis Lukichev

SQL example: SELECT FROM R1, R2, R3 WHERE.

A query can take something and produce nothing:

recursive sql visual query taking something and producing nothing

A visual representation of a query taking something and producing nothing. | Image: Denis Lukichev

SQL example: SELECT FROM R1 WHERE 1 = 2

It can take nothing and produce something:

recursive sql query visual taking nothing and producing something
A visual representation of a query taking nothing and producing something. | Image: Denis Lukichev

SQL example: SELECT 1.

Or it can take nothing and produce nothing.

recursive sql query take nothing and produce nothing
A visual representation of a query taking nothing and producing nothing. | Image: Denis Lukichev

SELECT 1 WHERE 1 = 2

More on DataUnderstanding Boxplots

 

How to Write a Recursive Common Table Expression (CTE) in SQL 

Recursion is achieved using a WITH statement, which in SQL jargon is called a common table expression (CTE). It allows you to name the result and reference it within other queries later.

recursive sql code
Naming the result and referencing it within other queries

Here is a sample.

WITH R AS (SELECT 1 AS n)
SELECT n + 1 FROM R

Query (SELECT 1 AS n) now has the name R. We refer to that name in SELECT n + 1 FROM RR is a single row, single column table containing number one. The result of the whole expression is number two.

The recursive version of the WITH statement references itself while computing the output.

recursive sql statement
Using the recursive common table expression. | Screenshot: Denis Lukichev

For the recursion to work, we need to start with something and decide when the recursion should stop. To achieve this, a recursive with statement typically has the following form:

recursive sql code statement
The recursive with statement in SQL. | Image: Denis Lukichev

It’s important to note that the base query doesn’t involve R, however, the recursive query does reference R. At first glance, it seems like an infinite loop. To compute R, we need to compute R. But there’s a catch. R doesn’t reference itself, it just references the previous result. And when the previous result is an empty table, the recursion stops. It might help to think of it as an iteration rather than a recursion.

Here’s what happens. The base query executes first, taking whatever it needs to compute the result R0. The second recursive query is executed taking R0 as input — that is R references R0 in the recursive query when first executed. The recursive query produces the result R1, and that is what R will reference at the next invocation, and so on until a recursive query returns an empty result. At that point, all intermediate results are combined together.

recursive sql algorithm flow chart illustration
Recursive query execution algorithm flow chart. | Image: Denis Lukichev
recursive sql execution sequence illustration
Recursive query execution sequence. | Image: Denis Lukichev

 

Recursive SQL Examples

If that’s too abstract, let's look at a couple concrete examples.

 

Example 1: Count Up Until Three

The first example we’ll explore is count until three.

recursive sql example count until three with statement
Running a recursive with statement to execute count until three command. | Screenshot: Denis Lukichev

The base query returns number 1, the recursive query takes it under the countUp name and produces number 2, which is the input for the next recursive call. When the recursive query returns an empty table n >= 3, the results from the calls are stacked together.

Recursive sql count until three illustrated results
Illustration of the results from the call stacked together. | Image: Denis Lukichev

Counting up like that, however, can only go that far. There is a limit for recursion. It defaults to 100, but it could be extended with the MAXRECURSION option (Microsoft SQL server specific). In practice, however, it could be a bad idea to crank the recursion limit up. Graphs might have cycles and limited recursion depth can be a good defense mechanism to prevent a poorly behaving query: OPTION (MAXRECURSION 200).

More on SQLA Guide to Resolving Data Divergence in SQL

 

Example 2: Finding Ancestors

Let’s look at another example, finding a person’s ancestors.

recursive sql ancestral data table
Using recursion to find the ancestors of a person. | Image: Denis Lukichev
recursive sql with statement to find ancestors
Recursive common table expressoin to find the ancestors of a person. | Screenshot: Denis Lukichev

The base query finds Frank’s parent, Mary, and then the recursive query takes this result under the Ancestor name and finds Mary’s parents, who are Dave and Eve. This process continues until we can’t find any more parents.

recursive sql illustration of the recursion algorithm results
Illustration of the results from the recursion to find the ancestors of a person. | Image: Denis Lukichev
Recursive sql ancestors data table with birth years
A data table that includes the birth year to find the parents of a person. | Image: Denis Lukichev

This tree traversal query could be the basis from which you augment the query with some other information of interest. For example, having a birth year in the table, we can calculate how old the parent was when the child was born. Our next query can do exactly that, along with showing lineages. To do that, it traverses the tree from top to bottom. parentAge is zero in the first row because we don’t know when Alice was born from the data we have.

recursive sql with statement to find ancestors with birth year
Running a recursion to find the birth year of a person and their ancestors. | Screenshot: Denis Lukichev
recursive SQL lineage table with birth year
Table representing the results from the recursion to find the birth year and ancestors of a person. | Image: Denis Lukichev

The takeaway is that the recursive query references the result of our base query or a previous invocation of the recursive query. The chain stops when the recursive query returns an empty table.

Explore Job Matches.