Heap Sort, Explained

Heap sort is a sorting algorithm that sorts array elements using a heap data structure, which is similar to a binary tree data structure. The heap sort algorithm is essential to the data preparation process and can be used to implement priority queues, which sort items in a data structure based on assigned priority.

Understanding heap sort provides numerous benefits in domains such as data compression, path planning, robotics and automation.

What Is Heap Sort?

Heap sort is a sorting algorithm that organizes elements in an array using a binary heap. In a max heap, the algorithm repeatedly swaps the largest element (root node) with the last node from the heap, and re-inserts the new root into the array into ascending order.

This article will unpack the definition of the heap sort algorithm, including all its operations and how to implement it in Python.

More on Data ScienceQuicksort Algorithm: An Overview

Binary Trees and Heaps: What Are They?

Heaps are a variation of binary trees with additional structural constraints that make the utilization of heaps different from that of a typical binary tree, notably in heaps arranged in a particular manner and pattern. We will explore this and more later in the article. For now, we can understand heaps as binary trees with some special rules on top of them, which are:

Complete binary trees: A binary tree in which all levels are completely filled except the last one. And if the last level is filled partially, it should be filled from left to right.
Max or min: A heap should be either a max or a min heap. In a max heap, every parent, including the root, should be greater than or equal to its child nodes, so the largest value becomes the root. In a min heap, each parent node is less than or equal to its child nodes, so the smallest value of a node is at the root of the tree.

In theory, we could use both a max or a min heap to implement heap sort, but for demonstration purposes, we’ll only implement it using max heap. You could always play around with the code and try implementing the algorithm with a min heap as well.

More on Data ScienceHow to Implement Binary Search in Python

How the Heap Sort Algorithm Works

A max heap provides an efficient way to retrieve the maximum elements from an array. The root of a max heap is always the largest element since all child nodes must have smaller values. This means retrieving the maximum element is as simple as popping the root node off the top of the heap. This process can be repeated until all elements have been removed, resulting in a sorted list.

The heap sort algorithm works in six steps. Here’s what you need to know.

6 Steps of a Heap Sort Algorithm

Transform the array into a binary tree by inserting each element as a node in a breadth-first manner.
Convert the binary tree into a max heap, ensuring that all parent nodes are greater than or equal to their child nodes.
Swap the root node — the largest element — with the last element in the heap.
Call the heapify() function to restore the max heap property.
Repeat steps 3 and 4 until the heap is sorted, and exclude the last element from the heap on each iteration.
After each swap and heapify() call, ensure that the max heap property is satisfied.

1. Transform the Array Into a Binary Tree

Suppose we have the following unsorted array of numbers to start with, and we want to sort them using heap sort:

[12, 11, 31, 3, 5, 7, 9]

To transform the array into a binary tree, we can start by inserting the first element of the array as a node into the tree and continuing this process in a breadth-first manner until all elements of the array have been added. This will result in a binary tree with all of the elements of the array, as shown below:

An array sorted into a tree structure illustration — An array sorted into a tree structure . | Image: Richmond Alake

2. Convert the Binary Tree Into a Max Heap

After inserting all of the elements of the array into a binary tree, the next step is to convert this binary tree into a max heap. In a max heap, all parent nodes must have values that are greater than or equal to the values of their children. This will ensure that the highest value is always at the root of the tree. For the tree above, this means swapping node 12 and node 31 positions in the tree to satisfy the requirements for a max-heap tree.

Tree converted into Max-Heap Tree — Tree converted into max-heap tree . | Image: Richmond Alake

max heap = [31, 11, 12, 3, 5, 7, 9]

To verify that the above max heap satisfies all of the necessary properties, you can double-check that it is a complete binary tree and that there are no parent nodes with values smaller than their child nodes. If these conditions are met, the max heap is ready to be used in the sorting algorithm.

3. Swap the Root Node With the Last Element in the Heap

The next step in the sorting process is to swap the root node, which contains the largest element, with the last node in a heap. In other words, you’re swapping the element in the first position of the max-heap array with the element in the last position of the max-heap array. Whatever method you choose to understand this step, 31 ends at the end of the array, and nine ends at the first position of the array. This process is depicted in the image below.

Swapping values at last and first position . — Swapping values at last and first position. | Image: Richmond Alake

You might have noticed that the last element in the array, holding the value 31, is highlighted in red. This is because, in the next step, and in future iterations, we will be omitting the last value because it’s in a sorted position. Therefore, we move forward with the following array into the next step:

[9, 11, 12, 3, 4, 7]

Now, we will transform the array into a tree, then the tree into a max heap. This step is depicted in the image below:

Binary Tree to Max Heap — Binary tree to max heap. | Image: Richmond Alake

After swapping the root node with the last element in the heap, the visualized heap will now have one less element. This is because the largest element has been placed at the end of the array and will be excluded from future iterations. If it were not excluded, the sorting algorithm would never finish. Therefore, every time the root node is swapped, the next iteration should exclude the last element in the heap. In the scenario depicted above, we observe one less node in both the array’s binary tree and max heap representation.

4. Call the heapify() Function

Let’s now refer to the process of converting the tree or array into a max heap as heapify. This will help with naming the function in this article’s implementation section.

It’s also essential to observe that the tree structure may no longer satisfy the requirements of a max heap after the root node has been swapped in the example given. The heapify() function should be called again to restore the max heap property. This will result in the heap being rearranged as shown:

[12, 11, 9, 3, 5, 7]

And again, we swap the values in the first and last position of the max-heap array representation.

More on HeapifyHow to Heapify a Heap Tree in C++

5. Repeat Steps 3 and 4 Until the Heap Is Sorted

You will observe that ‘12’ has now been highlighted red, which implies it has been omitted from future steps. Another observation is that the elements in red are sorted in ascending order. Once you complete the steps, all elements should be highlighted in red, indicating omission from the steps and also sorted in ascending order.

The image below depicts the complete iterations of the sorting process and steps.

Heap Sort Step . | Image: Richmond Alake

Heap Sort Step continued — Heap sort step. | Image: Richmond Alake

6. Ensure the Max Heap Property Is Satisfied

The final sorted array:

[3, 5, 7, 9, 11, 12, 31]

Now that you thoroughly understand the heap sort algorithm, we can implement it in Python. There are a few different ways to do this, but the basic steps will be the same as those outlined in the previous explanations — transform the array into a binary tree, convert the binary tree into a max heap, swap the root node with the last element in the heap and repeat the process until the heap is sorted.

More on Data ScienceBubble Sort Time Complexity and Algorithm Explained

A tutorial on how to implement heap sort in Python. | Video: CodeSavant

How to Implement Heap Sort in Python

This section explains how to implement the heap sort algorithm in Python. It requires some familiarity with Python, mainly around topics such as for loops, recursion, function definitions and if statements. I have attempted to explain each significant portion of the code in detail, but I recommend taking some time and re-implementing the provided code yourself.

1. Create a Max Heap Array

To implement the heap sort algorithm, we first initialize an unsorted array assigned to a variable named arr.

# Initialise an array
arr = [12, 11, 31, 3, 5, 7, 9]

We define a function called heap sort() that takes a list as an argument when called. This function is where we pass the unsorted array into, expecting it to return a sorted array after its internal operations are completed.

def heap sort(arr):

Within the heap sort array, we store the array's length as a variable to be used in upcoming for loop operations.

The first for loop implemented will iterate through the array, creating a max heap from the input array. This loop will iterate through the array, starting at the last element and traversing through the array in reverse or backward until it reaches the final element. Essentially, the array is being traversed in a depth-first manner.

As we traverse through the array in the for loop, the heapify function is called. The heapify function is responsible for creating max heaps. The heapify function will be defined and explained in detail in later steps. For now, all we need to know is the three arguments passed into the heapify function:

arr: This is the entire array that’s required to be converted to a max heap.
arr_len: A reference to the length of the array
i: Refers to the index of the tree’s root

The heap sort function is updated as shown below:

def heap sort(arr):
  # Assign the length of the array to a variable
  arr_len = len(arr)
 
  # Loop through the length of the array to ensure a depth wise...
  for i in range(arr_len, -1, -1):
      # Create maxheaps from the array
      heapify(arr, arr_len, i)

2. Sort the Array

After creating a max heap from the list or array, it is required that we sort the array. A second for loop will be used to perform the sorting operation. But remember that the sorting operation ignores the last element at each step, so the iteration starts at the second to last element, and again, we traverse through the array in reverse by decrementing the iterator by one. This consideration will be accommodated in the implementation of the second for loop.

Within the second loop, the swapping operation between the first element of the heap structure and the current element is the swapping process we observed in the intuition section and images earlier.

And just like in the images, we repeat the process again until we’ve traversed through the entirety of the array.

The heap sort function is updated as shown below:

def heap sort(arr):
  # Assign the length of the array to a variable
  arr_len = len(arr)
 
  # Loop through the length of the array to ensure a depth wise...
  for i in range(arr_len, -1, -1):
      # Create maxheaps from the array
      heapify(arr, arr_len, i)
     
  for i in range(arr_len-1, 0, -1):
      # Swapping the first element with the current element
      arr[i], arr[0] = arr[0], arr[i]
      heapify(arr, i, 0)
     
  return arr

3. Implement the Heapify Function

Now let’s implement the heapify function. The heapify function is called twice in the ‘heap sort’ function and can be defined as a helper function, as it will perform the bulk of the algorithm operation, which is creating max heaps.

The heapify function will take in the three parameters — arr, n and i.

def heapify(arr, n, i):

Within the heapify function, a reference to the root of the max heap or the index of the largest value is required. This will be assigned to the variable called largest.

A parent node in a tree contains either right or left child nodes or both. We will require variables referencing the values in the left and right child nodes. These variables will be named l and r. In an array representing a tree structure, to obtain a child node, you multiply the parent index by two and increment the value by one for the left child node and two for the right child node.

Let’s update the implementation of the heapify function to reflect the steps above.

def heapify(arr, n, i):
   # Initialize largest as root
   largest = i
   # left child = 2*i + 1
   l = 2 * i + 1
   # right child = 2*i + 2
   r = 2 * i + 2

4. Check the Parent Node

In the next steps, we must check to see if the parent node has a left or right child node and also conduct the swapping operation if the value in the left or right child nodes is greater than the parent node. Remember, we are building a max heap, where a parent node has to have a value greater than its children. If the parent node isn’t greater, then we update the largest variable to point to the index of the child node.

Let’s update the implementation of the heapify function to reflect the steps above.

def heapify(arr, n, i):
  # Initialize largest as root
  largest = i
  # left child = 2*i + 1
  l = 2 * i + 1
  # right child = 2*i + 2
  r = 2 * i + 2

  # If left child is larger than root
  if l < n and arr[i] < arr[l]:
      largest = l

  # If right child is larger than largest so far
  if r < n and arr[largest] < arr[r]:
      largest = r

5. Confirm That the Largest Variable Updated

We need another final check to see if an update was made to the largest variable holding the index of the sub-tree. If the largest variable was updated within the previous if statements, then it is safe to assume that the parent node was not the largest in the tree, and a swapping operation is required.

We have a reference to the index of the root of the tree, which is i. If largest was updated, the value assigned to i will remain the same. Hence, a non-equality check in an if statement that evaluates to true if both i and largest are not equal will suffice for an update check for the largest variable.

Within this condition, the swapping operation is performed by using the index i and largest.

And due to the fact that the root node was not the largest of the tree, it is required to create a max heap of the array once again. This step is achieved by recursively calling heapify() with the index of the largest element (represented by the variable largest) as the new root.

Let’s update the implementation of the heapify function to reflect the steps above.

def heapify(arr, n, i):
  # Initialize largest as root
  largest = i
  # left child = 2*i + 1
  l = 2 * i + 1
  # right child = 2*i + 2
  r = 2 * i + 2

  # If left child is larger than root
  if l < n and arr[i] < arr[l]:
      largest = l

  # If right child is larger than largest so far
  if r < n and arr[largest] < arr[r]:
      largest = r

  # If largest is not root
  if largest != i:
      # swap with root
      arr[i], arr[largest] = arr[largest], arr[i]
      # Recursively heapify the affected sub-tree
      heapify(arr, n, largest)

6. Print the Result

Finally, to ensure our algorithm works as expected, print the result of the unsorted array passed into the heap sort function.

print(heap sort(arr))

More on Data ScienceHow Do You Use Data Structures and Algorithms in Python?

Understanding Heap Sort

Understanding the heap sort algorithm may take a few passes through both the intuition and implementation sections. The result of understanding the implementation is an appreciation for in-place sorting algorithms and data structures.

Try to understand the implementation and operations within the helper function heapify as this is where most of the key operations occur, in this case, converting the data into a max heap.

Frequently Asked Questions

What is heap sort?

Heap sort is a comparison-based sorting algorithm that sorts data elements using a heap, a binary tree-like data structure.

The algorithm works by converting an array into a binary tree, and then into a max (or min) heap. From here, the root node is swapped with the last node in the heap, and the new last node is removed from the heap and sorted into the array. The heap is then heapified to keep it as a max heap. This is repeated until the elements in the array are re-sorted based on the algorithm.

What is heap sort’s time complexity?

The heap sort algorithm’s best, worst and average time complexities are all the same — O(n*log(n)). The time it takes to sort the array increases logarithmically with the size of the array. However, some optimized versions of the algorithm can provide a best-case time complexity of O(n) by checking if the array is already sorted and, in that case, making the algorithm run in linear time.

Why is heap sort called an ‘in-place’ algorithm?

Heap sort is called an in-place algorithm because it does not require extra memory space to sort. It uses the same array for both the elements’ storage and the sorting process. This is done by rearranging the elements of the array in place to satisfy the max-heap property, which is used to sort the array.

What are some use cases for heap sort?

Heap sort is widely used in many practical scenarios. Some of the most common use cases of heap sort include data compression techniques, Dijkstra’s algorithm for finding the shortest path in a graph and finding extreme values such as the largest or smallest elements in a large dataset.

When is heap sort used?

Heap sort can be leveraged to quickly find the largest or smaller element in a list or large dataset. Heap sort performs well when large data sets are required to be sorted efficiently without the cost of expending additional memory.

Why use heap instead of sorting?

Heaps can take less time complexity than a sorted array to build, and can help sort an array faster than manual sorting (especially in large datasets).