We quickly reached the third post of this series about sorting algorithms, today I’m going to show you Heapsort. This algorithm was invented in the early 60s by a man called J. W. J. Williams which was a Welsh Engineer that became famous for inventing this algorithm and the heap data structure.
For this post, I need to teach you a little bit of theory about the heap data structure and then I can proceed with the sorting algorithm itself. Let’s dive into it.
Heap data structure
A heap is representable as a binary tree (in which every node has at most 2 children). Let’s take this array for example:
Its representation as a binary tree is the following:
So what does it mean to ‘build a heap’?
It means to transform the starting array so that its representation as a binary tree satisfies a property. Before telling you about this property I’d like to anticipate that there are two different kinds of heap:
- max-heap: every element is ≤ than its parent (we find the biggest value at the root);
- min-heap: every element is ≥ than its parent (we find the lowest value at the root).
Starting from the previous array our goal is to reorder it so that we have a max-heap; here is an example
How to turn an array into a max-heap
The first thing we have to keep in mind is that all the work is done bottom-up (meaning from the leaves to the root). We are going to use a function called Max-Heapify.
What does it mean? We are going to create max-heaps from the bottom and then the process is looped until the full tree is a max-heap.
The goal of the Max-Heapify function is to turn a node that has children which are the roots of a Max-Heap so that at the end we create a new Max-Heap. It’s a little bit confusing, I know. Let’s look at this GIF:
In this example, we start from the second element of the array (which is a 4 originally). This element has two children which are both roots of a max-heap (14 has 2 and 8 has children which are lower than 15; 7 has only 1 has a child which is lower than 7). The goal of the function is to transform the array so that at the end of the process we have a max-heap starting at element #2. Here is also the Python implementation of the Max-Heapify function:
As I mentioned before this function is not able to transform the starting array into a max-heap; we should iterate this function until we reached the root. The function that does this job is called Build-Max-Heap and it's much simpler:
Finally Heapsort!
Okay, we finally reached the sorting algorithm itself. I gotta admit: we passed the hard part, from now on it is veeery easy.
Once we have a max-heap we have the highest value at the root. This element will be the last one in the sorted array; so what are we going to do? We take it and we put it at the end of the array (switching it with the last). So what now? We just need to forget about the last element (reducing the dimension of the array by one) and to the same process again (calling Build-Max-Heap) and what will we get? The second highest value will be at the root! Put it as the last element, reduce the dimension of the array and then repeat.
Here is a final GIF (I got addicted to them):
Aaaaaaand here is the final snippet of code:
As usual, if you want to see the source code you can check my GitHub!
Algorithm Overview
Since this is the third post about sorting algorithms I will not repeat what are the math notations. If you need to refresh your mind you may look at the first post.
Best-case complexity: O(nlogn)
Worst-case complexity: O(nlogn)
Stable: Nope (if there are two elements with the same value their relative order will NOT be kept)
In-place: Yep (there is a maximum number of elements saved outside the starting array)
Conclusions
Heapsort is a nice sorting algorithm. It is fast, in-place (which means we prefer it when we do not want to use additional memory). If we compare it to Merge Sort this latter will win in terms of time, but the memory used will be much more (there is always a tradeoff).