We finally reached Quicksort, one of my favorite sorting algorithms and the most widely used in the world (until Timsort came along). Quicksort was invented in 1961 by Tony Hoare (remember the Monitor?). Let’s stop wasting time and dive into it.
How Quicksort works
Quicksort is a divide and conquer algorithm. After choosing an element of the array (called the pivot), the array is partitioned in two: the first part contains all the elements that are ≤ the pivot, while the second part contains all the elements that are > the pivot.
After this first reordering, it is clear that the pivot ends up in its correct position. If we apply this recursively to both partitions, we can sort the whole array.
In this algorithm we have two functions:
- Quicksort: the function that recursively sorts the sub-arrays
- Partition: the function that does the heavy lifting by partitioning the array into two sub-arrays with the properties I described before.
Is everything clear? In the next paragraphs, there is a much more detailed explanation (with an illustrative GIF) and the Python implementation.
Quicksort Function
The first function I’m going to show you is Quicksort, and it is pretty straightforward:
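The snippet is embedded as a gist in the original post; in case it doesn’t render here, this is a minimal sketch of what such a function can look like (the names quicksort, partition, first and last follow the description below, not necessarily the original code):

```python
def quicksort(A, first, last):
    # Nothing to do for sub-arrays with fewer than two elements.
    if first < last:
        # q is the index where the pivot lands after partitioning.
        q = partition(A, first, last)
        # The pivot is already in its final position:
        # sort what comes before it and what comes after it.
        quicksort(A, first, q - 1)
        quicksort(A, q + 1, last)
```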
The function takes the index of the first element of the array (first) and the index of the last one (last). After a simple check, the partitioning is done by the partition function, which returns a value (q) identifying the index of the pivot. Then the function is called recursively on both sub-arrays.
Partition Function
The most interesting function is Partition. The first thing to decide is which element will be the pivot. Some implementations use the first element, others the last: the idea stays the same.
In this specific implementation, I’m going to use the last element.
So we have a starting array A; the first element is A[p] while the last one (which will be the pivot) is A[r].
All the elements to analyze are the ones from A[p] to A[r-1]. Those elements are checked against the pivot to decide which partition they belong to. During the execution of the Partition function we can divide the array into 4 areas:
- Elements ≤ the pivot (stored in positions p…i)
- Elements > the pivot (stored in positions i+1…j-1)
- Unchecked elements (stored in positions j…r-1)
- The pivot itself
What are i and j? Those are the indexes we are going to use to divide the starting array into areas during the execution of the function.
Let’s take for example this specific execution of Partition. j is the index of the first unchecked element. The yellow and red areas are, respectively, the area of elements ≤ the pivot and the area of elements > the pivot.
The j-th element of the array is checked against the pivot:
- If the element is > the pivot, we just need to increase j (expand the red area);
- If the element is ≤ the pivot, we must do three things:
- Increase the value of i
- Switch A[i] with A[j]
- Increase the value of j
Here is, as usual, an explanatory GIF:
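In case the animation doesn’t load, here is a concrete snapshot of the same idea. Suppose we are partitioning A = [3, 7, 1, 8, 2, 5] (my own example, not the one in the GIF) with the last element, 5, as the pivot. After j has scanned the first four elements, the array looks like this:

```
A = [3, 1, 7, 8, 2, 5]      (p = 0, r = 5, pivot = A[r] = 5)
        ^        ^
        i = 1    j = 4

<= pivot  : A[p..i]     = [3, 1]
>  pivot  : A[i+1..j-1] = [7, 8]
unchecked : A[j..r-1]   = [2]
pivot     : A[r]        = 5
```

The next unchecked element is A[j] = 2 ≤ 5, so i becomes 2, A[i] = 7 and A[j] = 2 are swapped, and j moves forward. When j has checked every element, the pivot is swapped with A[i+1] and ends up in its final position, right between the two areas.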
Aaaaaand I can also show you the Python implementation I made:
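Again, if the gist doesn’t render, here is a sketch of a Lomuto-style partition consistent with the description above (names and details are my reconstruction, not necessarily the original code):

```python
def partition(A, p, r):
    pivot = A[r]               # the last element is the pivot
    i = p - 1                  # the "<= pivot" area A[p..i] starts out empty
    for j in range(p, r):      # j scans the unchecked elements A[p..r-1]
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]   # move A[j] into the "<= pivot" area
    # Place the pivot right after the "<= pivot" area: its final position.
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1
```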
We just need to set the initial value of i to p-1 and then we are ready to go! The variable j iterates from p to r-1, and at every iteration we perform the steps described earlier.
As usual, if you want to see the source code you can check my GitHub!
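For completeness, this is how the two sketches above would be used together (the array is just an example):

```python
A = [3, 7, 1, 8, 2, 5]
quicksort(A, 0, len(A) - 1)   # sorts the whole array in place
print(A)                      # [1, 2, 3, 5, 7, 8]
```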
Algorithm Overview
Since this is the fourth post about sorting algorithms, I will not repeat what the math notations mean. If you need to refresh your memory, you can take a look at the first post.
Best-case complexity: O(n log n)
Worst-case complexity: O(n²)
Stable: Nope (if there are two elements with the same value, their relative order is not guaranteed to be kept)
In-place: Yep (only a constant number of elements is stored outside the starting array)
Conclusions
In the introduction, I told you that this is my favorite sorting algorithm, but there is still a lot to say about it! As you can see, the worst-case complexity is O(n²), which is quite bad. When does this happen? Is there any way to avoid it? I will show you in the next post!