Thomas W. Christopher | Consulting in High Performance Computing. Training in Concurrency, Networking, and Parallelism, especially in Python and Java. |
Abstract
Continuing a tutorial on sorting algorithms, this page animates quick sort.
Author
Thomas W. Christopher
Quick Sort is our first O(N log N) sorting algorithm. That is, it runs in time O(N log N) in the average case. Actually, it can run in time O(N^2) in the worst case. For Quick Sort, the worst case occurs when the data is already sorted.
Quick Sort works as follows: We choose an element of the array to be the partition element. Through a series of exchanges, we partition the elements of the array. We move the elements around so that all elements to the left of the partition element are less than or equal to it and all elements to the right are greater than or equal to it. Now when we sort the left and right sides, the array will be in order. Naturally, we sort the two sides using Quick Sort.
Here is, loosely, how we calculate the running time of Quick Sort: If we choose the median element each time to partition around, then we will be sorting single element subarrays in log N recursive calls. That gives us a factor of log N. The amount of work we must do at each level of recursion is O(N). The time required to partition N elements in each of the implementations below is N. At the first call, there is one array of length N to partition. There are two calls at the second level, each with N/2 elements to partition. There are four calls at level 3, each with N/4 elements to partition.
So we have the running time T = N + 2(N/2) + 4(N/4) + ... + 2^k(N/2^k), where k = log N. Each of the log N terms equals N, so T = N log N.
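This sum is easy to check numerically. Here is a minimal sketch (the class and method names are my own, and N is assumed to be a power of two so that log N is exact):

```java
public class LevelCost {
    // Sums the per-level partition costs 2^level * (N / 2^level) for the ideal
    // case where every pivot is the median; N is assumed to be a power of two.
    static long totalCost(int n) {
        int k = 31 - Integer.numberOfLeadingZeros(n);  // log2(N)
        long total = 0;
        for (int level = 0; level < k; level++) {
            long calls = 1L << level;   // 2^level subarrays at this depth
            long size = n >> level;     // each of N / 2^level elements
            total += calls * size;      // so every level costs exactly N
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println(totalCost(n));    // N * log2(N) = 1024 * 10 = 10240
        System.out.println((long) n * 10);
    }
}
```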
But the partition element is probably not the median, someone says. Well, when we do the calculations precisely and consider all the possible partition elements, we still get an average running time of O(N log N). I will not, however, show the precise calculations here, for fear of boring you. (And me.)
The differences among the examples that follow lie in how they partition the subarrays. I've always found my partitioning code to be bug-ridden, so the order of presentation is dictated by what I thought I could get working most easily.
Quick Sort This first version partitions into three subregions: elements less than the pivot, elements equal to the pivot, and elements greater than the pivot.
Our method quicksort(m,n) is called to sort the elements in array positions [m,n),
that is to say, the semiopen range from m up to but not including n.
If there are two or fewer elements in the region, we sort them trivially. Otherwise we
pick the element at position m as the pivot element and set indices i, j, and k:
i=m
j=m+1
k=m+1
The invariants will be:
all elements in [m,i) are less than pivot
all elements in [i,j) are equal to pivot
all elements in [j,k) are greater than pivot
the elements in [k,n) have not been examined yet.
Partitioning works by repeatedly examining the next element at position k and with zero, one, or two exchanges moving it to the end of the region it belongs in. Here is the code:
void quicksort(int m, int n) {
    int i, j, k, pivot;
    if (n <= m+1) return;
    if (n-m == 2) {
        if (get(n-1) < get(m)) {
            exchange(m, n-1);
        }
        return;
    }
    i = m;
    j = i+1;
    k = j;
    pivot = get(i);
    setColor(i, Color.red);
    while (k < n) {
        int v = get(k);
        if (v < pivot) {
            exchange(k, j);
            exchange(i, j);
            i++; j++;
        } else if (v == pivot) {
            exchange(k, j);
            j++;
        }
        k++;
    }
    if (i-m < n-j) {
        quicksort(m, i);
        quicksort(j, n);
    } else {
        quicksort(j, n);
        quicksort(m, i);
    }
}
We recursively sort the smaller side first. Actually, the sorting of the larger side should not be a recursive call; we should remove the tail call by assigning a new bound to m or n and jumping back to the top of the quicksort method.
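The tail-call removal just described can be sketched as follows. This is a hypothetical adaptation of version 1 working on a plain int[]: swap() stands in for the applet's exchange(), and the animation calls are omitted.

```java
public class QuickSort1 {
    // Version 1 with the larger side handled by the enclosing loop instead of
    // a second recursive call: after partitioning, we recurse on the smaller
    // side and reassign m or n to continue on the larger side.
    static void quicksort(int[] a, int m, int n) {
        while (n - m > 2) {
            int i = m, j = m + 1, k = m + 1;
            int pivot = a[m];
            while (k < n) {             // three-way partition, as in the text
                int v = a[k];
                if (v < pivot) {
                    swap(a, k, j);
                    swap(a, i, j);
                    i++; j++;
                } else if (v == pivot) {
                    swap(a, k, j);
                    j++;
                }
                k++;
            }
            if (i - m < n - j) {
                quicksort(a, m, i);     // recurse on the smaller side...
                m = j;                  // ...and loop on the larger side
            } else {
                quicksort(a, j, n);
                n = i;
            }
        }
        if (n - m == 2 && a[n - 1] < a[m]) swap(a, m, n - 1);
    }

    static void swap(int[] a, int x, int y) { int t = a[x]; a[x] = a[y]; a[y] = t; }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 3, 1, 9, 2, 7, 3, 6};
        quicksort(a, 0, a.length);
        System.out.println(java.util.Arrays.toString(a));
        // prints [1, 2, 3, 3, 3, 5, 6, 7, 8, 9]
    }
}
```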
The problem with this method is that it is inefficient: each element less than the pivot costs two exchanges.
Quick Sort 2 What we would really like is to move elements only when they have to be moved. In our second quicksort implementation, we increment index i from m and decrement index j from n. If the element at position j is less than or equal to the pivot, we move it down. If the element at position i is greater than or equal to the pivot, we move it up. When i and j meet, we are done. However, the devil is in the details. The problem is knowing when to stop moving j downwards or i upwards, and knowing the exact state of the array when i and j meet.
Method quicksort(m,n) is called to sort the elements in the range [m,n). If n<m+2 we do a trivial sort of the 0, 1, or 2 elements. Otherwise, we set i=m and j=n. We choose the element at position m to be the pivot. The invariants are:
all elements in [m,i] are less than or equal to the pivot
all elements in [j,n) are greater than or equal to the pivot
the elements in (i,j) have not been examined yet.
We execute an outer loop while i<j. There are two inner loops.
First, we repeatedly subtract one from j until we find an element less than or equal to the pivot. Since the pivot is at the bottom of the range, we cannot fall off the bottom of the array. The pivot itself will stop j if nothing else does.
If i=j, we break out of the outer loop. Otherwise we exchange the element at position j and the pivot element at position i.
Second, we repeatedly increment i until we find an element greater than or equal to the pivot. Since the pivot is now at position j, we cannot fall off the top of the array. The pivot itself will stop i if nothing else does.
We now exchange the elements at positions i and j. If i=j, no harm is done. Now the pivot is again at the bottom of the range [i,j]. If i<j, we continue the outer loop.
When we terminate the outer loop, i=j and the pivot is in position i.
Here is the code for the partitioning part of the method:
i = m;
j = n;
pivot = get(i);
while (i < j) {
    j--;
    while (pivot < get(j)) j--;
    if (j <= i) break;
    exchange(i, j);
    i++;
    while (pivot > get(i)) i++;
    exchange(i, j);
}
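The listing above shows only the partitioning. For completeness, here is a hypothetical self-contained version on a plain int[], adding the trivial small cases and the recursive calls; swap() stands in for the applet's exchange(), and the animation calls are omitted.

```java
public class QuickSort2 {
    // Version 2 in full: the partition loop from the text, plus the trivial
    // cases and the recursion. The pivot ends at position i (== j), so we
    // recurse on [m,i) and [i+1,n) around it.
    static void quicksort(int[] a, int m, int n) {
        if (n <= m + 1) return;
        if (n - m == 2) {
            if (a[n - 1] < a[m]) swap(a, m, n - 1);
            return;
        }
        int i = m, j = n;
        int pivot = a[i];
        while (i < j) {
            j--;
            while (pivot < a[j]) j--;   // pivot at i stops j from falling off
            if (j <= i) break;
            swap(a, i, j);              // pivot moves up to position j
            i++;
            while (pivot > a[i]) i++;   // pivot at j stops i from falling off
            swap(a, i, j);              // pivot moves back down to position i
        }
        quicksort(a, m, i);
        quicksort(a, i + 1, n);
    }

    static void swap(int[] a, int x, int y) { int t = a[x]; a[x] = a[y]; a[y] = t; }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 3, 1, 9, 2, 7, 3, 6};
        quicksort(a, 0, a.length);
        System.out.println(java.util.Arrays.toString(a));
        // prints [1, 2, 3, 3, 3, 5, 6, 7, 8, 9]
    }
}
```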
Quick Sort 3 The third attempt tries to cut down further on the number of exchanges by removing the pivot element from the range [i,j].
Method quicksort(m,n) is called to sort the elements in range [m,n), that is, from position m up to but not including position n. As with the other versions, we sort 0, 1, or 2 elements specially, so for the following algorithm we can assume there are three or more elements in the subarray.
We sort the three elements at positions m+1, m, and n-1 so that the value at position m is the median of the three; at m+1, the smallest; and at n-1, the largest.
We set i=m+1 and j=n-1. As before we have an outer loop with two inner loops. The outer loop iterates while i<j.
One inner loop decrements j while the element at position j is greater than the pivot. The other loop increments i while the element at position i is less than the pivot. If i and j meet, we exit the outer loop. Otherwise, we exchange the values at positions i and j and continue.
We will not fall off the end of the array because the values at positions m+1 and n-1 will stop j and i respectively.
Once i meets j, we exchange the pivot from position m with the element at position j. This is legitimate because index j was stopped by an element less than or equal to the pivot.
Here's the code:
void quicksort(int m, int n) {
    int i, j, pivot;
    if (n <= m+1) return;
    if (n-m == 2) {
        if (get(n-1) < get(m)) {
            exchange(m, n-1);
        }
        return;
    }
    i = m+1;
    j = n-1;
    if (get(i) > get(m)) {
        exchange(i, m);
    }
    if (get(j) < get(m)) {
        exchange(m, j);
    }
    if (get(i) > get(m)) {
        exchange(i, m);
    }
    pivot = get(m);
    setColor(m, Color.red);
    while (true) {
        j--;
        while (pivot < get(j)) j--;
        i++;
        while (pivot > get(i)) i++;
        if (j <= i) break;
        exchange(i, j);
    }
    exchange(m, j);
    if (j-m < n-j) {
        quicksort(m, j);
        quicksort(j+1, n);
    } else {
        quicksort(j+1, n);
        quicksort(m, j);
    }
}
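The three compare-and-exchange steps at the top of this method can be checked in isolation. Here is a small sketch on a plain int[], with swap() standing in for exchange():

```java
public class MedianOfThree {
    // The three comparison-and-exchange steps from the listing above, applied
    // to positions m+1, m, and n-1 of a plain int[]: afterwards the smallest
    // of the three values is at m+1, the median (the pivot) at m, and the
    // largest at n-1.
    static void medianOfThree(int[] a, int m, int n) {
        int i = m + 1, j = n - 1;
        if (a[i] > a[m]) swap(a, i, m);
        if (a[j] < a[m]) swap(a, m, j);
        if (a[i] > a[m]) swap(a, i, m);
    }

    static void swap(int[] a, int x, int y) { int t = a[x]; a[x] = a[y]; a[y] = t; }

    public static void main(String[] args) {
        int[] a = {9, 4, 7, 7, 2};   // candidates: a[0]=9, a[1]=4, a[4]=2
        medianOfThree(a, 0, a.length);
        System.out.println(a[1] + " " + a[0] + " " + a[4]); // prints 2 4 9
    }
}
```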
Parallel Quick Sort 2 Naturally, quicksort is easy to parallelize. After the partition, each side can be sorted in parallel. Here is the second version of quick sort modified to fork off a thread for each side of the partition.
You might expect we could run the algorithm in log N time, since N processors should be able to share the work evenly, the algorithm is N log N, and (N log N)/N is log N. However, even if the loads broke down evenly and we got something like N threads working at the same time, we still could not run this sorting algorithm in log N time. The partition step takes N time and cannot be run in parallel. The first partition itself has already cost us O(N) before any parallel threads are forked off.
Notice that the two sides of the partition are not usually equal, which means that the two threads do not have balanced loads. Even the part that can be run in parallel loses some efficiency here.
For that matter, you are probably watching this on a sequential machine, giving you only a semblance of parallelism; however, if you are running on a genuine multiprocessor with JDK1.2, you may actually be observing parallelism in action.
When we say threads are forked off to handle each side of the partitioned array, that is not precise. What happens is that a Runnable is created for one side of the array and the thread that did the partitioning recursively handles the other side. The Runnable is placed in a RunQueue to be processed by an available thread. (There are four threads available for running these sorts.) Subarrays of less than a certain size (16 elements) are sorted by insertion sort, as is typically done in quicksort. These parallel sorts use a TerminationGroup to signal their completion.
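The RunQueue and TerminationGroup classes themselves are not shown on this page, so here is a rough modern analogue rather than the original code: the same structure (partition, hand one side off for another thread, recurse on the other, fall back to insertion sort below 16 elements) sketched with java.util.concurrent's fork/join framework, which postdates JDK1.2. All names in this sketch are my own.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ParallelQuickSort {
    static final int CUTOFF = 16;   // small subarrays go to insertion sort

    static class SortTask extends RecursiveAction {
        final int[] a; final int m, n;
        SortTask(int[] a, int m, int n) { this.a = a; this.m = m; this.n = n; }
        protected void compute() {
            if (n - m < CUTOFF) { insertionSort(a, m, n); return; }
            int p = partition(a, m, n);          // version-2 partition; pivot ends at p
            SortTask left = new SortTask(a, m, p);
            left.fork();                         // one side goes to the pool...
            new SortTask(a, p + 1, n).compute(); // ...this thread recurses on the other
            left.join();
        }
    }

    // The version-2 partition on a plain int[]; returns the pivot's final position.
    static int partition(int[] a, int m, int n) {
        int i = m, j = n, pivot = a[i];
        while (i < j) {
            j--;
            while (pivot < a[j]) j--;
            if (j <= i) break;
            swap(a, i, j);
            i++;
            while (pivot > a[i]) i++;
            swap(a, i, j);
        }
        return i;
    }

    static void insertionSort(int[] a, int m, int n) {
        for (int k = m + 1; k < n; k++) {
            int v = a[k], i = k - 1;
            while (i >= m && a[i] > v) { a[i + 1] = a[i]; i--; }
            a[i + 1] = v;
        }
    }

    static void swap(int[] a, int x, int y) { int t = a[x]; a[x] = a[y]; a[y] = t; }

    public static void main(String[] args) {
        int[] a = new int[1000];
        for (int k = 0; k < a.length; k++) a[k] = (a.length - k) * 37 % 1000;
        new ForkJoinPool().invoke(new SortTask(a, 0, a.length));
        System.out.println(a[0] + " " + a[a.length - 1]);
    }
}
```

Note that, as in the text, only one new task is created per partition; the partitioning thread keeps one side for itself rather than forking both.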