CLRS, Part II: A Quick Glance

The second part of the book, Sorting and Order Statistics, is mainly about processing and (surprise!) sorting an array of arbitrarily distributed data.It covers chapters 6 to 9, and goes from page 121 to 195.

Chapter 6: Heapsort

Chapter 6 introduces another sort algorithm, Heapsort. This algorithm uses the data structure heap to store and sort data.

Heaps

A (binary) heap is an array that can be viewed as a nearly complete binary tree. To better understand how the heap is build from a given sequence of numbers, you can read section 6.2 from the textbook. However, to make the long story short, a heap is a binary tree which is built with a single constraint: the heap property. The heap property states that if “B” is a child node of “A”, therefore key[A] >= key[B]. So, to ensure that the heap property is maintained throughout our exploring it’s features, we define the procedure MAX-HEAPIFY which will enforce a tree-node whose children are max-heaps but which is not a max field itself to become a max-heap (page 130).

Heapsort

Heapsort first builds a max-heap, then – since the number stored at A[1] is already the maximum – exchanges A[1] with A[n] and MAX-HEAPIFY’s A[1..n-1]. It then shrinks the input array by one, and repeats the process until every element is put into it’s correct final place.

If we had function Heap.buildMaxHeap(A) such that it would produce a max-heap from a given array A of n numbers, and static function Heap.maxHeapify(A, start, end) maintained the max-heap property for us, we could write the heap-sort like this:

Now, calling A = HeapSort.sort(A); would sort array A.

Priority Queues

After the heap-sort, the heap data structure is employed to implement a priority queue. It is based on the same concept as the heap used in the heapsort algorithm, with the only difference being that it uses a min-heap instead of a max-heap.

Chapter 7: QuickSort

QuickSort, is the best-known algorithm from the wide variety of sorting algorithms, and is one that employs the divide-and-conquer paradigm quite well. As there are a lot of resources describing it in short and long texts, I don’t think a briefing would be necessary.

Chapter 8: Sorting in Linear Time

First of all, in page 167, we are faced with the undeniable shortcoming of all comparison based sorting algorithms: They can’t perform better than big-Omega(n*lg(n)). Therefore, heapsort and mergesort are asymptotically optimal comparison sorts.

Counting Sort

Counting sort (page 168), assumes that each of the n input numbers is an integer between 0 and k, and manages to achieve the sorting of these elements with O(n) time. However, it uses a lot of space to do so.

Other Methods

Radix sort and Bucket sort are other methods described and analyzed in this chapter. But to be honest, I haven’t ever been compelled to use either of them. However, sorting in linear time is rather interesting subject, since it is a work of pure inspiration.

Chapter 9: Medians and Order Statistics

In this chapter, we are faced with the problem of finding the “i-th order statistics” of a set of numbers, which is the “i-th smallest number” in that set. Selecting this number in linear time is no problem at all. We modify the randomized quick sort algorithm to do this for us. Remember that since only sorting the lower half of the array is significant for us, we don’t bother with sorting it completely.

However, since quick sort can perform in the worst-case n^2 time, this still isn’t good enough. We conquer this by using a method which uses the median-of-medians. (page 190)

This new method operates in worst-case linear time.

Leave a Reply

Your email address will not be published. Required fields are marked *