Coursework Disclaimer: This project is coursework that is reused year to year; Details are left somewhat vague so as to not provide answers.
CSCE 221 is Introduction to Data Structures and Algorithms. It is a very fun course; so much so that I went on to TA this course four times. You can read more on the TA Page
The assignment I want to discuss was about sorting algorithms. In this assignment we were asked to implement and compare four sorting algorithms in C++: Bubble Sort, Heap Sort, Quick Sort, and Merge Sort. We compared the runtimes on varying input types and sizes, checked that the experimental results matched the expected Big(O) behavior, and discussed the various algorithms. Its fairly standard computer science coursework.
What wasn't standard was my implementation of merge sort. A reasonable implementation of merge sort could use up to O(N/2) overhead, where the half the list is copied into another block of memory. However, a really, really different solution is to copy the entire list into another block of memory for O(N) overhead. This may sound really dumb (and it probably is) but it has one key benefit: you can sort really really fast.
The idea is to sort back and forth between the two blocks of memory, each of which contains the entire list. At each level of merge sort, the merging can happen from one block into a appropriate spot in the other list. Using pointers, this can all happen seamlessly. However, one big benefit emerges: you start packing memory into cache lines; rather than fetching arrays of smaller and smaller sizes from random places in memory, you can fetch the whole block. Its still O(NlogN), but it runs very fast.
If you consider the base case to be the one or two element list, it can be shown that all sub-lists will recurse the same number of times. You have to be very careful and explicit about your pointers; but it can be done (and its correct).
The following table contains the experimental runtime of the code I wrote:
(in ms) | Random Merge Sort | Random Quick Sort | Random Heap Sort |
---|---|---|---|
100000 | 14.1 | 23.8 | 223.9 |
1000000 | 164.3 | 319.2 | 2782.5 |
10000000 | 1460.2 | 6499.3 | 36713.6 |
Sometime in the near future I plan on adding an image explaining this process and revisiting this solution with an additional 3+ years of programming experience.