Tips for Sorting
When sorting an array of large elements, you can avoid moving
the elements themselves if you declare a separate index array
and just permute the indices. The original array can then be
accessed in sorted order via the index array. If N is the size
of the list, every move made by the sort becomes the move of a
single index instead of a whole element, at the cost of using
O(N) additional storage.
See the file called indexArraySorting for an example of this
technique.
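As a rough illustration of the idea (this is not the contents of
the indexArraySorting file), here is a minimal sketch. The
BigRecord type and the sortIndexByKey function are hypothetical
names made up for this example, and an insertion sort is used
only to keep the code short; any sort could permute the indices.

```c
#include <stddef.h>

/* Hypothetical large record: the payload makes whole-element moves costly. */
typedef struct {
    int  key;
    char payload[1024];
} BigRecord;

/* Fill index[] with a permutation such that
   data[index[0]], data[index[1]], ... is in ascending key order.
   Only the small index entries move; data[] is never modified. */
void sortIndexByKey(const BigRecord data[], size_t index[], size_t n)
{
    size_t i, j;

    for (i = 0; i < n; i++)
        index[i] = i;                      /* start with the identity order */

    /* Insertion sort on the indices (assumption: any sort would do here). */
    for (i = 1; i < n; i++) {
        size_t current = index[i];
        j = i;
        while (j > 0 && data[index[j - 1]].key > data[current].key) {
            index[j] = index[j - 1];       /* shift an index, not a record */
            j--;
        }
        index[j] = current;
    }
}
```

After the call, the k-th smallest record is data[index[k]], and
the records themselves are still in their original positions.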
If you are going to sort a list "by" one key, and then "by"
some other key, then you want what is called a "stable sort."
In other words, you don't want the order created by the first
sort to be disturbed among records that tie in the second sort.
Some sorts are stable and some are not. InsertSort and
BubbleSort are stable, provided you write the code so that it
never swaps two elements whose keys are equal. SelectionSort,
as usually written for arrays, is *not* stable, because its
long-distance swap can carry an element past another element
with an equal key. MergeSort is stable as long as you favor the
"left" list when keys tie. QuickSort and HeapSort are *not*
stable sorts.
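To make the "never move past an equal key" rule concrete, here
is a small sketch of a stable insertion sort. The Employee type,
its fields, and the byDept flag are assumptions invented for
this example. Sorting first by initial and then by dept leaves
the records in dept order, with initials still in order inside
each dept.

```c
#include <stddef.h>

/* Hypothetical record with two sort keys. */
typedef struct {
    int  dept;
    char initial;
} Employee;

/* Stable insertion sort: an element is shifted only past a strictly
   greater key, so equal keys keep their existing relative order.
   byDept selects the key (nonzero: dept, zero: initial). */
void stableInsertSort(Employee a[], size_t n, int byDept)
{
    size_t i, j;

    for (i = 1; i < n; i++) {
        Employee current = a[i];
        j = i;
        while (j > 0) {
            int prevKey = byDept ? a[j - 1].dept : a[j - 1].initial;
            int curKey  = byDept ? current.dept  : current.initial;
            if (prevKey <= curKey)         /* on a tie, do NOT move past it */
                break;
            a[j] = a[j - 1];
            j--;
        }
        a[j] = current;
    }
}
```

Calling stableInsertSort(a, n, 0) and then stableInsertSort(a, n, 1)
gives the two-pass ordering described above. Changing the <= to <
would make the sort unstable and lose the order from the first
pass.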
Which sorting algorithm is best? What is best depends on many
factors. Here is some general advice:
For sorting an array
InsertSort is good for small lists of small elements because it
does about half as many comparisons as most simple sorts on
average, and does about the same number of swaps as it does
comparisons.
SelectionSort is good for small lists of large elements because
it does at most N-1 swaps, where N is the size of the list.
QuickSort is good for a long list of small elements.
Doing a QuickSort of an index array is good if you have a long
list of large elements. This avoids moving large elements.
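The small number of element exchanges done by SelectionSort can
be checked with a sketch like the following. It sorts plain ints
only to keep the example short, and the swap counter returned by
the function is an addition made for illustration.

```c
#include <stddef.h>

/* Selection sort on an int array. Returns the number of swaps performed,
   which is never more than n - 1 regardless of the input order. */
size_t selectionSort(int a[], size_t n)
{
    size_t i, j, swaps = 0;

    for (i = 0; i + 1 < n; i++) {
        size_t min = i;

        /* Find the smallest remaining element (comparisons only). */
        for (j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;

        /* At most one exchange per pass. */
        if (min != i) {
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
            swaps++;
        }
    }
    return swaps;
}
```

The number of comparisons is still on the order of N squared, so
this only pays off when moving an element costs much more than
comparing two keys.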
For sorting a linked list
Here the size of elements does not matter because no moves are
required.
InsertSort is good for sorting a short list.
RadixSort tends to be good for sorting a long list if the keys
are short. In such a case the RadixSort will have a small
number of passes.
MergeSort or QuickSort are good for sorting a long list that
has long keys.
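As one possible illustration of sorting a linked list without
moving any element data, here is a sketch of a MergeSort that
only relinks nodes. The Node type and both function names are
assumptions made for this example. Ties are resolved in favor of
the left list, so the sort is stable.

```c
#include <stddef.h>

/* Hypothetical singly linked node. */
typedef struct Node {
    int          key;
    struct Node *next;
} Node;

/* Merge two sorted lists by relinking nodes. On a tie the node from
   the left list goes first, which keeps the sort stable. */
static Node *mergeLists(Node *left, Node *right)
{
    Node  head;                  /* dummy node simplifies appending */
    Node *tail = &head;

    head.next = NULL;
    while (left != NULL && right != NULL) {
        if (left->key <= right->key) {
            tail->next = left;
            left = left->next;
        } else {
            tail->next = right;
            right = right->next;
        }
        tail = tail->next;
    }
    tail->next = (left != NULL) ? left : right;
    return head.next;
}

/* Sort a list and return its new head. */
Node *mergeSortList(Node *list)
{
    Node *slow, *fast, *right;

    if (list == NULL || list->next == NULL)
        return list;

    /* Find the midpoint with slow/fast pointers, then cut the list. */
    slow = list;
    fast = list->next;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    right = slow->next;
    slow->next = NULL;

    return mergeLists(mergeSortList(list), mergeSortList(right));
}
```

The caller must use the returned pointer as the new head of the
list, since the node that was first before the sort will usually
not be first afterward.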
Bubble Sort Is "Just a No-Good": don't use it at all.
BubbleSort (also known as ExchangeSort) has nothing to recommend
it other than a "catchy name."
Distribution Sort Is a True O(N) Sort: use it when you can!
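Suppose you have a problem such as this: Sort 10,000 records
by key, where the keys are the integers from 0 to 9999 and each
key occurs exactly once. A job like this can be done with a
Distribution Sort with just O(N) work, because each key tells
you directly where its record belongs in the sorted list.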
    someType sourceList[10000], targetList[10000] ;
    int idx ;

    /* SORT THE LIST */
    for (idx = 0; idx < 10000; idx++)
        targetList[sourceList[idx].key] = sourceList[idx] ;

    /* DISPLAY THE SORTED LIST */
    for (idx = 0; idx < 10000; idx++)
        print ( targetList[idx] ) ;
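The fragment above leaves someType and print undefined. Here is
a compilable sketch of the same idea. The Record type, the
MAX_RECORDS limit, and the error checks are assumptions added
for this example. The checks are there because the method only
works when every key is a distinct integer in the range 0 to
n-1.

```c
#include <stddef.h>

#define MAX_RECORDS 10000    /* assumed upper limit on the list size */

/* Hypothetical record type standing in for someType. */
typedef struct {
    int  key;                /* assumed: distinct integers in 0 .. n-1 */
    char tag;
} Record;

/* Place each record directly at the position given by its key.
   Returns 0 on success, or -1 if n is too large or a key is out of
   range or repeated (cases where this simple scheme does not apply). */
int distributionSort(const Record source[], Record target[], size_t n)
{
    static unsigned char used[MAX_RECORDS];
    size_t i;

    if (n > MAX_RECORDS)
        return -1;
    for (i = 0; i < n; i++)
        used[i] = 0;

    for (i = 0; i < n; i++) {
        int k = source[i].key;

        if (k < 0 || (size_t)k >= n || used[k])
            return -1;
        used[k] = 1;
        target[k] = source[i];   /* one move per record: O(N) total */
    }
    return 0;
}
```

No keys are ever compared with each other, which is why the
usual N log N lower bound for comparison sorts does not apply
here.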