Let us consider another way of implementing a selection sorting algorithm. The underlying idea is that it would help if we could pre-arrange the data so that selecting the smallest or biggest entry becomes easier. For that, remember the idea of a priority queue discussed earlier. We can take the value of an item to be its priority. Then, if we remove the item with the highest priority at each step, we can fill an array 'from the rear', starting with the biggest item.
Now priority queues can be implemented in different ways, and we discussed an implementation using heap-trees. Another way of implementing them would be to use a sorted array, so that the entry with the highest priority appears in data[size]. Removing this item would be very simple, but inserting a new one would always involve shifting a number of items to the right to make room for it. For example, inserting 3 into the array below means first moving 4 one place to the right:
| n | 0 | 1 | 2 | 3 | 4 | 5 |
| data[n] | 1 | 2 | 4 |
| n | 0 | 1 | 2 | 3 | 4 | 5 |
| data[n] | 1 | 2 | | 4 |
| n | 0 | 1 | 2 | 3 | 4 | 5 |
| data[n] | 1 | 2 | 3 | 4 |
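The insertion shown in the tables above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names insert_sorted and remove_highest are made up here, and Python lists are indexed from 0, so the highest-priority entry sits in the last list position rather than in data[size].

```python
def insert_sorted(data, item):
    """Insert item into the ascending list data, keeping it sorted."""
    data.append(item)              # make room at the right-hand end
    i = len(data) - 1
    # Shift larger entries one place to the right until the gap
    # reaches the position where item belongs.
    while i > 0 and data[i - 1] > item:
        data[i] = data[i - 1]
        i -= 1
    data[i] = item


def remove_highest(data):
    """Remove and return the highest-priority entry (the last one)."""
    return data.pop()


queue = [1, 2, 4]
insert_sorted(queue, 3)
print(queue)                  # [1, 2, 3, 4]
print(remove_highest(queue))  # 4
```

Removal needs no shifting at all, whereas insertion may move every entry in the worst case.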
A third way would be to use an unsorted array: a new item would be inserted by just putting it into data[size+1], but to delete the entry with the highest priority one would have to find it first. After that, the items with a higher index would have to be 'shifted down'.
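For comparison, here is a minimal Python sketch of the unsorted-array representation. The names insert_unsorted and remove_highest_unsorted are hypothetical, and the list is indexed from 0 rather than from 1 as in the text.

```python
def insert_unsorted(data, item):
    """Insert item by placing it in the first free position."""
    data.append(item)


def remove_highest_unsorted(data):
    """Find, remove and return the largest entry of a non-empty list."""
    if not data:
        raise IndexError("remove from an empty priority queue")
    # Linear search for the position of the largest entry.
    best = 0
    for i in range(1, len(data)):
        if data[i] > data[best]:
            best = i
    item = data[best]
    # Shift every entry with a higher index down by one place.
    for i in range(best, len(data) - 1):
        data[i] = data[i + 1]
    data.pop()                 # drop the now duplicated last entry
    return item


bag = [4, 1, 3]
insert_unsorted(bag, 2)
print(remove_highest_unsorted(bag))  # 4
print(bag)                           # [1, 3, 2]
```

Here the costs are the other way round: insertion is trivial, but each removal has to scan the whole array.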
Of those three representations, only one is of use in carrying out the above idea: an unsorted array is what we started from, so that is no help, and a sorted array is what we are trying to achieve. That leaves the heap-tree.
To make use of heap-trees, we first of all have to think of a way of taking an unsorted array and re-arranging it so that it becomes a heap-tree. One possibility would be to insert the items one by one, using the insert algorithm discussed earlier. It turns out, however, that this can be done more efficiently. First of all, note that if we have $n$ items in the array data in positions 1, ..., $n$, then all the items with an index greater than $\lfloor n/2 \rfloor$ will be leaves. Therefore, if we 'trickle down' the items data[n/2], ..., data[1] in that order (rounding n/2 down), exchanging each with the larger of its children until it either is positioned at a leaf or has children that are both smaller, we obtain a heap-tree. As an example, consider the following array with $n = 9$ items:
| 5 | 8 | 3 | 9 | 1 | 4 | 7 | 6 | 2 |
So the algorithm starts at position 4 by trickling down 9, which turns out not to be necessary, so the array remains the same. Next 3 is trickled down, giving:
| 5 | 8 | 7 | 9 | 1 | 4 | 3 | 6 | 2 |
Next 8 is trickled down, giving:
| 5 | 9 | 7 | 8 | 1 | 4 | 3 | 6 | 2 |
Finally, 5 is trickled down to give first
| 9 | 5 | 7 | 8 | 1 | 4 | 3 | 6 | 2 |
then
| 9 | 8 | 7 | 5 | 1 | 4 | 3 | 6 | 2 |
and finally
| 9 | 8 | 7 | 6 | 1 | 4 | 3 | 5 | 2 |
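The worked example above can be reproduced with a short Python sketch. The names trickle_down and heapify are chosen here for illustration, and list position 0 holds an unused placeholder so that the items occupy data[1], ..., data[n] as in the text.

```python
def trickle_down(data, i, n):
    """Trickle data[i] down within the heap-tree stored in data[1..n]."""
    while 2 * i <= n:                      # data[i] has at least a left child
        child = 2 * i
        if child + 1 <= n and data[child + 1] > data[child]:
            child += 1                     # the right child is the larger one
        if data[i] >= data[child]:
            break                          # no child is larger: stop here
        data[i], data[child] = data[child], data[i]
        i = child


def heapify(data, n):
    """Rearrange data[1..n] into a heap-tree, working from n // 2 down to 1."""
    for i in range(n // 2, 0, -1):
        trickle_down(data, i, n)


data = [None, 5, 8, 3, 9, 1, 4, 7, 6, 2]   # data[0] is an unused placeholder
heapify(data, 9)
print(data[1:])  # [9, 8, 7, 6, 1, 4, 3, 5, 2]
```

Items in positions greater than n // 2 are never touched directly, since they are leaves.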
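The time complexity of this algorithm is as follows: it trickles down $\lfloor n/2 \rfloor$ items, those with indices 1, ..., $\lfloor n/2 \rfloor$. Each of those trickle operations involves two comparisons at each stage. Now an item with index $i$ will be on level $\lfloor \log_2 i \rfloor$ (counting the root as level 0), which means that there are at most $\lfloor \log_2 n \rfloor - \lfloor \log_2 i \rfloor$ steps until a leaf is reached, so that the trickle process for the item at position $i$ may stop. Hence the total number of comparisons carried out to trickle data[i] into position is at most $2(\lfloor \log_2 n \rfloor - \lfloor \log_2 i \rfloor)$. So the number of comparisons involved is at most

$$\sum_{i=1}^{\lfloor n/2 \rfloor} 2\left(\lfloor \log_2 n \rfloor - \lfloor \log_2 i \rfloor\right),$$

which is at most proportional to $n$.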
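One way to get a feel for this bound is to count comparisons while building a heap-tree and compare the count with the per-item estimate of two comparisons per level. The sketch below does this in Python. The function names are hypothetical, the root is assumed to be on level 0, and the final column 4 * n is a rough linear bound added here for comparison rather than a figure taken from the text.

```python
import random


def trickle_down_counting(data, i, n):
    """Trickle data[i] down in data[1..n]; return the comparisons used."""
    comparisons = 0
    while 2 * i <= n:
        child = 2 * i
        if child + 1 <= n:
            comparisons += 1               # compare the two children
            if data[child + 1] > data[child]:
                child += 1
        comparisons += 1                   # compare item with larger child
        if data[i] >= data[child]:
            break
        data[i], data[child] = data[child], data[i]
        i = child
    return comparisons


def heapify_counting(data, n):
    """Build a heap-tree in data[1..n]; return the total comparisons."""
    return sum(trickle_down_counting(data, i, n)
               for i in range(n // 2, 0, -1))


def comparison_bound(n):
    """Sum of 2 * (floor(log2 n) - floor(log2 i)) for i = 1 .. n // 2."""
    height = n.bit_length() - 1            # floor(log2 n) for n >= 1
    return sum(2 * (height - (i.bit_length() - 1))
               for i in range(1, n // 2 + 1))


random.seed(0)
for n in (9, 100, 1000):
    data = [None] + [random.randrange(1000) for _ in range(n)]
    used = heapify_counting(data, n)
    print(n, used, comparison_bound(n), 4 * n)
```

In each printed row the measured count should not exceed the bound, and the bound itself stays below 4 * n, which is consistent with heap construction taking time proportional to the number of items.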
Now that we have a heap-tree, we want to get a sorted array out of it. In the heap-tree, the item with the highest priority, that is the largest item, is in data[1]. In a sorted array, it should be in position data[size]. We therefore swap the two, which is almost the same as removing the root of the heap-tree, since data[size] is precisely the item that would be moved into the root position at the next step. Since data[size] now contains the correct item, we will never have to look at it again. Instead, we take the items data[1], ..., data[size-1] and rearrange them into a heap-tree with the trickle procedure, which we know to have complexity $O(\log_2 n)$.
Now the second largest item is in position data[1], and we know its final position will be data[size-1], so we now swap these two items. Then we rearrange data[1], ..., data[size-2] back into a heap-tree using the trickle procedure.
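Putting the two phases together gives the following Python sketch of the whole sorting procedure. It is a minimal illustration rather than a definitive implementation: the names trickle_down and heap_sort are chosen here, and list position 0 is an unused placeholder so that the items occupy data[1], ..., data[n] as in the text.

```python
def trickle_down(data, i, n):
    """Trickle data[i] down within the heap-tree stored in data[1..n]."""
    while 2 * i <= n:
        child = 2 * i
        if child + 1 <= n and data[child + 1] > data[child]:
            child += 1                     # pick the larger child
        if data[i] >= data[child]:
            break
        data[i], data[child] = data[child], data[i]
        i = child


def heap_sort(data):
    """Sort data[1..n] in place into ascending order."""
    n = len(data) - 1
    # Phase 1: rearrange the unsorted array into a heap-tree.
    for i in range(n // 2, 0, -1):
        trickle_down(data, i, n)
    # Phase 2: repeatedly swap the root with the last item of the
    # heap-tree, then restore the heap-tree on the remaining items.
    for size in range(n, 1, -1):
        data[1], data[size] = data[size], data[1]
        trickle_down(data, 1, size - 1)


data = [None, 5, 8, 3, 9, 1, 4, 7, 6, 2]
heap_sort(data)
print(data[1:])  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The sort works in place: the sorted part grows from the rear of the array while the heap-tree at the front shrinks.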
When the $i$th step has been completed, the items data[n-i+1], ..., data[n] will have the correct entries, and there will be a heap-tree in the items data[1], ..., data[n-i]. (Note that the size, and therefore the height, of the heap-tree decreases.) As a part of the $i$th step, we will have to trickle the new root down. This will take at most twice as many comparisons as the heap-tree is high at the time, which is the logarithm (to base 2) of the number of items in the heap-tree at the time, that is $\log_2(n-i)$.
Hence the complexity function for this phase of the algorithm will be at most

$$\sum_{i=1}^{n-1} 2\log_2(n-i).$$
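As a rough check on the size of this bound, note that each step costs at most $2\log_2(n-i)$ comparisons and that every such term is at most $2\log_2 n$. The summation limits $i = 1, \ldots, n-1$ used below are an assumption reconstructed from the description of the steps, so this is a sketch rather than the exact formula:

$$\sum_{i=1}^{n-1} 2\log_2(n-i) \;=\; 2\log_2\bigl((n-1)!\bigr) \;\le\; 2(n-1)\log_2 n \;\le\; 2n\log_2 n.$$

So this phase needs a number of comparisons at most proportional to $n\log_2 n$.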
So the worst-case complexity of the entire sorting algorithm, that is first rearranging the (unsorted) array into a heap-tree (which is proportional to $n$) and secondly making a sorted array out of the heap-tree (which is proportional to $n \log_2 n$), is given by the sum of the two complexity functions. Since the term $n \log_2 n$ grows faster than $n$, we can simplify $O(n + n \log_2 n)$ to $O(n \log_2 n)$.