Array sorting and the run-time of a program or function

Of course, as well as being accurate, a good program will be fast. Sometimes, as for a program that solves quadratics by a formula, the program will run in a fixed amount of time. At other times a loop or recursion is needed. This web page shows how to analyse the run-time of a program in relation to two typical sorting algorithms.

1. Run time as O ( f ( n ) )

The section contains a review of the big O notation and how it relates to run time.

Some programs run in a fixed amount of time for each input. Most of this page concerns programs whose running time varies with the input.

It can be very difficult indeed to analyse a program and give a formula for the exact amount of time it uses. We do not do this here. Instead we simplify the process considerably. Typically, we might decide to measure the amount of time as the number of function calls, or numerical comparisons, or something else that a program does. Or we might decide to count the execution of each program line as a single "unit" of time. Various measures of time are appropriate in different situations, and you will be given a suitable measure in the situations you need to think about.

The most important question is how to assign time to loops and recursion.

In almost all cases where run time depends on the input there is some "obvious" parameter that it depends on. This parameter is always called n , and is almost always the "length" of the input. (How one interprets "length" may differ in different situations. It is usually the number of entries in an array. For multi-precision arithmetic where the number of digits can grow to any size that might fit in memory, n might be the number of digits in the input.)

We aim to describe the run-time of the program using a simple function f ( n ) . We then describe the run-time as being O ( f ( n ) ) . Most of the time very simple functions will suffice. For simplicity, we usually choose the function f ( n ) to be monotonic nondecreasing meaning that if n < n then f ( n ) f ( n ) .

The following functions are commonly seen in such analyses: 1 , n , n 2 , n 3 , and other powers; 2 n , 2 2 n , 2 3 n , ..., 2 2 n , and other exponential terms; log n , n log n , n 2 log n and other terms involving log; n log log n , n log n log log n and other terms involving log log. This list contains almost everything seen in practical situations and will certainly suffice for anything you encounter in this module.

The working definition of O ( f ( n ) ) that we will use for this module follows. It is slightly simpler than the most general definition possible. I've chosen this definition to make things as easy as possible at this stage. Recall, we are only interested here in simple monotonic functions f . This makes things a lot easier. In the following we assume f is non-decreasing, is not identically zero, and we look at f ( n ) as n .

Definition.

A function T ( n ) is O ( f ( n ) ) if there is a constant K such that T ( n ) K f ( n ) + K for all n . A program runs in time O ( f ( n ) ) if its worst-case running time T ( n ) is O ( f ( n ) ) .

The following exercises should help you understand the concepts.

Exercise.

Prove that if a function T ( n ) is O ( 1 ) then T ( n ) is bounded.

Exercise.

Prove that the function n ( n + 1 ) is O ( n 2 ) .

Exercise.

Prove that the function n 2 is O ( n 3 ) but n 3 is not O ( n 2 ) .

Exercise.

Prove that if the function T ( n ) is O ( log n ) where log is to base b then it is O ( log n ) for logarithms to any base.

For a simple program that gives a good idea of what this means in practice for runtime, see runtime.cpp. In this page we go through two worked examples.

2. Analysis of selectionsort

We have already seen the selectionsort method for sorting arrays. Here it is sorting a pseudorandom array of doubles.

// selectionsort.cpp (click here to download)

// illustrates selectionsort on a random vector of doubles
#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>

using namespace std;

void selectionsort(vector<double>& vec) {
  int s = vec.size();
  for (int i=0; i<s; i++) {
    for (int j=i+1; j<s; j++) {
      if (vec[i] < vec[j]) {
	double x = vec[i];
	vec[i] = vec[j];
	vec[j] = x;
      }
    }
  }
}

int main () {
  vector<double> data;
  int nentries; 
  cout << "Number of entries? " << endl;
  cin >> nentries;

  /* initialize random seed: */
  srand(time(NULL));

  /* add nentries between 0 and 1 to data */
  for (int n = 0 ; n<nentries ; n++) {
    double p = double(rand())/RAND_MAX; 
    data.push_back(p);
  }

  selectionsort(data);

  /* print the sorted vector */
  for (int n = 0 ; n<nentries ; n++) {
    cout << data[n] << " ";
  }
  cout << "Sort complete" << endl;
  
  return 0;  
}

We are interested in the time taken by the function void selectionsort(vector<double>& vec) here. The n should obviously be the length of the vector, and it is clear that for longer vectors the program will take longer amounts of time.

Remark.

In fact, there is always a small time-overhead for starting up a function and closing it, so a better estimate of the time is actually B n ( n + 1 ) / 2 + C . However this is still O ( n ( n + 1 ) ) because according to the definition adding a constant doesn't matter.

Thus selectionsort is O ( n ( n + 1 ) ) . Actually we can simplify this further. Since n ( n + 1 ) = n 2 + n 2 n 2 selectionsort is O ( n 2 ) . You should also be able to see that selectionsort is not O ( n ) . That's because the analysis of the algorithm already given shows that always at least i = 0 n j = i n B steps are required for some constant B (possibly different to the previous B ) no matter how small B , C are and how large K is there will always be sufficiently large n for which B n ( n + 1 ) / 2 + C > K n + K .

Exercise.

Argue also that selectionsort is not O ( n log n ) either.

Arguably the most important measure of "time" is the number of comparisons required on double values in the vector. In other words the number of times the line "if (vec[i] < vec[j]) {" is executed. We can count this in a program. The result is:

// selectionsort_count.cpp (click here to download)

// illustrates selectionsort on a random vector of doubles
#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>

using namespace std;

/** version of selectionsort that returns the number of comparisons made */
int selectionsort(vector<double>& vec) {
  int count = 0;
  int s = vec.size();
  for (int i=0; i<s; i++) {
    for (int j=i+1; j<s; j++) {
      // comparison coming up, so increment counter first
      count++;
      if (vec[i] < vec[j]) { // comparison 
	double x = vec[i];
	vec[i] = vec[j];
	vec[j] = x;
      }
    }
  }
  return count; // return number of comparisons
}

int main () {
  vector<double> data;
  int nentries; 
  cout << "Number of steps? " << endl;
  cin >> nentries;

  /* initialize random seed: */
  srand(time(NULL));

  /* add nentries between 0 and 1 to data */
  for (int n = 0 ; n<nentries ; n++) {
    double p = double(rand())/RAND_MAX; 
    data.push_back(p);
  }
  
  /* sort data */
  int count = selectionsort(data);

  cout << "Sort complete using " << count << " comparisons" << endl;
  
  return 0;  
}

Note too that a "random" data set like this can be useful for testing purposes but does not give the whole picture. We really want the "worst case" data set. In this case it doesn't really matter as this algorithm is equally bad on all data sets, but in other cases the difference between "random" and "worst case" data is important. (For example the most popular sorting algorithm of all, quicksort is usually O ( n log n ) on "random" data but is O ( n 2 ) in its worst case.)

Exercise.

Here is another popular easy sorting algorithm: bubblesort.cpp and a version that counts comparisons, bubblesort_count.cpp. Analyse its worst case run time.

3. Heapsort

Heapsort is more complicated than selectionsort but is still fairly simple and is a highly useful sorting algorithm.

You do not need to fully understand the details of how it works, but you do need to appreciate that some clever algorithms can sometimes reduce computer time significantly, and you need to have some understanding of the code, in particular roughly what the functions do. The description that follows is mostly for people that want more details.

The algorithm maintains a "heap", which is a vector of doubles with the special property that it is partially sorted in the sense that each entry in the vector may have two "children" in the vector (its so-called "left" and "right" children) and these values are always smaller in value than their parent. Of course when we go though the whole vector there will be entries at the end that don't have children, but that's OK.

By definition, the entry with index i has its children at indices 2 i + 1 and 2 i + 2 , if these are valid indices. Conversely an entry with index j has parent with index ( j - 1 ) / 2 where this is integer division rounding off any 1 / 2 . You can check that these functions run through the numbers 0 , 1 , n - 1 so this makes sense for an array of size n .

The heapsort algorithm simply converts the input vector into a heap (thus partially sorting it in the process) and then pulls the entries off the heap, in order, completing the sorting process as it goes. This is a slightly simplified version with count.

// minheapsort_count.cpp (click here to download)

/* Illustrates heapsort on a random array of doubles, this version
   returning a count of the number of comparisons needed
   This version is a "minimal" version cut down to the bare essentials.
 */

#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>

using namespace std;

/** Print a vector to standard output */
void vector_check(vector<double>& data) {  
  int nentries = data.size();
  for (int n = 0 ; n<nentries-1 ; n++) {
    if (data[n]<data[n+1]) { cerr << "Error: not sorted!" << endl; return; }
  }
}

/** a version of selectionsort that returns the number of comparisons made */
int selectionsort(vector<double>& vec) {
  int count = 0;
  int s = vec.size();
  for (int i=0; i<s; i++) {
    for (int j=i; j<s; j++) {
      count++;
      if (vec[i] < vec[j]) {
	double x = vec[i];
	vec[i] = vec[j];
	vec[j] = x;
      }
    }
  }
  return count;
}

/** The left child of index n is the index 2n+1. */
int heap_left(int n) {  return 2*n+1; }

/** The right child of index n is the index 2n+2. */
int heap_right(int n) {  return 2*n+2; }

/** The index of the parent of index n, i.e. always have
 *  n = heap_left(heap_parent(n)) or heap_right(heap_parent(n))
 *  depending on whether n is odd or even.
 *  Returns -1 if there is no such parent, i.e. if n<=0. */
int heap_parent(int n) {  
  if (n<=0) return -1; else return (n-1)/2; // integer division
}

/** Internal adjustment(s): starting at n, ensures that
 * heap[heap_parent(i)] >= heap[i] for all i 
 * and assumes that the only possible exception to this is that 
 * we might have heap[heap_parent(n)] < heap[n],
 * so looks at n and ancestors of n only.
 * Runs in time O(log s) where s is the size of the heap. 
 * Returns the number of comparisons made. */
int heap_up(vector<double>& heap, int n) {  
  int count = 0;
  int i,j;
  i = n;
  double x = heap[i];
  while (i>0) {
    j = heap_parent(i);
    double y = heap[j];
    count ++;
    if (x <= y) break;
    heap[i] = y;
    heap[j] = x;
    i = j;
  }
  return count;
}

/** internal adjustment(s): starting at n, ensures that
 * heap[heap_parent(i)] >= heap[i] for all i and assumes that the only
 * possible exception is that it might be that 
 *    heap[n] < max( heap[heap_left(n)] , heap[heap_right(n)] ).  
 * So looks at all i  equal to n or a descendant of n */
int heap_down(vector<double>& heap, int n) {  
  int count = 0; 
  int i,jl,jr,p;
  i = n;
  int s = heap.size();
  while (i<s) {
    p = i;
    jl = heap_left(i);
    jr = heap_right(i);
    if (jl<s) p=jl;
    if (jr<s) {
      count ++; if (heap[jr]>heap[jl]) p=jr;
    }
    // so now p points to the larger of the two children
    // or the only child or p==i when there is o such child
    if (p==i) break;
    count ++; if (heap[p]<=heap[i]) break;
    // OK need to swap heap[i] and heap[p]
    double x = heap[i];
    heap[i] = heap[p];
    heap[p] = x;
    // continue with new i
    i = p;
  }
  return count;
}
  
/** Push v to heap. Runs in time O(log s) where s is the size of the heap. */
int heap_push(vector<double>& heap, double v) {  
  int n = heap.size();
  heap.push_back(v); // place new entry at the end, index n
  return heap_up(heap,n); // adjust returning count
}

/** Removes the maximum value from the whole heap, and preserves heap
 * property.  The maximum value is placed in x and number of
 * comparisons is returned on success. Runs in time O(log s) where s
 * is the size of the heap. */
int heap_pop(vector<double>& heap, double& x) {  
  int s = heap.size();
  if (s<=0) {
    std::cerr << "Error: empty heap\n";
    return -1;
  }
  x = heap[0];           // get first (largest) entry
  heap[0] = heap[s-1];   // copy last entry to top
  heap.pop_back();       // .. remove it ..
  return heap_down(heap,0); // .. and adjust returning count
}

/** Simplified heapsort with a separate vector.
 * Runs in time O(s log s) where s is the size of the vector. 
 * Returns the number of comparisons made. */
int heapsort(vector<double>& vec) {  
  int count = 0;
  int s = vec.size();
  vector<double> h; // an auxillary heap
  // push all entries onto heap
  for (int i = 0; i<s; i++) {
    count = count + heap_push(h, vec[i]);
  }
  // pop the entries back in order
  for (int i = 0; i<s; i++) {
    double x;
    count = count + heap_pop(h,x);
    vec[i] = x;
  }
  return count;
}

/** creates a random vector and sorts it by heapsort. */
int main () {
  int count = 0;

  vector<double> data1;
  vector<double> data2;
  int nentries; 
  cout << "Number of entries? " << endl;
  cin >> nentries;

  /* initialize random seed: */
  srand(time(NULL));

  /* fill the vector data */
  for (int n = 0 ; n<nentries ; n++) {
    double p = double(rand())/RAND_MAX; 
    data1.push_back(p);
    data2.push_back(p);
  }
  // vector_print(data1);
  cout << "Sorting" << endl;

  count = selectionsort(data1);
  vector_check(data1); // check sorted!
  cout << "Selectionsort complete with " << count << " comparisons" << endl;

  count = heapsort(data2);
  vector_check(data2); // check sorted!
  cout << "Heapsort complete with " << count << " comparisons" << endl;
  
  return 0;
}

This version copies the data to a heap and then pulls it off the heap in order.

The idea is rather more general than for simply sorting.

There are two ways a heap is used, and both are implemented here. First it can be as a method of sorting a vector. That's the heapsort function, and the second is that a heap is a kind of set of numbers where is is easy to add (or "push") numbers into the set and easy to remove (or "pop") the largest number. That's the heap_push and heap_pop functions. You can see that heap_push adds the entry at the back of the heap (if the heap had k entries then it adds a new one at index k ) but then has to reorganise the heap to make sure it is partially sorted. This requires comparing the new entry with its parent, possibly making a swap, and if necessary comparing again with the parent of the parent, and so on. This comparison and swapping is done in function heap_up.

To pull the greatest entry off the heap we simply take the entry with index 0 . But when that is removed something must go in its place. The easiest thing to add there is the last entry in the heap. But then we need to reorganise the heap to make sure it is partially sorted again. This requires comparing the new 0th entry with its children, possibly making a swap, and if necessary comparing again with the children of the children, and so on. This is done in function heap_down.

Here is the more general code, including a function that sorts a vector "in place" without using a separate heap.

// heapsort.cpp (click here to download)

// Illustrates heapsort on a random array of doubles

#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>

using namespace std;

/* A heap provides a bit more than just a sorting routine.  It also
provides efficient methods to allow us to use a vector as a special
"stack" in which entries can be "pushed" and the largest entry popped.
(I.e. first-in-greatest-out; compare a normal "stack" which is
last-in-first-out and a "queue" which is first-in-first-out.)  It does
this by maintaining the "heap property" of a vector which says
v[heap_left(n)] <= v[n] and v[heap_right(n)] <= v[n] where heap_left
and heap_right are defined below. */

/** Print a vector to standard output */
void vector_print(vector<double>& data) {  
  int nentries = data.size();
  for (int n = 0 ; n<nentries ; n++) {
    cout << data[n] << " ";
  }
  cout << endl;
}

/** The left child of index n is the index 2n+1. */
int heap_left(int n) {  return 2*n+1; }

/** The right child of index n is the index 2n+2. */
int heap_right(int n) {  return 2*n+2; }

/** The index of the parent of index n, i.e.
 *  n = heap_left(heap_parent(n)) or heap_right(heap_parent(n))
 *  returns -1 if there is no such parent, i.e. if n<=0. */
int heap_parent(int n) {  
  if (n<=0) return -1; else return (n-1)/2; // integer division
}

/** Internal adjustment(s): starting at n, ensures that
 * heap[heap_parent(i)] >= heap[i] for all i 
 * and assumes that the only possible exception is that 
 * heap[heap_parent(n)] < heap[n] is currently possible,
 * so looks at n and ancestors of n only.
 * Runs in time O(log s) where s is the size of the heap. */
void heap_up(vector<double>& heap, int n) {  
  int i,j;
  i = n;
  double x = heap[i];
  while (i>0) {
    j = heap_parent(i);
    double y = heap[j];
    if (x <= y) break;
    heap[i] = y;
    heap[j] = x;
    i = j;
  }
}

/** Reorganises an arbitrary vector v to have the heap structure at
 * all indices. Runs in time O(s log s) where s is the size of the
 * heap. */
void heapify(vector<double>& heap) {  
  int s = heap.size();
  for (int i=1; i<s; i++) {
    // ensure that the vector is a heap for indices <= i
    heap_up(heap,i);
  }
}

/** internal adjustment(s): starting at n, ensures that
 * heap[heap_parent(i)] >= heap[i] for all i
 * and assumes that the only possible exception is that it might
 * be that heap[n] < max( heap[heap_left(n)] , heap[heap_right(n)] ).
 * So looks at all i equal to n or a descendant of n
 * The parameter k represents the "size".  If k>=0, then only entries
 * heap[0]..heap[k-1] are looked at.  If k=-1 the full vector is used.
 * Runs in time O(log s) where s is the size of the heap or O(log k) when k>=0. 
 */
void heap_down(vector<double>& heap, int n, int k=-1) {  
  int i,jl,jr,p;
  i = n;
  int s = heap.size();
  if (k>=0 && k<s) s=k;
  while (i<s) {
    p = i;
    jl = heap_left(i);
    jr = heap_right(i);
    if (jl<s) p=jl;
    if (jr<s && heap[jr]>heap[jl]) p=jr;
    // so now p points to the larger of the two children
    // or p==i when there are no such children
    if (p==i || heap[p]<=heap[i]) break;
    // swap heap[i] and heap[p]
    double x = heap[i];
    heap[i] = heap[p];
    heap[p] = x;
    // continue with new i
    i = p;
  }
}
  
/** Push v to heap. Runs in time O(log s) where s is the size of the heap. */
void heap_push(vector<double>& heap, double v) {  
  int n = heap.size();
  heap.push_back(v); // place new entry at the end, index n
  heap_up(heap,n);   // adjust
}

/** Removes the maximum value from the whole heap, and preserves heap
 * property.  The maximum value is placed in x and "true" is returned
 * on success. Runs in time O(log s) where s is the size of the heap. */
bool heap_pop(vector<double>& heap, double& x) {  
  int s = heap.size();
  if (s<=0) {
    std::cerr << "Error: empty heap\n";
    return false;
  }
  x = heap[0];           // get first (largest) entry
  heap[0] = heap[s-1];   // copy last entry to top
  heap.pop_back();       // .. remove it ..
  heap_down(heap,0);     // .. and adjust
  return true;
}

/** Returns maximum value from the portion of the heap from indices 0
 * .. k-1. This value is placed at the position with index k-1, and
 * the heap property for the portion of the heap from index 0 .. k-2
 * is maintained. This slightly complicated version of heap_pop is
 * provided for in-place sorting.  For speed, there is no error checking
 * here. Runs in time O(log s) where s is the size of the heap. */
void heap_getmax(vector<double>& heap, int k) { 
  double x = heap[0];    // first (largest) entry
  heap[0] = heap[k-1];   // copy last entry to top
  heap[k-1] = x;         // store greatest entry
  heap_down(heap,0,k-1); // and adjust
}

/** Sorts vector vec into decreasing order, using the heapsort method
 * in n log n steps, where n is the size of the vector. 
 * Runs in time O(s log s) where s is the size of the heap. */
void heapsort(vector<double>& vec) {  
  heapify(vec); // taking O(s log s) steps
  int s = vec.size();
  int k = s;
  while (k>0) { // loop taking another O(s log s) steps
    heap_getmax(vec,k);
    k--;
  }
  // the above sorts into reverse (increasing) order: switch to decreasing
  // order in time O(s).  Note that the sorted vector is then also a heap.
  int i = 0;
  while (true) { 
    int j = s-1-i;
    if (j<=i) break;
    double x = vec[j];
    vec[j] = vec[i];
    vec[i] = x;
    i++;
  }
}

/** creates a random vector and sorts it by heapsort. */
int main () {

  vector<double> data;
  int nentries; 
  cout << "Number of entries? " << endl;
  cin >> nentries;

  /* initialize random seed: */
  srand(time(NULL));

  /* fill the vector data */
  for (int n = 0 ; n<nentries ; n++) {
    double p = double(rand())/RAND_MAX; 
    data.push_back(p);
  }
  vector_print(data);

  /* sort data */
  heapsort(data);

  /* print the sorted vector */
  vector_print(data);

  cout << endl << "Sort complete" << endl;
  
  return 0;
}

Here is a quick analysis of heapsort. (Exercise: read this through and insert other details as required.) One way of sorting a set of n values is to push them onto an initially empty heap and then pop them off in order. The function heapsort does something slightly different because the data is already in an array and we don't want to have to use a new array. But the idea is the same.

In practice the difference is very significant. Run both programs with arrays of sizes 10000, 50000, 100000, etc.

Here is a version of the heapsort program that counts the number of comparisons and which also includes a version selectionsort so you can seethe difference.

// heapsort_count.cpp (click here to download)

/* Illustrates heapsort on a random array of doubles, this version
   returning a count of the number of comparisons needed */

#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>

using namespace std;

/** Print a vector to standard output */
void vector_print(vector<double>& data) {  
  int nentries = data.size();
  for (int n = 0 ; n<nentries ; n++) {
    cout << data[n] << " ";
  }
  cout << endl;
}

/** a version of selectionsort that returns the number of comparisons made */
int selectionsort(vector<double>& vec) {
  int count = 0;
  int s = vec.size();
  for (int i=0; i<s; i++) {
    for (int j=i; j<s; j++) {
      count++;
      if (vec[i] < vec[j]) {
	double x = vec[i];
	vec[i] = vec[j];
	vec[j] = x;
      }
    }
  }
  return count;
}

/** The left child of index n is the index 2n+1. */
int heap_left(int n) {  return 2*n+1; }

/** The right child of index n is the index 2n+2. */
int heap_right(int n) {  return 2*n+2; }

/** The index of the parent of index n, i.e.
 *  n = heap_left(heap_parent(n)) or heap_right(heap_parent(n))
 *  returns -1 if there is no such parent, i.e. if n<=0. */
int heap_parent(int n) {  
  if (n<=0) return -1; else return (n-1)/2; // integer division
}

/** Internal adjustment(s): starting at n, ensures that
 * heap[heap_parent(i)] >= heap[i] for all i 
 * and assumes that the only possible exception is that 
 * heap[heap_parent(n)] < heap[n] is currently possible,
 * so looks at n and ancestors of n only.
 * Runs in time O(log s) where s is the size of the heap. 
 * Returns the number of comparisons made. */
int heap_up(vector<double>& heap, int n) {  
  int count = 0;
  int i,j;
  i = n;
  double x = heap[i];
  while (i>0) {
    j = heap_parent(i);
    double y = heap[j];
    count ++;
    if (x <= y) break;
    heap[i] = y;
    heap[j] = x;
    i = j;
  }
  return count;
}

/** Reorganises an arbitrary vector v to have the heap structure at
 * all indices. Runs in time O(s log s) where s is the size of the
 * heap. Returns the number of comparisons made. */
int heapify(vector<double>& heap) { 
  int count = 0; 
  int s = heap.size();
  for (int i=1; i<s; i++) {
    // ensure that the vector is a heap for indices <= i
    count += heap_up(heap,i);
  }
  return count;
}

/** internal adjustment(s): starting at n, ensures that
 * heap[heap_parent(i)] >= heap[i] for all i and assumes that the only
 * possible exception is that it might be that heap[n] < max(
 * heap[heap_left(n)] , heap[heap_right(n)] ).  So looks at all i
 * equal to n or a descendant of n The parameter k represents the
 * "size".  If k>=0, then only entries heap[0]..heap[k-1] are looked
 * at.  If k=-1 the full vector is used.  Runs in time O(log s) where
 * s is the size of the heap or O(log k) when k>=0. Returns the number
 * of comparisons made. */
int heap_down(vector<double>& heap, int n, int k=-1) {  
  int count = 0; 
  int i,jl,jr,p;
  i = n;
  int s = heap.size();
  if (k>=0 && k<s) s=k;
  while (i<s) {
    p = i;
    jl = heap_left(i);
    jr = heap_right(i);
    if (jl<s) p=jl;
    if (jr<s) {
      count ++;
      if (heap[jr]>heap[jl]) p=jr;
    }
    // so now p points to the larger of the two children
    // or p==i when there are no such children
    if (p==i) break;
    count ++;
    if (heap[p]<=heap[i]) break;
    // swap heap[i] and heap[p]
    double x = heap[i];
    heap[i] = heap[p];
    heap[p] = x;
    // continue with new i
    i = p;
  }
  return count;
}
  
/** Push v to heap. Runs in time O(log s) where s is the size of the heap.
 *  Returns the number of comparisons. */
int heap_push(vector<double>& heap, double v) {  
  int n = heap.size();
  heap.push_back(v); // place new entry at the end, index n
  return heap_up(heap,n);   // adjust
}

/** Removes the maximum value from the whole heap, and preserves heap
 * property.  The maximum value is placed in x and "true" is returned
 * on success. Runs in time O(log s) where s is the size of the heap.
 * Returns the number of comparisons or -1 on error. */
int heap_pop(vector<double>& heap, double& x) {  
  int s = heap.size();
  if (s<=0) {
    std::cerr << "Error: empty heap\n";
    return -1;
  }
  x = heap[0];           // get first (largest) entry
  heap[0] = heap[s-1];   // copy last entry to top
  heap.pop_back();       // .. remove it ..
  return heap_down(heap,0);     // .. and adjust
}

/** Gets the maximum value from the portion of the heap from indices 0
 * .. k-1. This value is placed at the position with index k-1, and
 * the heap property for the portion of the heap from index 0 .. k-2
 * is maintained. This slightly complicated version of heap_pop is
 * provided for in-place sorting.  For speed, there is no error
 * checking here. Runs in time O(log s) where s is the size of the
 * heap. Returns the number of comparisons made. */
int heap_getmax(vector<double>& heap, int k) { 
  int count = 0; 
  double x = heap[0];    // first (largest) entry
  heap[0] = heap[k-1];   // copy last entry to top
  heap[k-1] = x;         // store greatest entry
  count = heap_down(heap,0,k-1); // and adjust
  return count;
}

/** Sorts vector vec into decreasing order, using the heapsort method
 * in n log n steps, where n is the size of the vector.  Runs in time
 * O(s log s) where s is the size of the heap. Returns the number of
 * comparisons made. */
int heapsort(vector<double>& vec) {  
  int count = 0; 
  count += heapify(vec); // taking O(s log s) steps
  int s = vec.size();
  int k = s;
  while (k>0) { // loop taking another O(s log s) steps
    count += heap_getmax(vec,k);
    k--;
  } 
  // the above sorts into reverse (increasing) order: switch to decreasing
  // order in time O(s).  Note that the sorted vector is then also a heap.
  int i = 0;
  while (true) { 
    int j = s-1-i;
    if (j<=i) break;
    double x = vec[j];
    vec[j] = vec[i];
    vec[i] = x;
    i++;
  }
  return count;
}

/** creates a random vector and sorts it by heapsort. */
int main () {
  int count = 0;

  vector<double> data1;
  vector<double> data2;
  int nentries; 
  cout << "Number of entries? " << endl;
  cin >> nentries;

  /* initialize random seed: */
  srand(time(NULL));

  /* fill the vector data */
  for (int n = 0 ; n<nentries ; n++) {
    double p = double(rand())/RAND_MAX; 
    data1.push_back(p);
    data2.push_back(p);
  }
  // vector_print(data1);
  cout << "Sorting" << endl;

  count = selectionsort(data1);
  // vector_print(data1);
  cout << "Selectionsort complete with " << count << " comparisons" << endl;

  count = heapsort(data2);
  // vector_print(data2);
  cout << "Heapsort complete with " << count << " comparisons" << endl;
  
  return 0;
}

4. Summary

Some sorting algorithms are "clever" and significantly better than others. For example, heapsort requires only O ( n log n ) comparisons whereas selectionsort requires O ( n 2 ) comparisons, and this difference is significant in practice.

Be careful that you analyse the worst cases as well as the expected performance for random data. The most popular method for sorting, quicksort is O ( n log n ) most of the time (and most of the time out performs even heapsort) but is O ( n 2 ) in its worst case.