O() notation

Of course, as well as being correct, a good program should be fast. This page sets out the mathematical notation we use to describe how fast a program is. This notation is called "Big O" notation and aims to give an upper bound on the worst-case time. These details will be filled in later. The important thing here is to explain the notation, which rests on a definition similar to the idea of the limit of a sequence that you learnt in first year.

Warning: some mathematicians and/or web pages define this notation in an incorrect way that is incompatible with that here.

1. Big O

We are going to try to compare the behaviour of a function f against a function g. We want to say that "f grows no faster than g up to a constant". The official definition is as follows.

Definition.

Let f, g : ℝ → ℝ. We write "f(x) = O(g(x)) as x → ∞" to mean there is M > 0 and x₀ such that for all x ≥ x₀ we have |f(x)| ≤ M |g(x)|.

In words, "f(x) = O(g(x)) as x → ∞" means that eventually (i.e. from some point x₀ onwards) we have that |f(x)| ≤ |g(x)| up to a multiplicative constant M > 0.

This is the official definition. There are similar definitions of related ideas, but we do not need them. Note that O( ) is not a function. The strange notation f(x) = O(g(x)) is not saying that f is a function of g, or anything like that. It is just a notation, nothing else.

For this module, we need only look at special cases of this. The first thing to say is that we will only need to look at the case of functions f, g from the naturals ℕ to the nonnegative reals ℝ⁺. The fact that our functions are defined on ℕ rather than on all reals makes no difference to the above definition: the definition works in exactly the same way, for just the same reasons, with natural number inputs. The fact that our functions map to nonnegative reals means that the absolute value signs can be omitted.

Secondly, we are only interested in the case when the function g is a nondecreasing function, i.e. g(x) ≤ g(y) whenever x ≤ y. This simplifies quite a few things to do with the definition.

Since the inputs to our functions are natural numbers rather than reals, we will always use the letter n for them instead. Since we will never be interested in any limit other than n → ∞, we omit the phrase "as n → ∞". The definition then becomes:

Definition.

Let f, g : ℕ → ℝ⁺. We write "f(n) is O(g(n))" to mean there is M > 0 and n₀ such that for all n ≥ n₀ we have f(n) ≤ M g(n).
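This definition can be illustrated numerically. The sketch below is evidence, not a proof: for the (illustrative) choices f(n) = 3n + 5 and g(n) = n, the witnesses M = 4 and n₀ = 5 work, because 3n + 5 ≤ 4n exactly when n ≥ 5.

```python
# A numerical sanity check of the definition above, not a proof.
# f, g, M and n0 below are illustrative choices, not from the notes.

def is_witness(f, g, M, n0, n_max=10_000):
    """Check f(n) <= M * g(n) for all n0 <= n <= n_max (finite evidence only)."""
    return all(f(n) <= M * g(n) for n in range(n0, n_max + 1))

f = lambda n: 3 * n + 5
g = lambda n: n

print(is_witness(f, g, M=4, n0=5))   # True: 3n + 5 <= 4n holds once n >= 5
print(is_witness(f, g, M=3, n0=5))   # False: 3n + 5 <= 3n never holds
```

Note that a failed check like the second one only rules out that particular pair (M, n₀); to show f(n) is not O(g(n)) one must argue that no witnesses exist at all.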

Example.

Let f, g : ℕ → ℝ⁺ and suppose the function g is the constant function with value zero. Then f(n) = O(g(n)) holds if and only if the function f is eventually constant with value zero.

Example.

Let f, g : ℕ → ℝ⁺ and suppose the function g is nondecreasing and is not the constant function with value zero. Then f(n) = O(g(n)) holds if and only if there is M > 0 such that for all n, f(n) ≤ M g(n) + M.

Indeed, for the most tricky part of this argument: if f(n) = O(g(n)) then there are M, n₀ such that f(n) ≤ M g(n) for all n ≥ n₀. So f(n) ≤ M g(n) + max{f(0), …, f(n₀ − 1)} for all n, and by replacing M with max{M, f(0), …, f(n₀ − 1)} we get the result we need.
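The construction in this argument can be traced on a concrete (illustrative, not from the notes) pair: with f(n) = n + 10 and g(n) = n, the witnesses M = 2, n₀ = 10 work, and the proof's replacement M′ = max{M, f(0), …, f(n₀ − 1)} then gives the single inequality f(n) ≤ M′ g(n) + M′ for all n.

```python
# Tracing the argument above on an illustrative pair f, g.
f = lambda n: n + 10
g = lambda n: n

M, n0 = 2, 10
# The original witnesses: f(n) <= M * g(n) from n0 onwards.
assert all(f(n) <= M * g(n) for n in range(n0, 1000))

# The proof's replacement constant, covering the finitely many n < n0.
M_prime = max([M] + [f(n) for n in range(n0)])
print(M_prime)                                              # 19 here
print(all(f(n) <= M_prime * g(n) + M_prime for n in range(1000)))
```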

2. Run-time as O(g(n))

Some programs run in a fixed amount of time for each input. Most of this page concerns programs whose running time varies with the input.

It can be very difficult indeed to analyse a program and give a formula for the exact amount of time it uses. We do not try to do this here. Instead we simplify the process considerably. We decide that, as a useful simplification, each program line, doing one step of the program, takes one unit of time. (In practice this is not true, but usually each program line takes some constant amount of time, where the constant depends on how complicated the expression is, and we regard that constant as being essentially 1, at least up to a multiplicative factor.) The question is then how to assign time to loops.
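The convention can be sketched with an explicit step counter. In this illustrative example (not from the notes) the counter charges one unit of time for the innermost line only; counting the loop-control lines as well would change the total by at most a constant factor, which Big O absorbs. The nested loops execute the innermost line n² times, so the run-time is O(n²).

```python
# Counting unit-time steps for a program with nested loops.
def count_steps(n):
    steps = 0
    total = 0
    for i in range(n):        # outer loop body runs n times
        for j in range(n):    # inner loop body runs n times for each i
            total += i * j    # one unit of time for this line
            steps += 1
    return steps

print(count_steps(10))   # 100 = 10^2 executions of the innermost line
```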

In almost all cases where run-time depends on the input there is some "obvious" parameter that it depends on. This parameter is always called n, and is almost always the "length" of the input. (How one interprets "length" may differ in different situations. It is usually the number of entries in a matrix or array, or the number of vertices in a graph. For multi-precision arithmetic, where the number of digits can grow to any size that might fit in memory, n might be the number of digits in the input. In any particular case you will be told exactly what n is.)

In cases where several different inputs can have the same "length" parameter n, we always take the run-time f(n) to be the worst-case time taken over all these possible inputs.
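Linear search is a standard illustration of this worst-case convention (the code below is a sketch, not part of the notes): among all lists of length n, the slowest input is one that does not contain the target, forcing all n comparisons, so we take f(n) = n.

```python
# Worst case over all inputs of the same length, shown on linear search.
def comparisons(xs, target):
    """Linear search; returns how many comparisons it makes."""
    count = 0
    for x in xs:
        count += 1
        if x == target:
            break
    return count

n = 8
best = comparisons(list(range(n)), 0)     # target found immediately: 1 comparison
worst = comparisons(list(range(n)), -1)   # target absent: all n comparisons
print(best, worst)
```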

We aim to describe the run-time of the program using a simple nondecreasing function g(n). We can then describe the run-time as being O(g(n)). Most of the time very simple functions will suffice.

The following functions g(n) are commonly seen in such analyses: 1, n, n², n³, and other powers; 2ⁿ, 2²ⁿ, 2³ⁿ, and so on, also 2^(2ⁿ), 2^(2^(2ⁿ)), and other exponential terms like these; log n, n log n, n² log n and other terms like these involving log; also occasionally n log log n, n log n log log n and other similar terms involving log log.

The list just given contains almost everything seen in practical situations and will certainly suffice for anything you encounter in this module.
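To get a feel for how quickly the rates in this list separate, here is a small illustrative computation (the choice n = 64 is arbitrary, and only a few of the listed functions are sampled):

```python
# Evaluating some of the common growth rates at a single input.
import math

n = 64
rates = {
    "log n": math.log2(n),
    "n": n,
    "n log n": n * math.log2(n),
    "n^2": n ** 2,
    "2^n": 2 ** n,
}
for name, value in rates.items():
    print(f"{name:8} {value:g}")
```

Even at n = 64 the values run from 6 up to about 1.8 × 10¹⁹, which is why identifying the right g(n) matters far more than the constant M.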

Example.

If a program takes no more than 2018 steps whatever its input is, then its run-time is O(1).

In this subject it rarely matters what base logarithms are taken to. (See an exercise later on this page for why.) Because of this it is usually convenient to take logarithms to base 2. The logarithm to base 2 of a positive integer is approximately the number of binary digits required to write that number down.
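The closing remark can be made precise and checked: a positive integer k has exactly ⌊log₂ k⌋ + 1 binary digits, so log₂ k and the digit count differ by less than 1. A quick check on a few (arbitrarily chosen) values:

```python
# Base-2 logarithm vs. number of binary digits of a positive integer.
import math

for k in (1, 2, 5, 1000, 2**20, 2**20 + 3):
    digits = len(bin(k)) - 2              # bin(k) is "0b...", so drop the prefix
    print(k, digits, math.floor(math.log2(k)) + 1)
```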

3. Examples and exercises

The following exercises should help you understand the concepts.

Exercise.

Prove that if a function T(n) is O(1) then T(n) is bounded.

Exercise.

Prove that the function n(n + 1) is O(n²).

Exercise.

Prove that if the function T(n) is O(log n), where log is to base b > 1, then it is O(log n) for logarithms taken to any base greater than 1.