The bisection method: roots of a cubic

This web page explains the bisection method for the problem of finding roots of a cubic. The equation to be solved is $X^{3} + a X^{2} + b X + c = 0$ . (Clearly we may assume the leading coefficient is $1$ since if $X$ is a root of $d X^{3} + a X^{2} + b X + c = 0$ with $d \neq 0$ then it is also a root of $X_{3} + \frac{a}{d} X^{2} + \frac{b}{d} X + \frac{c}{d} = 0$ .) In fact, we could avoid another coefficient too, but for the sake of generality do not do so here.

Exercise.

By setting $Y = X + λ$ for a suitable choice of $λ$ show that an equation of the form $X^{3} + a X^{2} + b X + c = 0$ can be reduced to one of the form $Y^{3} + b' Y + c' = 0$ .

Although there is a "formula" for cubics (and quartics, but not quintics and higher), it is complicated and not at all obvious whether it is worth using it for problems of this type.

The method of using the "formula" for the cubic does not generalise to other equations.
The method of using the "formula" may have other accuracy problems similar to that for the quadratic.
The simplicity of the bisection method means it is quick and easy to program and easy to debug and check. It also comes with built-in error estimates.

1. Bisection

Bisection method.

Let $p (X) = X^{3} + a X^{2} + b X + c = 0$ . Suppose we have values $x_{low}, x_{high} \in ℝ$ such that $p (x_{low})$ and $p (x_{high})$ have different signs (one is positive and the other is negative). Then by continuity of $p$ there is a root $x \in [x_{low}, x_{high}]$ (possibly more than one). We shall find one by a computer algorithm.

Set $x_{mid} = (x_{low} + x_{high}) / 2$ , the mid-point. Then either $p (x_{mid}) < 0$ or $p (x_{mid}) > 0$ . (Or this equals zero in which case we have a root.) So in either case we can replace one of $x_{low}$ or $x_{high}$ with $x_{mid}$ and ensure that $p (x_{low})$ and $p (x_{high})$ still have different signs. In either case the length of the interval $[x_{low}, x_{high}]$ in which a root is to be found has halved, so we know a root to a better accuracy than before. This process can be iterated until we have an answer to sufficient accuracy.

It remains to find the initial values $x_{low}, x_{high} \in ℝ$ . For many problems this is not easy, but for a cubic with leading coefficient $1$ , as we have here, suitable values are easy to find.

The method I used for $x_{high}$ goes like this.

Start with $x_{high} = 1$ and if necessary increase this value $x_{high}$ . Whatever value we take will be at least $1$ .
To arrange that $p (x) ⩾ 0$ it suffices that $\frac{1}{3} x^{3} ⩾ - a x^{2}$ , $\frac{1}{3} x^{3} ⩾ - b x$ , $\frac{1}{3} x^{3} ⩾ - c$ , and (simplifying to avoid calculating square roots and cube roots, for $x ⩾ 1$ ) it suffices that all three of $\frac{1}{3} x ⩾ - a$ , $\frac{1}{3} x ⩾ - b$ , $\frac{1}{3} x ⩾ - c$ . So we take $x_{high}$ to be the greatest of $1, -3 a, -3 b, -3 c$ .
Similarly we take $x_{low}$ to be the least of $-1, -3 a, -3 b, -3 c$ .

// cubicroot.cpp (click here to download)

// solves cubics using bisection method

#include <iostream>
#include <cmath>

using namespace std;

double a,b,c; // coefficients, accessible from main and p()

double p(double x) { 
  return c + x*(b + x*(a + x));
}

int main() {
  cout.precision(16); // maximum precision for double, for illustration only

  cout << "This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c." << endl;
  cout << "Input the values for a, b, c." << endl;
  cin >> a >> b >> c;
  cout << "The polynomial is p(X) = X^3 + " << a << "X^2 + " << b << "X + " << c << endl;

  // bracket the roots, here using properties of the cubic
  double xlow = -1.0;
  if (xlow>=-3.0*a) xlow = -3.0*a;
  if (xlow>=-3.0*b) xlow = -3.0*b;
  if (xlow>=-3.0*c) xlow = -3.0*c;
  double xhigh = 1.0;
  if (xhigh<=-3.0*a) xhigh = -3.0*a;
  if (xhigh<=-3.0*b) xhigh = -3.0*b;
  if (xhigh<=-3.0*c) xhigh = -3.0*c;
  if (p(xlow) > 0.0 || p(xhigh) < 0.0) { // double check for safety
    cout << "Oops! bound estimate doesn't work (this shouldn't happen)" << endl;
    return 1;
  }

  double rel_error_required = 1.0E-10;
  double xmid, xerr, xrelerr;
  int n = 0; // number of iterations, a failsafe to stop program hanging
  while (true) {
    // compute midpoint and see how accurate it is
    xmid = (xlow + xhigh) / 2.0;
    xerr = xhigh - xmid;
    xrelerr = abs ( xerr / xmid ); 
    n++;
    if (xrelerr < rel_error_required || n>100) break;
    cout << "Approximation: " << xmid << " accurate to pm " << xerr <<endl;
    // now replace one of xlow, xhigh
    double y = p(xmid);
    if ((y<=0.0 && p(xlow)>=0.0)||(y>=0.0 && p(xlow)<=0.0)) {
      xhigh = xmid; 
    } else {
      xlow = xmid; 
    }
  }

  cout << "A root is " << xmid << " to plus or minus " << xerr << " using " << n << " iterations" << endl;
  return 0;
}

Two programming points are worth mentioning here.

Notice the way that the coefficients a,b,c are declared outside the main function and outside p. This way the values of these variables are available to both functions. This is probably OK in this case as the program is short, but can problematic in longer code, because global variables (as they are called) are easy to forget about and difficult to manage well. One alternative is to use namespaces, a C++ feature we don't discuss here. There are other more advanced and more powerful alternatives too, such as using classes or structures in C++, or passing functions as arguments.
Note too that the code for the function p evaluates the polynomial in an "unusual" order. Nevertheless, the code for p is faster. Multiplications are considerably slower than additions, and this code reduces the number of mutiplications to two instead of five for x*x*x+ a*x*x + b*x + c.

2. Criticism

Criticising the code, there are a number of points to be made.

Convergence is "steady" but not fast. (It is so-called linear convergence, or convergence of order $1$ , meaning each iteration multiplies the existing error by a constant, in this case, 0.5.) Many people would prefer convergence to be faster. We will look at a possible method later.

There is an issue concerning what might be meant by "relative error" in a root if that root is very close to zero.

But most importantly,

There might be some concern about the evaluation of p and whether the sign of the answer is correct. Note that there are a number of possible places in the calculation of p where cancellation of digits occurs.

To illustrate,

This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c.
Input the values for a, b, c.
-3 3 -1
The polynomial is p(X) = X^3 + -3X^2 + 3X + -1
Approximation: 0 accurate to pm 9
Approximation: 4.5 accurate to pm 4.5
Approximation: 2.25 accurate to pm 2.25
Approximation: 1.125 accurate to pm 1.125
Approximation: 0.5625 accurate to pm 0.5625
Approximation: 0.84375 accurate to pm 0.28125
Approximation: 0.984375 accurate to pm 0.140625
Approximation: 1.0546875 accurate to pm 0.0703125
Approximation: 1.01953125 accurate to pm 0.03515625
Approximation: 1.001953125 accurate to pm 0.017578125
Approximation: 0.9931640625 accurate to pm 0.0087890625
Approximation: 0.99755859375 accurate to pm 0.00439453125
Approximation: 0.999755859375 accurate to pm 0.002197265625
Approximation: 1.0008544921875 accurate to pm 0.0010986328125
Approximation: 1.00030517578125 accurate to pm 0.00054931640625
Approximation: 1.000030517578125 accurate to pm 0.000274658203125
Approximation: 0.9998931884765625 accurate to pm 0.0001373291015625
Approximation: 0.9999618530273438 accurate to pm 6.866455078125e-05
Approximation: 0.9999961853027344 accurate to pm 3.4332275390625e-05
Approximation: 0.9999790191650391 accurate to pm 1.71661376953125e-05
Approximation: 0.9999876022338867 accurate to pm 8.58306884765625e-06
Approximation: 0.9999918937683105 accurate to pm 4.291534423828125e-06
Approximation: 0.9999940395355225 accurate to pm 2.145767211914062e-06
Approximation: 0.9999951124191284 accurate to pm 1.072883605957031e-06
Approximation: 0.9999956488609314 accurate to pm 5.364418029785156e-07
Approximation: 0.9999959170818329 accurate to pm 2.682209014892578e-07
Approximation: 0.9999960511922836 accurate to pm 1.341104507446289e-07
Approximation: 0.999996118247509 accurate to pm 6.705522537231445e-08
Approximation: 0.9999961517751217 accurate to pm 3.352761268615723e-08
Approximation: 0.999996168538928 accurate to pm 1.676380634307861e-08
Approximation: 0.9999961769208312 accurate to pm 8.381903171539307e-09
Approximation: 0.9999961811117828 accurate to pm 4.190951585769653e-09
Approximation: 0.9999961832072586 accurate to pm 2.095475792884827e-09
Approximation: 0.9999961842549965 accurate to pm 1.047737896442413e-09
Approximation: 0.9999961847788654 accurate to pm 5.238689482212067e-10
Approximation: 0.9999961850407999 accurate to pm 2.619344741106033e-10
Approximation: 0.9999961851717671 accurate to pm 1.309672370553017e-10
A root is 0.9999961851062835 to plus or minus 6.548361852765083e-11 using 38 iterations

We can easily see that the only root to $p (X) = X^{3} - 3 X^{2} + 3 X - 1$ is $1$ , and $0.9999961851062835$ is certainly not equal to $1$ within an error of $6.548361852765083 \times 10^{-11}$ . Although the root is approximately correct, the error estimate that seemed guaranteed by the method has not worked.

On investigation, the problem is that the derivative of the polynomial is very small (actually zero, but it only needs be very small to give problems) at the root. What this means is that a comparitively large error in $x$ near the root will result in a tiny error in $p (x)$ . This means that for $x$ close to the root $p (x)$ will be calculated very close to 0 within machine accuracy. So the decision as to whether $p (x)$ is positive, negative or zero may be incorrect.

It is difficult to know what to do in such situations. Continued halving of the interval clearly gives the impression of accuracy that is not warranted. It is probably better to stop the algorithm immediately when a "root" $x$ with $p (x) \approx 0$ (as accurately as the machine can compute) is found and report the accuracy estimate to be the length of the interval at that point in the computation (and not as the machine accuracy).

3. Code for the bisection method

Using call by reference, the bisection method can be coded nicely as a function.

// bisection.cpp (click here to download)

// A function for the bisection method
#include <iostream>

/* function p(x), where the equation to solve is p(x) = 0, 
 * declared here but defined elsewhere. */
double p(double x); 

/* Given the function p(x) as above, this function
 * performs one step of the bisection method.  Specifically
 * given xlow, xhigh with p(xlow) and p(xhigh) having different signs
 * it returns the midpoint and resets one of xlow or xhigh
 * to that midpoint so the bisection method can continue.
 * 
 * Note xlow, xhigh are call-by-reference.
 */
double bisection_step(double& xlow, double& xhigh) {
  double xmid = (xlow + xhigh) / 2.0;
  double ylow = p(xlow);
  double yhigh = p(xhigh);  
  double ymid = p(xmid);
  if ((ymid<=0.0 && yhigh>=0.0)||(ymid>=0.0 && yhigh<=0.0)) {
    xlow = xmid; 
  } else {
    if ((ymid<0.0 && ylow<0.0)||(ymid>0.0 && ylow>0.0)) {
      std::cerr << "Warning in bisection_step: bad xlow, xhigh" << std::endl;
    }
    xhigh= xmid; 
  }
  return xmid;
}

/* performs n steps of the bisection method on p starting with xvalues
 * xlow, xhigh. Note that the values xlow xhigh are protected by
 * using call-by-value here. The final error estimate is returned in the
 * call-by-reference value xerr.  */
double bisection_search(double xlow, double xhigh, int n, double& xerr) {
  double xmid = (xlow + xhigh) / 2.0;
  for (int i=0; i<n; i++) {
    xmid = bisection_step(xlow,xhigh);
  }
  xerr = xhigh - xmid;
  if (xerr<0) xerr = -xerr;
  return xmid;
}