The Newton-Raphson method: roots of a cubic

This web page explains the Newton-Raphson method (also called Newton's method) for the same problem of finding roots of a cubic. The equation to be solved is X^3 + aX^2 + bX + c = 0.

1. Newton-Raphson method

Newton-Raphson.

Let f(X) be a continuously differentiable function of X. Suppose we have a value x_n which is an approximation to a root x of f(X) = 0. Then x_{n+1} = x_n - f(x_n)/f'(x_n) is usually a better approximation, where f'(X) is the derivative of f(X).

Proof.

We approximate the function y = f(x) by a straight line through (x_n, f(x_n)) of gradient f'(x_n). This straight line crosses the x-axis at x = x_n - f(x_n)/f'(x_n).

More specifically, the equation of the straight line is y - f(x_n) = f'(x_n)(x - x_n); setting y = 0 and solving for x gives the value x_n - f(x_n)/f'(x_n).

When it works, the Newton-Raphson method has a higher order of convergence. (It is usually of order 2, meaning the error in x_{n+1} is proportional to the square of the error in x_n.) For good starting values, the resulting convergence can be very fast.

Reason for quadratic convergence of the Newton-Raphson method.

We assume we are finding a root x of an equation f(X) = 0. Assume the derivatives of f exist, and that f' and f'' are sufficiently smooth to be approximated by power series up to a fixed number of terms. We suppose the Newton-Raphson approximations to x are x_n, where x_{n+1} = x_n - f(x_n)/f'(x_n). Let the errors be e_n = x_n - x. So subtracting x from the previous equation we have

e_{n+1} = e_n - f(x_n)/f'(x_n)

We also assume that f'(x) ≠ 0. Note also that f(x) = 0 since x is a root.

Provided the method gives errors that are sufficiently small, the values f(x_n) and f'(x_n) can be approximated by

f(x_n) = f(x + e_n) = e_n f'(x) + (1/2) e_n^2 f''(x) + ...
f'(x_n) = f'(x + e_n) = f'(x) + e_n f''(x) + ...

Inserting these estimates in the equation for e_{n+1}, and simplifying (putting everything over a common denominator!), we have

e_{n+1} = e_n - f(x_n)/f'(x_n) = ((1/2) e_n^2 f''(x)/f'(x)) / (1 + e_n f''(x)/f'(x)) = (1/2) e_n^2 (f''(x)/f'(x)) (1 - e_n f''(x)/f'(x) + ...)

So, to leading order in e_n, e_{n+1} = λ e_n^2 where λ = (1/2) f''(x)/f'(x).

N.B.

Note that f'(x) ≠ 0 was essential for the above argument. If f'(x) = 0, i.e. the curve y = f(x) touches the x-axis at the root, then Newton-Raphson may not converge so fast, and indeed may not converge at all.

One of the main disadvantages of the method is that for some functions it can be difficult to get an expression for the derivative. However, in the case of polynomials the derivative is easy to compute.

Here is the method implemented in C++ for our problem of cubics.

// cubicrootnr.cpp

// solves cubics using Newton-Raphson
#include <iostream>
#include <cmath>

using namespace std;

double a,b,c; // coefficients, accessible from main and p()

double p(double x) { 
  return c + x*(b + x*(a + x));
}

double pprime(double x) { 
  return b + x*(2*a + 3*x);
}

#include "bisection.cpp"

int main() {
  cout.precision(16); // maximum precision for double, for illustration only

  cout << "This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c." << endl;
  cout << "Input the values for a, b, c." << endl;
  cin >> a >> b >> c;
  cout << "The polynomial is p(X) = X^3 + " << a << "X^2 + " << b << "X + " << c << endl;

  // bracket the roots, as before
  double xlow = -1.0;
  if (xlow>=-3.0*a) xlow = -3.0*a;
  if (xlow>=-3.0*b) xlow = -3.0*b;
  if (xlow>=-3.0*c) xlow = -3.0*c;
  double xhigh = 1.0;
  if (xhigh<=-3.0*a) xhigh = -3.0*a;
  if (xhigh<=-3.0*b) xhigh = -3.0*b;
  if (xhigh<=-3.0*c) xhigh = -3.0*c;
  if (p(xlow) > 0.0 || p(xhigh) < 0.0) {
    cout << "Oops! bound estimate doesn't work (this shouldn't happen)" << endl;
    return 1;
  }
  double x0 = bisection_step(xlow,xhigh); // improve initial guess
  cout << "Initial guess: " << x0 <<endl;

  double rel_error_required = 1.0E-10;
  double xerr, xrelerr, xnew;
  int n = 0;
  while (true) {
    xnew = x0 - p(x0) / pprime(x0);
    xerr = abs(xnew - x0);
    xrelerr = abs(xerr / xnew);
    n++;
    if (xrelerr < rel_error_required || n>100) break;
    cout << "Approximation: " << xnew << ", estimated accuracy pm " << xerr <<endl;
    x0 = xnew;
  }

  cout << "A root is " << xnew << " estimated to plus or minus " << xerr 
       << " using " << n << " iterations" << endl;
  return 0;
}

The new program works well in many cases.

Example.

This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c.
Input the values for a, b, c.
1 1000 1000000
The polynomial is p(X) = X^3 + 1X^2 + 1000X + 1000000
Initial guess: -1499999.5
Approximation: -999999.777629827, estimated accuracy pm 499999.722370173
Approximation: -666666.6293091808, estimated accuracy pm 333333.1483206463
Approximation: -444444.5303180923, estimated accuracy pm 222222.0989910885
Approximation: -296296.4641583593, estimated accuracy pm 148148.066159733
Approximation: -197531.0864707292, estimated accuracy pm 98765.37768763013
Approximation: -131687.5009755118, estimated accuracy pm 65843.58549521744
Approximation: -87791.77676039582, estimated accuracy pm 43895.72421511594
Approximation: -58527.95979753511, estimated accuracy pm 29263.81696286071
Approximation: -39018.74727780055, estimated accuracy pm 19509.21251973456
Approximation: -26012.60382176877, estimated accuracy pm 13006.14345603178
Approximation: -17341.83894463253, estimated accuracy pm 8670.764877136236
Approximation: -11561.32537206546, estimated accuracy pm 5780.51357256707
Approximation: -7707.644637001054, estimated accuracy pm 3853.680735064407
Approximation: -5138.517655753345, estimated accuracy pm 2569.126981247709
Approximation: -3425.758935516163, estimated accuracy pm 1712.758720237182
Approximation: -2283.913948897836, estimated accuracy pm 1141.844986618326
Approximation: -1522.687031948788, estimated accuracy pm 761.2269169490482
Approximation: -1015.233656697703, estimated accuracy pm 507.4533752510847
Approximation: -677.0381380333156, estimated accuracy pm 338.1955186643877
Approximation: -451.8689718034016, estimated accuracy pm 225.169166229914
Approximation: -302.4976129063189, estimated accuracy pm 149.3713588970826
Approximation: -204.6800193203966, estimated accuracy pm 97.81759358592234
Approximation: -143.4026993015503, estimated accuracy pm 61.2773200188463
Approximation: -110.2037624937084, estimated accuracy pm 33.19893680784187
Approximation: -98.47519710028186, estimated accuracy pm 11.72856539342655
Approximation: -97.01249435563255, estimated accuracy pm 1.462702744649306
Approximation: -96.99091072735219, estimated accuracy pm 0.02158362828036786
Approximation: -96.99090607301694, estimated accuracy pm 4.654335242548768e-06
A root is -96.99090607301673 estimated to plus or minus 2.131628207280301e-13 using 29 iterations

Note that convergence is comparatively slow until the very end. (The estimated error decreases more slowly than the characteristic factor of 2 per step of the bisection method, except at the very end.)

Unfortunately, the program does not work so well on all inputs.

Example.

This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c.
Input the values for a, b, c.
-0.7168 -0.486198 -0.651493
The polynomial is p(X) = X^3 + -0.7168X^2 + -0.486198X + -0.651493
Initial guess: 0.5751999999999999
Approximation: -2.497966140430039, estimated accuracy pm 3.073166140430039
Approximation: -1.604216834255157, estimated accuracy pm 0.8937493061748825
Approximation: -0.9911894929122602, estimated accuracy pm 0.6130273413428964
Approximation: -0.5152654662310612, estimated accuracy pm 0.475924026681199
Approximation: 0.1788202694675227, estimated accuracy pm 0.6940857356985839
Approximation: -0.9897677855731934, estimated accuracy pm 1.168588055040716
Approximation: -0.5139782516163409, estimated accuracy pm 0.4757895339568525
Approximation: 0.1826890651563653, estimated accuracy pm 0.6966673167727062
Approximation: -0.9873282663698353, estimated accuracy pm 1.170017331526201
Approximation: -0.5117656950143084, estimated accuracy pm 0.4755625713555269
Approximation: 0.1894081467669582, estimated accuracy pm 0.7011738417812665
Approximation: -0.9834804205142984, estimated accuracy pm 1.172888567281257
Approximation: -0.5082659292432343, estimated accuracy pm 0.4752144912710641
Approximation: 0.2002192194043703, estimated accuracy pm 0.7084851486476046
Approximation: -0.978317067502625, estimated accuracy pm 1.178536286906995
Approximation: -0.5035502843648257, estimated accuracy pm 0.4747667831377993
Approximation: 0.2151545749727722, estimated accuracy pm 0.718704859337598
Approximation: -0.9732555052469228, estimated accuracy pm 1.188410080219695
Approximation: -0.4989056060832063, estimated accuracy pm 0.4743498991637166
Approximation: 0.2302985397070159, estimated accuracy pm 0.7292041457902222
Approximation: -0.9705784423359909, estimated accuracy pm 1.200876982043007
Approximation: -0.4964400594662029, estimated accuracy pm 0.474138382869788
Approximation: 0.2385200523274133, estimated accuracy pm 0.7349601117936162
Approximation: -0.9701698499809128, estimated accuracy pm 1.208689902308326
Approximation: -0.4960631975173715, estimated accuracy pm 0.4741066524635413
Approximation: 0.2397881966326724, estimated accuracy pm 0.735851394150044
Approximation: -0.9701729416336435, estimated accuracy pm 1.209961138266316
Approximation: -0.4960660496310823, estimated accuracy pm 0.4741068920025612
Approximation: 0.239778587689879, estimated accuracy pm 0.7358446373209613
Approximation: -0.9701728514403916, estimated accuracy pm 1.209951439130271
Approximation: -0.4960659664260577, estimated accuracy pm 0.4741068850143339
Approximation: 0.2397788680101177, estimated accuracy pm 0.7358448344361753
Approximation: -0.9701728540571382, estimated accuracy pm 1.209951722067256
Approximation: -0.4960659688400572, estimated accuracy pm 0.474106885217081
Approximation: 0.2397788598772784, estimated accuracy pm 0.7358448287173356
Approximation: -0.970172853981207, estimated accuracy pm 1.209951713858485
Approximation: -0.4960659687700093, estimated accuracy pm 0.4741068852111977
Approximation: 0.239778860113272, estimated accuracy pm 0.7358448288832813
Approximation: -0.9701728539834105, estimated accuracy pm 1.209951714096682
Approximation: -0.4960659687720418, estimated accuracy pm 0.4741068852113686
Approximation: 0.2397788601064243, estimated accuracy pm 0.7358448288784661
Approximation: -0.9701728539833463, estimated accuracy pm 1.209951714089771
Approximation: -0.4960659687719829, estimated accuracy pm 0.4741068852113634
Approximation: 0.2397788601066228, estimated accuracy pm 0.7358448288786057
Approximation: -0.9701728539833483, estimated accuracy pm 1.209951714089971
Approximation: -0.4960659687719846, estimated accuracy pm 0.4741068852113637
Approximation: 0.2397788601066171, estimated accuracy pm 0.7358448288786017
Approximation: -0.9701728539833483, estimated accuracy pm 1.209951714089965
Approximation: -0.4960659687719846, estimated accuracy pm 0.4741068852113637
Approximation: 0.2397788601066171, estimated accuracy pm 0.7358448288786017
... (the same cycle of three values repeats) ...
Approximation: -0.4960659687719846, estimated accuracy pm 0.4741068852113637
Approximation: 0.2397788601066171, estimated accuracy pm 0.7358448288786017
Approximation: -0.9701728539833483, estimated accuracy pm 1.209951714089965
Approximation: -0.4960659687719846, estimated accuracy pm 0.4741068852113637
A root is 0.2397788601066171 estimated to plus or minus 0.7358448288786017 using 101 iterations

Even worse,

Example.

This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c.
Input the values for a, b, c.
1 0 -1
The polynomial is p(X) = X^3 + 1X^2 + 0X + -1
Initial guess: 0
Approximation: inf, estimated accuracy pm inf
Approximation: -nan, estimated accuracy pm nan
Approximation: -nan, estimated accuracy pm nan
...

Try and see if you can work out why this happens.

2. Criticism and improvements

Experimentation with cubicroot and cubicrootnr will show that cubicrootnr vastly outperforms the bisection method in some circumstances, notably when the initial guess of the root is good enough, but the convergence rate is often poor when the guess is not good, and in some circumstances Newton-Raphson does not converge at all.

This situation is typical. In some cases, Newton-Raphson performs well, in other cases it is poor and in still other cases it is worse than useless. On the other hand, the bisection method carries with it some guarantees and error estimates, so is not to be sneezed at. The best algorithm, it seems to me, is to try Newton-Raphson to obtain fast convergence, but fall back to the bisection method in all cases where Newton-Raphson might not be trustworthy.

The resulting improved code follows. It contains a revised version of the bisection function, and at each step it tries both methods and chooses between them: the Newton-Raphson step is rejected in favour of bisection if it falls outside the bracketing interval or gives a larger value of |p(x)|:

// cubicrootnrb.cpp

// solves cubics using Newton-Raphson and Bisection
#include <iostream>
#include <cmath>

using namespace std;

double a,b,c; // coefficients, accessible from main and p()

double p(double x) { 
  return c + x*(b + x*(a + x));
}

double pprime(double x) { 
  return b + x*(2*a + 3*x);
}

#include "bisection.cpp"

int main() {
  cout.precision(16); // maximum precision for double, for illustration only

  cout << "This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c." << endl;
  cout << "Input the values for a, b, c." << endl;
  cin >> a >> b >> c;
  cout << "The polynomial is p(X) = X^3 + " << a << "X^2 + " << b << "X + " << c << endl;

  // bracket the roots, as before
  double xlow = -1.0;
  if (xlow>=-3.0*a) xlow = -3.0*a;
  if (xlow>=-3.0*b) xlow = -3.0*b;
  if (xlow>=-3.0*c) xlow = -3.0*c;
  double xhigh = 1.0;
  if (xhigh<=-3.0*a) xhigh = -3.0*a;
  if (xhigh<=-3.0*b) xhigh = -3.0*b;
  if (xhigh<=-3.0*c) xhigh = -3.0*c;
  double yhigh = p(xhigh);
  double ylow = p(xlow);
  if (ylow > 0.0 || yhigh < 0.0) { // double check for safety
    cout << "Oops! bound estimate doesn't work (this shouldn't happen)" << endl;
    return 1;
  }

  double x0 = bisection_step(xlow,xhigh); // initial guess, updates xlow,xhigh

  double rel_error_required = 1.0E-10;
  double xerr, xrelerr, xnew;
  int n = 0;
  while (true) {
    bool bisect = false;
    double xnr,xbi;
    xnr = x0 - p(x0) / pprime(x0);
    xbi = bisection_step(xlow,xhigh); // do one step only, updates xlow,xhigh
    if (xnr <= xlow || xnr >= xhigh || abs(p(xbi))<abs(p(xnr))) {
      xnew = xbi;
      bisect = true;
    }
    else {
      xnew = xnr;
      bisect = false;
    }
    xerr = abs(xnew - x0);
    xrelerr = abs(xerr / xnew);
    n++;
    if (xrelerr < rel_error_required || n>100) break;
    if (bisect) cout << "Bisection:      "; else cout << "Newton-Raphson: ";
    cout << xnew << ", estimated accuracy pm " << xerr << endl;
    x0 = xnew;
  }

  cout << "A root is " << xnew << " estimated to plus or minus " << xerr 
       << " using " << n << " iterations" << endl;
  return 0;
}

Here is an example run. It shows (as is typical) that the bisection method is to be preferred until the estimate for the root is good enough. What "good enough" means is not obvious, but as usual it is probably best to get the computer to do the work to decide, rather than make an arbitrary guess (such as "run bisection 10 times first") which will certainly fail for some data.

Example.

This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c.
Input the values for a, b, c.
1 1000 1000000
The polynomial is p(X) = X^3 + 1X^2 + 1000X + 1000000
Bisection:      -749999.25, estimated accuracy pm 750000.25
Bisection:      -374999.125, estimated accuracy pm 375000.125
Bisection:      -187499.0625, estimated accuracy pm 187500.0625
Bisection:      -93749.03125, estimated accuracy pm 93750.03125
Bisection:      -46874.015625, estimated accuracy pm 46875.015625
Bisection:      -23436.5078125, estimated accuracy pm 23437.5078125
Bisection:      -11717.75390625, estimated accuracy pm 11718.75390625
Bisection:      -5858.376953125, estimated accuracy pm 5859.376953125
Bisection:      -2928.6884765625, estimated accuracy pm 2929.6884765625
Bisection:      -1463.84423828125, estimated accuracy pm 1464.84423828125
Bisection:      -731.422119140625, estimated accuracy pm 732.422119140625
Bisection:      -365.2110595703125, estimated accuracy pm 366.2110595703125
Bisection:      -182.1055297851562, estimated accuracy pm 183.1055297851562
Bisection:      -90.55276489257812, estimated accuracy pm 91.55276489257812
Newton-Raphson: -97.44276759362613, estimated accuracy pm 6.890002701048004
Newton-Raphson: -96.9929337280189, estimated accuracy pm 0.449833865607232
Newton-Raphson: -96.99090611408636, estimated accuracy pm 0.002027613932540362
Newton-Raphson: -96.99090607301673, estimated accuracy pm 4.106962592231866e-08
A root is -96.99090607301673 estimated to plus or minus 0 using 19 iterations

The following data was a problem for straight Newton-Raphson.

Example.

This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c.
Input the values for a, b, c.
-0.7168 -0.486198 -0.651493
The polynomial is p(X) = X^3 + -0.7168X^2 + -0.486198X + -0.651493
Bisection:      1.3628, estimated accuracy pm 0.7876000000000001
Newton-Raphson: 1.399302448524656, estimated accuracy pm 0.03650244852465612
Newton-Raphson: 1.397959697923313, estimated accuracy pm 0.001342750601343035
Newton-Raphson: 1.397957837632798, estimated accuracy pm 1.860290515409702e-06
A root is 1.39795783762923 estimated to plus or minus 3.568034756540328e-12 using 5 iterations

Note that the new algorithm uses the bisection method to avoid the regions of instability for Newton-Raphson.

Here was the other tricky case,

Example.

This program solves the equation p(X)=0 where p(X) = X^3 + aX^2 + bX + c.
Input the values for a, b, c.
1 0 -1
The polynomial is p(X) = X^3 + 1X^2 + 0X + -1
Bisection:      1.5, estimated accuracy pm 1.5
Bisection:      0.75, estimated accuracy pm 0.75
Newton-Raphson: 0.7549019607843137, estimated accuracy pm 0.004901960784313708
Newton-Raphson: 0.7548776668452124, estimated accuracy pm 2.429393910130528e-05
Newton-Raphson: 0.7548776662466927, estimated accuracy pm 5.985196782631874e-10
A root is 0.7548776662466928 estimated to plus or minus 1.110223024625157e-16 using 6 iterations

Exercise.

Determine the behaviour of the two programs on this web page with the example -3 3 -1 that caused problems for the bisection method. Explain.

3. Conclusions