Mascot image.
#Math#Differentiation

Newton’s Method

In Lesson 6AM we solved every “find rr from erT=M/Pe^{r T} = M/P” question by taking ln\ln of both sides, so each unknown reduced cleanly to a closed-form expression in ln\ln. The Picasso problem solved e56r=6914.29e^{56 r} = 6914.29; the doubling-time problem solved e0.05t=2e^{0.05 t} = 2; the matching-rate problem solved er=(1+0.05/12)12e^{r^{\ast}} = (1 + 0.05/12)^{12}. Every equation in that lesson admitted an exact closed-form answer because the operations involved were already invertible in the toolkit we’d built up by Lesson 5PM.

Most equations we meet in practice are not so kind. The bookkeeping equation

xex=1,(1)x \, e^{x} = 1, \tag{1}

which arises whenever the half-life of one process must equal the time constant of another, has no solution expressible using the functions in our current toolkit: +,,×,÷+, -, \times, \div, roots, ln\ln, and e()e^{(\cdot)}. The polynomial equation

x4+2x3x21=0,(2)x^{4} + 2 x^{3} - x^{2} - 1 = 0, \tag{2}

which arises in a structural-engineering load problem, does admit an exact root formula (the quartic formula), but the formula is too unwieldy for routine work. The single positive solution is approximately x0.8445x \approx 0.8445, and the closed form is several lines of nested radicals.

Both equations have the same shape: f(x)=0f(x) = 0 for an explicit, differentiable ff. Our closing application of MA0A turns the entire derivative toolkit toward a numerical method that produces such xx to as many decimals as desired, using only ff and ff'. The key is the linear approximation that the derivative supplies us at each point.

The Tangent-Line Trick

We pick any starting guess x0x_{0} and look at the tangent line to y=f(x)y = f(x) at the point (x0,f(x0))(x_{0}, f(x_{0})). By the point-slope formula from Lesson 1AM, our tangent has equation

y=f(x0)+f(x0)(xx0).(3)y = f(x_{0}) + f'(x_{0}) \, (x - x_{0}). \tag{3}

Setting y=0y = 0 in (3)(3) gives us the xx-intercept of the tangent:

0=f(x0)+f(x0)(xx0),x=x0f(x0)f(x0).(4)0 = f(x_{0}) + f'(x_{0}) \, (x - x_{0}), \qquad x = x_{0} - \frac{f(x_{0})}{f'(x_{0})}. \tag{4}

The right-hand side is defined whenever f(x0)0f'(x_{0}) \neq 0. We call it x1x_{1}.

The tangent line is the best linear approximation to y=f(x)y = f(x) near x0x_{0} that the derivative supplies us. If the curve is close to its tangent over the interval between x0x_{0} and the true root, then x1x_{1} is closer to the root than x0x_{0} was. Repeating the construction with x1x_{1} in place of x0x_{0} produces x2x_{2}, then x3x_{3}, and so on.

Definition 42 (Newton's iteration)

Let ff be a differentiable function and x0x_{0} a starting guess with f(x0)0f'(x_{0}) \neq 0. The Newton iteration for ff from x0x_{0} is the sequence x0,x1,x2,x_{0}, x_{1}, x_{2}, \dots defined by

xn+1=xnf(xn)f(xn),(5)x_{n+1} = x_{n} - \frac{f(x_{n})}{f'(x_{n})}, \tag{5}

provided f(xn)0f'(x_{n}) \neq 0 at every step.

Theorem 32 (Geometric reading of (5)(5))

xn+1x_{n+1} is the xx-intercept of the tangent line to y=f(x)y = f(x) at the point (xn,f(xn))(x_{n}, f(x_{n})).

Proof

The tangent line to y=f(x)y = f(x) at x=xnx = x_{n} has equation

y=f(xn)+f(xn)(xxn)y = f(x_{n}) + f'(x_{n}) \, (x - x_{n})

by point-slope. Setting y=0y = 0 and dividing by f(xn)0f'(x_{n}) \neq 0 gives x=xnf(xn)/f(xn)x = x_{n} - f(x_{n})/f'(x_{n}), which is (5)(5).

The curve y = xe^x - 1 drawn solid in red. A starting point (x_0, f(x_0)) at x_0 = 1 sits above the x-axis at height e - 1 ≈ 1.72. The tangent line at that point, drawn dashed in blue, slopes upward to the right and meets the x-axis at x_1 ≈ 0.684, between the starting point and the true root marked at x ≈ 0.567.

If the linear approximation is good, x1x_{1} has overshot the root by less than x0x_{0} did, and each iterate sits closer to the root than the previous one.

A Worked Example: xex=1x e^{x} = 1

We rewrite equation (1)(1) as f(x)=xex1=0f(x) = x \, e^{x} - 1 = 0. By the product rule applied to xx and exe^{x}, with the derivative ddx(ex)=ex\dfrac{d}{dx}(e^{x}) = e^{x} from Lesson 5AM,

f(x)=1ex+xex=(x+1)ex.(6)f'(x) = 1 \cdot e^{x} + x \cdot e^{x} = (x + 1) \, e^{x}. \tag{6}
Example 162 (Newton's method applied to xex=1x e^{x} = 1)

Apply Newton’s iteration from x0=1x_{0} = 1 and report the first few iterates to seven decimal places.

By (5)(5) with our explicit ff and ff' above,

xn+1=xnxnexn1(xn+1)exn.x_{n+1} = x_{n} - \frac{x_{n} \, e^{x_{n}} - 1}{(x_{n} + 1) \, e^{x_{n}}}.

At x0=1x_{0} = 1: f(1)=e1f(1) = e - 1, f(1)=2ef'(1) = 2 e, so

x1=1e12e=112+12e=12+12e0.6839397.x_{1} = 1 - \frac{e - 1}{2 e} = 1 - \frac{1}{2} + \frac{1}{2 e} = \frac{1}{2} + \frac{1}{2 e} \approx 0.6839397.

Continuing the iteration with a calculator at each step:

nnxnx_{n}
001.00000001.0000000
110.68393970.6839397
220.57745450.5774545
330.56722970.5672297
440.56714330.5671433
550.56714330.5671433

Our iterates have stabilised at x5=x4x_{5} = x_{4} to seven decimal places. The truncated value x0.5671433x \approx 0.5671433 satisfies xex1.0000000x \, e^{x} \approx 1.0000000, with f(0.5671433)2.65×108f(0.5671433) \approx 2.65 \times 10^{-8}.

After only four iterations our digits stop changing. The number of correct decimals roughly doubles at each step: our gap between x0x_{0} and the root has size 0.43\sim 0.43, between x1x_{1} and the root 0.12\sim 0.12, between x2x_{2} and x3x_{3} the digits agree only at the first decimal, and by x4x_{4} the agreement is at the seventh.

This doubling of correct decimals is the typical behaviour of Newton’s method when the starting guess is close enough to the root. A formal proof needs more of the linear-approximation theory than this lesson develops; the observation that each iteration roughly squares the previous error is enough to explain why we prefer Newton’s method to bisection in practice.

A Quartic Example

The polynomial (2)(2) has f(x)=x4+2x3x21f(x) = x^{4} + 2 x^{3} - x^{2} - 1, and by the Power and sum rules from Lesson 2PM

f(x)=4x3+6x22x.f'(x) = 4 x^{3} + 6 x^{2} - 2 x.
Example 163 (Newton's method on a quartic equation)

Apply Newton’s iteration to f(x)=x4+2x3x21=0f(x) = x^{4} + 2 x^{3} - x^{2} - 1 = 0 from the starting guess x0=2x_{0} = 2, in search of the unique positive root near x0.8445x \approx 0.8445.

A direct evaluation gives us f(2)=16+1641=27f(2) = 16 + 16 - 4 - 1 = 27 and f(2)=32+244=52f'(2) = 32 + 24 - 4 = 52, so

x1=227521.4807692.x_{1} = 2 - \frac{27}{52} \approx 1.4807692.

Continuing each step with a calculator:

nnxnx_{n}
002.0000002.000000
111.4807691.480769
221.1309771.130977
330.9300450.930045
440.8548800.854880
550.8446750.844675
660.8444990.844499

Our iterates converge on x0.8445x \approx 0.8445 after six rounds. The first few rounds spend their effort closing the gap between x0=2x_{0} = 2 and the basin of fast convergence near the root; once we are close, the digits double per round, exactly as in the previous example.

Our starting guess x0=2x_{0} = 2 is a poor choice numerically, since f(2)=27f(2) = 27 is far from zero. A starting guess closer to the root, say x0=1x_{0} = 1, would converge in fewer rounds: f(1)=1f(1) = 1, f(1)=8f'(1) = 8, x1=11/8=0.875x_{1} = 1 - 1/8 = 0.875, and a few more rounds deliver the same accuracy.

The graph of y = x^4 + 2x^3 - x^2 - 1, drawn in red, crossing the x-axis at one positive root marked at x ≈ 0.8445 and at one negative root near x ≈ -2.47. The curve dips to a minimum below zero in the middle and climbs steeply upward on both sides.

Our two examples share the same arithmetic: every iteration is one evaluation of ff, one evaluation of ff', and one division. Both use the toolkit we assembled in Lessons 4 and 5; nothing new about the operations themselves.

A Cube Root in Closed Form

The classical demonstration of Newton’s method is the computation of 23\sqrt[3]{2} to many decimals, using only ++, -, ×\times, ÷\div.

Example 164 (The cube root of 22)

Solve x3=2x^{3} = 2 by Newton’s method from x0=1.5x_{0} = 1.5, and report the iterates to seven decimal places.

We set f(x)=x32f(x) = x^{3} - 2 and f(x)=3x2f'(x) = 3 x^{2} from the power rule of Lesson 2PM. Our iteration (5)(5) becomes

xn+1=xnxn323xn2=2xn3+23xn2,x_{n+1} = x_{n} - \frac{x_{n}^{3} - 2}{3 x_{n}^{2}} = \frac{2 x_{n}}{3} + \frac{2}{3 x_{n}^{2}},

where the simplification splits xnxn/3=2xn/3x_{n} - x_{n}/3 = 2 x_{n}/3 from the first term and isolates the constant 2/(3xn2)2/(3 x_{n}^{2}) from the second.

The iterates from x0=1.5x_{0} = 1.5:

x1=1.0+26.751.2962963,x21.2609322,x31.2599219,x41.2599210.x_{1} = 1.0 + \frac{2}{6.75} \approx 1.2962963, \quad x_{2} \approx 1.2609322, \quad x_{3} \approx 1.2599219, \quad x_{4} \approx 1.2599210.

By x4x_{4} our iterates have stabilised at 231.2599210\sqrt[3]{2} \approx 1.2599210, with x4324×1015x_{4}^{3} - 2 \approx 4 \times 10^{-15} on a calculator that carries fifteen digits.

Our closed-form simplification xn+1=(2xn)/3+2/(3xn2)x_{n+1} = (2 x_{n})/3 + 2/(3 x_{n}^{2}) admits the reading “two-thirds of the previous estimate plus one-third of the exact fix”: each iterate is a weighted average that gives more weight to the previous estimate but pulls partly toward the value that would solve the equation if the curve were genuinely linear at xnx_{n}.

Problem 180

Use Newton’s method from x0=2x_{0} = 2 to compute 53\sqrt[3]{5} to six decimal places. Show the simplified iteration formula in the form

xn+1=2xn3+53xn2,x_{n+1} = \frac{2 x_{n}}{3} + \frac{5}{3 x_{n}^{2}},

and tabulate x0,x1,x2,x3,x4x_{0}, x_{1}, x_{2}, x_{3}, x_{4}.

Problem 181

For the equation f(x)=ex3x=0f(x) = e^{x} - 3 x = 0, Newton’s method from x0=1x_{0} = 1 converges to one root and from x0=2x_{0} = 2 converges to another. Compute the first three iterates from each starting point to four decimal places, and identify the two roots numerically. Use the formula for the derivative of exe^{x} from Lesson 5AM.

When the Tangent Is Horizontal

Our iteration (5)(5) involves a division by f(xn)f'(x_{n}) and breaks down whenever the tangent at xnx_{n} is horizontal.

Example 165 (A starting point where f=0f' = 0)

For f(x)=xex1f(x) = x \, e^{x} - 1, the derivative f(x)=(x+1)exf'(x) = (x + 1) e^{x} from (6)(6) vanishes at x=1x = -1, since exe^{x} never vanishes and the factor x+1x + 1 does. Starting Newton’s iteration from x0=1x_{0} = -1 produces

x1=x0f(x0)f(x0)=11/e10,x_{1} = x_{0} - \frac{f(x_{0})}{f'(x_{0})} = -1 - \frac{-1/e - 1}{0},

which is undefined. Geometrically, the tangent at (1,1/e1)(-1, -1/e - 1) is horizontal and never crosses the xx-axis.

Our remedy is a different starting guess. Any x0x_{0} with f(x0)0f'(x_{0}) \neq 0 avoids the break-down. For this ff, the only xx at which f(x)=0f'(x) = 0 is x=1x = -1; every other starting choice keeps our iteration well-defined.

A horizontal tangent is also a danger nearby, even when the starting iterate avoids the exact zero of ff'. The denominator in (5)(5) shrinks to nearly zero as xnx_{n} approaches a critical number, blowing up the size of the correction f(xn)/f(xn)f(x_{n})/f'(x_{n}) and throwing the next iterate far from where it would otherwise land.

A Starting Point That Wanders Off

Our other failure mode is more subtle: the iteration is well-defined at every step but moves away from the root. The shape of the curve far from the root determines whether this happens.

Example 166 (x0=2x_{0} = -2 for xex1x e^{x} - 1)

The same equation xex=1x \, e^{x} = 1 has its unique real root near x0.5671x \approx 0.5671. Starting Newton’s iteration from x0=2x_{0} = -2:

f(2)=2e211.2707,f(2)=(1)e20.1353,f(-2) = -2 e^{-2} - 1 \approx -1.2707, \qquad f'(-2) = (-1) e^{-2} \approx -0.1353,

so

x1=21.27070.135329.39=11.39.x_{1} = -2 - \frac{-1.2707}{-0.1353} \approx -2 - 9.39 = -11.39.

The next iterate is far further from the root than the start. Continuing,

x28516.92,x3(further still, with magnitude growing).x_{2} \approx -8516.92, \qquad x_{3} \approx \text{(further still, with magnitude growing)}.

The iterates diverge toward -\infty.

The cause is visible in our curve y=xex1y = x e^{x} - 1: as xx \to -\infty, the term xexx e^{x} shrinks to zero (the exponential decay overwhelms the linear factor), so f(x)1f(x) \to -1. The curve has the line y=1y = -1 as a horizontal asymptote on the left. Newton’s iteration, looking at the gentle slope of the asymptotic region and assuming the curve is approximately linear, projects the tangent to a faraway intercept; our next iterate lands in even gentler asymptotic territory; and the process accelerates outward instead of homing in.

A wide-view graph of y = xe^x - 1 from x = -12 to x = 1.5. The curve sits very close to the dotted horizontal asymptote y = -1 across the entire negative region, then rises steeply in the rightmost portion to cross the x-axis. The tangent line at (-2, ≈ -1.27), drawn dashed, has very gentle negative slope and crosses the x-axis far to the left at x_1 ≈ -11.39.

The shallow slope of the tangent in the asymptotic region is the cause: dividing f(xn)1f(x_{n}) \approx -1 by a small f(xn)0f'(x_{n}) \approx 0 produces a large correction. Newton’s method needs more than f(xn)0f'(x_{n}) \neq 0; it needs a starting region where the curve is not too close to a horizontal asymptote.

Remark

The two failure cases just shown sit at opposite extremes:

  • Local horizontal tangent at the iterate (f(xn)=0f'(x_{n}) = 0): division by zero, the next iterate is undefined.
  • Far-asymptotic region (f(xn)f(x_{n}) close to a horizontal asymptote): division by a tiny number, the next iterate is huge and farther from the root.

Both are symptoms of the same tension: a shallow tangent line cannot be trusted to predict where the true curve crosses zero.

A Function with No Real Root

Newton’s method assumes that f(x)=0f(x) = 0 has a solution. When it does not, our iteration must do something with each xnx_{n}, and the resulting sequence cannot converge to anything sensible.

Example 167 (f(x)=x2+1f(x) = x^{2} + 1 has no real root)

The function f(x)=x2+1f(x) = x^{2} + 1 satisfies f(x)1>0f(x) \geq 1 > 0 for every real xx, so there is no xx with f(x)=0f(x) = 0. The Newton iteration formula

xn+1=xnxn2+12xnx_{n+1} = x_{n} - \frac{x_{n}^{2} + 1}{2 x_{n}}

is, however, defined whenever xn0x_{n} \neq 0. Starting from x0=2x_{0} = 2 gives the sequence

x1=254=0.75,x20.2917,x31.5685,x40.4655,x_{1} = 2 - \frac{5}{4} = 0.75, \quad x_{2} \approx -0.2917, \quad x_{3} \approx 1.5685, \quad x_{4} \approx 0.4655,x50.8415,x60.1734,x72.7970,x_{5} \approx -0.8415, \quad x_{6} \approx 0.1734, \quad x_{7} \approx -2.7970, \quad \dots

Our iterates oscillate without settling. Some sit close to the origin, others land well outside the interval [1,1][-1, 1], and the magnitudes occasionally swing into the threes. No sub-pattern repeats.

A line plot of x_n versus n for the Newton iteration on f(x) = x^2 + 1 starting from x_0 = 2. The points oscillate between positive and negative values without converging: 2.00, 0.75, -0.29, 1.57, 0.47, -0.84, 0.17, -2.80, and so on. A label notes 'no real root' to explain the absence of convergence.

A starting guess as close to the origin as x0=1x_{0} = 1 already runs into trouble of a different kind: the next iterate is

x1=122=0,x_{1} = 1 - \frac{2}{2} = 0,

and f(0)=0f'(0) = 0 then forbids any further step. Different starting guesses produce different chaotic behaviours. The qualitative lesson: Newton’s method itself does not certify that a real solution exists; a graph or a separate sign check is still needed.

Problem 182

For the equation lnx=1/x\ln x = 1/x on x>0x > 0, define f(x)=lnx1/xf(x) = \ln x - 1/x. Compute f(x)f'(x) using the derivative of ln\ln from Lesson 5PM. Starting from x0=2x_{0} = 2, perform Newton’s method to four decimal places and report the iterates x0,x1,x2,x3x_{0}, x_{1}, x_{2}, x_{3}. Confirm the iterates are settling by comparing successive values.

Problem 183

Apply Newton’s method to f(x)=x33x+1f(x) = x^{3} - 3 x + 1 on the real line. The cubic has three real roots, near x1.879x \approx -1.879, x0.347x \approx 0.347, and x1.532x \approx 1.532.

  1. Compute f(x)f'(x) and identify the two values at which f(x)=0f'(x) = 0. These are the values to avoid as starting guesses.
  2. Starting from x0=2x_{0} = -2, x0=0x_{0} = 0, and x0=2x_{0} = 2, perform three Newton iterations from each. Confirm that each starting guess converges to the nearest of the three roots.
  3. Find a starting guess between 00 and 11 that converges to the negative root x1.879x \approx -1.879 rather than the nearby root at x0.347x \approx 0.347, and explain in one sentence why such a starting guess exists. (Hint: pick a point at which the tangent is shallow but pointing leftward.)
Problem 184

A British project manager solves for the time t>0t > 0 at which a mixed savings account, with balance

B(t)=1000e0.05t+500t,B(t) = 1000 \, e^{0.05 \, t} + 500 \, t,

first reaches £50005\,000. The equation B(t)5000=0B(t) - 5000 = 0 has no closed-form solution.

  1. Set f(t)=1000e0.05t+500t5000f(t) = 1000 \, e^{0.05 \, t} + 500 \, t - 5000. Compute f(t)f'(t) using the formula for ekxe^{k x}.
  2. Starting from t0=5t_{0} = 5, perform Newton’s iteration to four decimal places. Tabulate t0,t1,t2,t3,t4t_{0}, t_{1}, t_{2}, t_{3}, t_{4}.
  3. Verify the answer by computing BB at the converged tt and confirming it is within £0.010.01 of £50005\,000.
Problem 185

(Harder.) For f(x)=xex1f(x) = x \, e^{x} - 1 from (1)(1), the iteration (5)(5) takes the form

xn+1=xnxnexn1(xn+1)exn.x_{n+1} = x_{n} - \frac{x_{n} \, e^{x_{n}} - 1}{(x_{n} + 1) \, e^{x_{n}}}.

Show that the right side simplifies to

xn+1=xn2xn+1+exnxn+1,x_{n+1} = \frac{x_{n}^{2}}{x_{n} + 1} + \frac{e^{-x_{n}}}{x_{n} + 1},

and identify each summand. The first term comes from the difference xnxn/(xn+1)x_{n} - x_{n}/(x_{n} + 1); the second comes from the constant 1/((xn+1)exn)1/((x_{n}+1) e^{x_{n}}). Use the simplified formula to recompute x1x_{1} from x0=1x_{0} = 1 and confirm x1=1/2+1/(2e)x_{1} = 1/2 + 1/(2 e).

Problem 186

(Harder.) A function f(x)f(x) is called monotonic on an interval if it is either strictly increasing or strictly decreasing on the interval. Suppose ff is differentiable, f>0f' > 0 everywhere on an interval [a,b][a, b], and f(a)<0<f(b)f(a) < 0 < f(b), so ff has a unique root rr in (a,b)(a, b). Use the second derivative rule to argue that:

  1. If f(x)0f''(x) \geq 0 on [a,b][a, b] (so ff is concave up there), then for every starting guess x0x_{0} in [a,b][a, b] with x0>rx_{0} > r, the Newton iterates form a strictly decreasing sequence bounded below by rr.
  2. If f(x)0f''(x) \leq 0 on [a,b][a, b] (so ff is concave down there), then for every starting guess x0x_{0} in [a,b][a, b] with x0<rx_{0} < r, the Newton iterates form a strictly increasing sequence bounded above by rr.

(In each case the iterates approach the root from one side without overshoot. The general overshoot pattern of Newton’s method is therefore forbidden when the curve has the favourable concavity sign across the interval.)

Closing

Newton’s method is our last application of differentiation in this course. It uses every part of the toolkit we’ve built (the tangent-line construction from Lesson 2, the rules of differentiation from Lessons 4 and 5, the geometry of asymptotes from Lesson 3PM, and the rate-equation reading of erte^{r t} from Lesson 6AM) to turn an equation that resists algebra into a numerical procedure that converges in a handful of arithmetic steps.

Our next major question, taken up in MA0B, asks for the cumulative total of a function over an interval rather than the slope at a point. The two questions are connected, and the answer to the second turns out to involve the answer to the first in a way that justifies our entire derivative apparatus once more, but the development belongs to the integration course.

Note (Toolkit additions from this lesson)
ToolForm
Newton’s iterationxn+1=xnf(xn)/f(xn)x_{n+1} = x_{n} - f(x_{n})/f'(x_{n})
Geometric readingxn+1x_{n+1} is the xx-intercept of the tangent at xnx_{n}
Failure modesf(xn)=0f'(x_{n}) = 0, asymptotic region, no real root