Knowledge base dedicated to Linux and applied mathematics.

Home > Mathematics > Numerical solution of nonlinear equations > **Newton’s method**

All the versions of this article: <English> <français>

Newton’s method or Newton-Raphson method is an iterative numerical method used to solve f(x)=0 type equations. It relies on the fixed-point method and on a particular function, g, related to the derivative of f.

Newton’s method is a fixed-point method using the application $g$:

$$g(x)=x-\displaystyle\frac{f(x)}{f’(x)}$$

It can be easily inferred that looking for a fixed point for $g$ comes down to looking for a solution to the following equation

$$f(x)=0\quad (E).$$

Remember that, in order to look for the fixed point, we resort to an iterative algorithm defined by the sequence

$$x_{n+1}=g(x_n)$$

The numerical scheme for Newton’s method is

$$x_{n+1}=x_n-\displaystyle\frac{f(x_n)}{f’(x_n)}$$

The tangent to the curve f at the point $(x_n,f(x_n))$ has the following equation

$$y=f(x_n)+f’(x_n)(x-x_n)$$

$x_{n+1}$ is nothing less than the abscissa of the point of intersection of this tangent with the $x$-axis. Indeed

$$ \begin{array}{lcl} y&=&0\\ x&=&x_n-\displaystyle\frac{f(x_n)}{f’(x_n)} \end{array} $$

We then set $x_{n+1}$:

$$x_{n+1}=x_n-\displaystyle\frac{f(x_n)}{f’(x_n)}$$

**Theorem.**

Let $f\in\mathcal{C}^2([a,b])$ and $x_*\in]a,b[$ such that

$$f(x_*)=0$$

and

$$f’(x_*)\neq 0$$

Then, there exists $\delta>0$ such that Newton’s method converges for all $x_0\in I_\delta=[x_*-\delta,x_*-\delta]$

**Proof.**

By assumption, $f’$ is continuous and $f’(x_*)\neq 0$. Therefore, there exists $\eta>0$ such that:

$$f’(x)\neq 0 \quad \forall x \in[x_*-\eta,x_*-\eta]\subset [a,b]$$

The derivative of $g$ is defined by:

$$g’(x)=(x-\displaystyle\frac{f(x)}{f’(x)})’=\displaystyle\frac{f(x)f’’(x)}{f’(x)^2}$$

By assumption, $f(x_*)=0$ and $f’(x_*)\neq 0$. Consequently:

$$g’(x_*)=0$$

Moreover, $g’$ is continuous in $[x_*-\eta,x_*-\eta]$ since

$$f’(x)\neq 0 \quad \forall x \in[x_*-\eta,x_*-\eta]\subset [a,b]$$

Explicitly writing the continuity of $g’$ at $x_*$ in the interval $[x_*-\eta,x_*-\eta]$ yields to:

$$\forall \varepsilon >0 \quad |x-x*|\leq \eta \Longrightarrow |g’(x)-g’(x_*)|\leq\varepsilon$$

that is

$$\forall \varepsilon >0 \quad |x-x*|\leq \eta \Longrightarrow |g’(x)|\leq\varepsilon$$

Therefore, there exists $\delta<\eta$ such that

$$|g’(x)|\leq\tau,\quad\forall x\in I_\delta=[x_*-\delta,x_*-\delta],\quad\tau\in]0,1[.$$

Up to now, we have proved one of the assumptions for the fixed-point theorem. We now need to prove that the interval $l_\delta$ is $g$-invariant. That is:

$$g(I_\delta)\subset I_\delta$$

By means of the mean value theorem, we show that there exists an element

$\xi\in]x,x_*[$ such that

$$ \begin{array}{ccl} |g(x)-g(x_*)|&=&|g’(\xi)||x-x_*|\leq\tau|x-x_*|<|x-x_*|\\ |g(x)-x_*)|&<&|x-x_*| \end{array} $$

hence

$$g(I_\delta)\subset I_\delta$$

To sum up, we have proven that:

$g\in\mathcal{C}^1(I_\delta)$ et $g(I_\delta)\subset I_\delta$

there exists a constant $\tau$ in ] 0,1[ such that $|g’(x)|\leq \tau$.

Via the fixed-point theorem, we can conclude that:

the sequence defined by $x_{n+1} = g(x_n )$ converges toward $x_*$, the fixed point of $g$ $\forall x_0\in I_\delta$.

**Theorem.**

If

$g’(x_*)=0$

$g’’$ is continuous on an open set $\mathcal{O}$ containing $x_*$

$g’’(x_*)\neq 0$

then, there exists $\delta>0$ such that the sequence defined by $x_{n+1} = g(x_n )$ converges toward $x_*$, the fixed point of $g$, $\forall x_0\in I_\delta=[x_*-\delta,x_*-\delta]$. Newton’s method has a $2^{nd}$-order convergence. The convergence is **quadratic**.

**Proof.**

Similar to what we have seen previously, we show that there exists $\delta>0$and $\tau\in]0,1[$ such that

$$|g’(x)|\leq\tau,\quad\forall x \in I_\delta=[x_*-\delta,x_*-\delta] \subset\mathcal{O}$$

Since $g’(x_*)=0$, we introduce Taylor series for the function $g$ of the second order in the neighborhood of $x_*$. We obviously use the fact that $e_n=x_n-x_*$ approaches 0.

$$ \begin{array}{ccl} x_{n+1}&=&g(x_n)\\ &=&g(e_n+x_*)\\ &=&g(x_*)+e_ng’(x_*)+\displaystyle\frac{e_n^2}{2!}g’’(\xi_n)\\ &=&x_*+0+\displaystyle\frac{e_n^2}{2!}g’’(\xi_n)\\ &=&x_*+\displaystyle\frac{e_n^2}{2!}g’’(\xi_n)\\ \end{array} $$

where $\xi_n\in]x_*,e_n+x_*[=]x_*,x_n[$.

Hence:

$$e_{n+1}=x_{n+1}-x_*=\displaystyle\frac{e_n^2}{2!}g’’(\xi_n)$$

and

$$\lim_{n\to\infty}\frac{|e_{n+1}|}{|e_{n}|^2}=|\frac{g’’(x_*)}{2}|$$

The asymptotic error constant is $C=\displaystyle|\frac{g’’(x_*)}{2}|$. The convergence is quadratic; that is, of the $2^{nd}$-order since by assumption $g’’(x_*)\neq0$.

Let $p\in\mathbf{N}$. f is said to have root of multiplicity (or order) $p$ if

$$ \begin{array}{lcl} f(x_*)&=&f’(x_*)=f’’(x_*)=f^{(3)}(x_*)=\ldots=f^{(p-1)}(x_*)=0\\ f^{(p)}(x_*)&\neq& 0 \end{array} $$

and $f$ can be written as follows:

$$f(x)=(x-x_*)^ph(x)$$

where $h(x_*)\neq0$.

**Theorem.**

If $f\in\mathcal{C}^2([a,b])$ and $f$ has a simple root, then Newton’s method converges quadratically ($2^{nd}$ order), at least.

**Proof.**

If $f$ has a simple root, then:

$$ \begin{array}{lcl} f(x_*)&=&0\\ f^{’}(x_*)&\neq& 0 \end{array} $$

Newton’s method converges.

The derivative of $g$ is defined by:

$$g’(x)=(x-\displaystyle\frac{f(x)}{f’(x)})’=\displaystyle\frac{f(x)f’’(x)}{f’(x)^2}$$

The second derivative of $g$ is defined by:

$$g’’(x)=\displaystyle\frac{f’(x)^3f’’(x) + f’(x)^2f(x)f^{(3)}(x) - 2 f’(x) f’’^2(x)f(x) }{f’(x)^4}$$

and

$$g’’(x_*)=\displaystyle\frac{f’’(x_*)}{f’(x_*)}$$

If $f’’(x_*)\neq 0$

then

$g’(x_*)=0$

$g’’$ is continuous on an open set $\mathcal{O}$ containing $x_*$

$g’’(x_*)\neq0$

If $f’’(x_*)=0$, then $g’’(x_*)=0$. It is required to deepen the investigation by introducing Taylor series of the $3^{rd}$ in the neighborhood of $x_*$ of the function $g$. Obviously, $e_n$ approaches 0. This gives

$$ \begin{array}{ccl} x_{n+1}&=&g(x_n)\\ &=&g(e_n+x_*)\\ &=&g(x_*)+e_ng’(x_*)+\displaystyle\frac{e_n^2}{2!}g’’(x_*) +\displaystyle\frac{e_n^3}{3!}g^{(3)}(\xi_n)\\ &=&x_*+0+0 +\displaystyle\frac{e_n^3}{3!}g^{(3)}(\xi_n)\\ &=&x_*+\displaystyle\frac{e_n^3}{3!}g^{(3)}(\xi_n)\\ \end{array} $$

where $\xi_n\in]x_*,e_n+x_*[=]x_*,x_n[$. Hence:

$$e_{n+1}=x_{n+1}-x_*=\displaystyle\frac{e_n^3}{3!}g^{(3)}(\xi_n)$$

and

$$\lim_{n\to\infty}\frac{|e_{n+1}|}{|e_{n}|^3}=|\displaystyle\frac{g^{(3)}(x_*)}{3!}|$$

The asymptotic error constant is $C=\displaystyle|\frac{g^{(3)}(x_*)}{3!}|$ and the convergence is cubic if $g^{(3)}(x_*)\neq 0$; that is, of the $3^{rd}$ order. If $g^{(3)}(x_*)=0$, a Taylor series of the $4^{th}$-order of $g$ in the neighborhood of $x_*$ is required.

We have therefore proved that the convergence of Newton’s method is at least quadratic.

**Theorem.**

Assume that $f$ has a root of multiplicity $p$. Then, Newton’s method converges **linearly** ($1^{st}$-order) and the asymptotic error constant is:

$$C=1-\frac{1}{p}$$

**Proof.**

$f$ has a root of multiplicity $p\geq 2$. Consequently, there exists an application $h$ such that:

$$f(x)=(x-x_*)^ph(x)$$

where $h(x_*)\neq0$.

The derivative $g’$ of $g$ is defined by:

$$ \begin{array}{ccl} g’(x)=1&-&\displaystyle\frac{(ph(x)+(x-x_*)h’(x))(h(x)+ (x-x_*)h’(x))}{(ph(x)+(x-x_*)h’(x))^2}\\ &+& \displaystyle\frac{(x-x_*)h(x)(ph’(x)+h’(x)+(x-x_*)h’’(x))}{(ph(x)+(x-x_*)h’(x))^2} \end{array} $$

Thereafter:

$$g’(x_*)=1-\frac{1}{p}$$

therefore: $g’(x_*)\neq 0$ since $p\geq 2$. A $1^{st}$-order Taylor series of the $g$ in the neighborhood of $x_*$gives:

$$ \begin{array}{ccl} x_{n+1}&=&g(x_n)\\ &=&g(e_n+x_*)\\ &=&g(x_*)+e_ng’(\xi_n)\\ &=&x_*+e_ng’(\xi_n) \end{array} $$

where $\xi_n\in]x_*,e_n+x_*[=]x_*,x_n[$. Hence:

$$e_{n+1}=x_{n+1}-x_*=e_ng’(\xi_n)$$

The convergence is of the $1^{st}$-order; ie.

$$\lim_{n\to\infty}\frac{|e_{n+1}|}{|e_{n}|}=|g’(x_*)|=|1-\frac{1}{p}|=1-\frac{1}{p}$$

with an asymptotic constant

$$C=1-\frac{1}{p}.$$

We have previously seen that when the function $f$ has multiple roots, the order of convergence is deteriorated since only a linear order of convergence is attained. How can we recover a quadratic order — order 2 — of convergence when the roots have a degree of multiplicity $p\geq 2$ ? This is the aim of the modified Newton’s method.

**Definition.**

The modified Newton’s method of order $p$ is the fixed-point method of the function

$$g(x)=x-p\displaystyle\frac{f(x)}{f’(x)}$$

**Theorem.**

Assume that $f$ has a root whose multiplicity is $p$. Then, the modified Newton’s method converges at least quadratically (order 2)

**Proof.**

The derivative $g’$ of $g$ is defined as follows:

$$ \displaystyle \begin{array}{ccl} g’(x)&=&1 - p((ph(x)+(x-x_*)h’(x))(h(x)+(x-x_*)h’(x))\\ &+& (x-x_*)h(x)(ph’(x)+h’(x)+(x-x_*)h’’(x)))/((ph(x)+(x-x_*)h’(x))^2) \end{array} $$

Thereafter:

$$g’(x_*)=0$$

We thus conclude, just like before, that the modified Newton’s is at least quadratic.