Section 30.3 Newton method for minimization

If we can find the derivative \(f'\) (or the gradient \(\nabla f\) in higher dimensions), it may be possible to apply one of the root-finding methods of Part II to it. For example, Newton's method was one of those. But since we are solving \(f'=0\text{,}\) the method will involve the second derivative of \(f\text{:}\) it amounts to iterating the function

\[
g(x) = x - \frac{f'(x)}{f''(x)} \tag{30.3.1}
\]

Let's try this out on a simple example.

Example 30.3.1. Minimize \(f(x) = x^4 - 3 x^2 + x + 1\) using the Newton method with initial point \(x_0=0\text{.}\) How does the result depend on the initial point?

Solution

First compute the derivatives \(f'(x) = 4x^3 - 6x+1\) and \(f''(x) = 12x^2 - 6\text{.}\) Then iterate the function \(g\) from (30.3.1) as in Example 8.2.1. Most of the code is copied from there.

f = @(x) x.^4 - 3*x.^2 + x + 1;
fp = @(x) 4*x.^3 - 6*x + 1;      % f'
fpp = @(x) 12*x.^2 - 6;          % f''

g = @(x) x - fp(x)./fpp(x);      % Newton iteration for solving f' = 0
x0 = 0;
max_tries = 1000;
for k = 1:max_tries
    x1 = g(x0);
    if abs(x1-x0) < 100*eps(x0)  % stop when successive iterates agree to machine precision
        break
    end
    x0 = x1;
end
if k < max_tries
    fprintf('Found x = %g with f(x) = %g after %d steps\n', x1, f(x1), k);
else
    disp('Failed to converge')
end

The result is obtained quickly: β€œFound x = 0.169938 with f(x) = 1.08414 after 5 steps”. But is this a minimum? Try

t = linspace(-2, 2, 1000);
plot(t, f(t))

to check. Note that looking at a graph like this is essentially a human-assisted brute-force search: the function is evaluated on a uniform grid over the interval \([-2, 2]\text{.}\)
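
As a quicker check than plotting, one could also apply the second derivative test at the computed point. A minimal sketch, reusing the handle fpp defined above:

% Second derivative test at the computed critical point:
% f''(x1) > 0 indicates a local minimum, f''(x1) < 0 a local maximum
if fpp(x1) > 0
    disp('Second derivative test: local minimum')
elseif fpp(x1) < 0
    disp('Second derivative test: local maximum')
else
    disp('Second derivative test is inconclusive')
end

Here \(f''(0.169938) \approx -5.65 < 0\text{,}\) so the point found is actually a local maximum, not a minimum.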

Do we get a better result with a different starting point?
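
One way to explore this question is to wrap the iteration above in a loop over several initial points; the sketch below assumes f, fp, fpp, and g from above are still defined, and the particular starting values are arbitrary choices.

% Run the same Newton iteration from several starting points
for a = [-2 -1 0 1 2]
    x0 = a;
    for k = 1:1000
        x1 = g(x0);
        if abs(x1 - x0) < 100*eps(x0)
            break
        end
        x0 = x1;
    end
    fprintf('Started at %g, found x = %g with f(x) = %g\n', a, x1, f(x1));
end

Different starting points converge to different critical points of \(f\text{:}\) this quartic has three of them (two local minima and one local maximum), so the outcome depends on where we start.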

The issue demonstrated by Example 30.3.1 is that applying Newton's method (or another root-finding method) to \(f'\) may lead to a maximum just as easily as to a minimum. However, this approach can be useful when we know that a function has just one critical point and that point is a minimum.
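
For instance, consider a convex example (a hypothetical function chosen here for illustration, not one from the text above): \(f(x) = e^x - 2x\) has \(f''(x) = e^x > 0\) everywhere, so its single critical point \(x = \ln 2\) is the global minimum, and the Newton iteration can only converge there.

% Sketch: a convex function whose unique critical point is the global minimum
f = @(x) exp(x) - 2*x;
fp = @(x) exp(x) - 2;     % f'
fpp = @(x) exp(x);        % f'' > 0 everywhere, so f is convex
g = @(x) x - fp(x)./fpp(x);
x0 = 0;
for k = 1:1000
    x1 = g(x0);
    if abs(x1 - x0) < 100*eps(x0)
        break
    end
    x0 = x1;
end
fprintf('Found x = %g (exact minimum at log(2) = %g)\n', x1, log(2));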