Section 30.3 Newton method for minimization
If we can find the derivative \(f'\) (or the gradient \(\nabla f\) in higher dimensions), it may be possible to apply one of the root-finding methods of Part II to it. Newton's method, for example, was one of those. But since we are solving \(f'=0\text{,}\) the method will involve the second derivative of \(f\text{:}\) it amounts to iteration of the function
\begin{equation}
g(x) = x - \frac{f'(x)}{f''(x)}.\tag{30.3.1}
\end{equation}
Let's try this out on a simple example.
Example 30.3.1. Using Newton method for minimization.
Minimize \(f(x) = x^4 - 3 x^2 + x + 1\) using the Newton method with initial point \(x_0=0\text{.}\) How does the result depend on the initial point?
First compute the derivatives \(f'(x) = 4x^3 - 6x + 1\) and \(f''(x) = 12x^2 - 6\text{.}\) Then iterate the function \(g\) from (30.3.1) as in Example 8.2.1. Most of the code is copied from there.
f = @(x) x.^4 - 3*x.^2 + x + 1;   % objective function
fp = @(x) 4*x.^3 - 6*x + 1;       % its first derivative f'
fpp = @(x) 12*x.^2 - 6;           % its second derivative f''
g = @(x) x - fp(x)/fpp(x);        % Newton iteration function for solving f' = 0
x0 = 0;
max_tries = 1000;
for k = 1:max_tries
    x1 = g(x0);
    if abs(x1 - x0) < 100*eps(x0)   % stop when the step reaches roundoff level
        break
    end
    x0 = x1;
end
if k < max_tries
    fprintf('Found x = %g with f(x) = %g after %d steps\n', x1, f(x1), k);
else
    disp('Failed to converge')
end
The result is obtained quickly: “Found x = 0.169938 with f(x) = 1.08414 after 5 steps”. But is this a minimum? Try
t = linspace(-2, 2, 1000);
plot(t, f(t))
to check. Also, note that looking at a graph like this is essentially a human-assisted brute-force search: the function was evaluated on a uniform grid over the interval \([-2, 2]\text{,}\) and the eye picks out the lowest point.
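The human part is easy to remove. A minimal sketch of the same brute-force search done programmatically, reusing the function f defined above:
t = linspace(-2, 2, 1000);    % same uniform grid as in the plot
[fmin, idx] = min(f(t));      % smallest sampled value and where it occurs
fprintf('Grid search: x = %g, f(x) = %g\n', t(idx), fmin)
Of course, the accuracy of such a search is limited by the grid spacing, here \(4/999 \approx 0.004\text{.}\)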
Do we get a better result with a different starting point?
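A quick experiment answers this: the sketch below reruns the same iteration from several starting points, reusing f and g from above (the starting values are chosen arbitrarily for illustration).
for x0 = [-2, -1, 0, 1, 2]
    x = x0;
    for k = 1:1000
        x1 = g(x);
        if abs(x1 - x) < 100*eps(x)
            break
        end
        x = x1;
    end
    fprintf('x0 = %g leads to x = %g with f(x) = %g\n', x0, x1, f(x1))
end
Starting at \(x_0 = 1\) or \(x_0 = 2\) leads to the local minimum near \(x \approx 1.13\text{,}\) while \(x_0 = -1\) or \(x_0 = -2\) finds the minimum near \(x \approx -1.30\text{,}\) which the plot shows to be the global one; \(x_0 = 0\) converges to the local maximum as before. Note also that the iteration breaks down at points where \(f''(x_0) = 0\text{,}\) namely \(x_0 = \pm\sqrt{1/2}\text{.}\)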
The issue demonstrated by Example 30.3.1 is that applying Newton's method (or another root-finding method) to \(f'\) is as likely to lead to a maximum as to a minimum. However, this approach can be useful when we know that a function has just one critical point and that point is a minimum.
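Since the iteration only locates critical points, an inexpensive safeguard is to apply the second derivative test to the computed point. A sketch, using the value x1 and the handle fpp from Example 30.3.1:
if fpp(x1) > 0
    disp('Critical point is a local minimum')
elseif fpp(x1) < 0
    disp('Critical point is a local maximum')
else
    disp('Second derivative test is inconclusive')
end
For the point found in Example 30.3.1 this reports a local maximum, consistent with the plot.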