Section 33.2 Reflection-contraction combination
One issue with the reflection-only method of Section 33.1 is the fixed size of the triangle. With a large triangle, we will only have a rough idea of where the minimum is when the search ends. Also, a large triangle will not detect any fine features of the landscape such as the narrow valley of Rosenbrock's function. On the other hand, a small triangle necessarily moves in small steps and is therefore slow.
There is a way to improve the situation: instead of stopping when reflection does not work, reduce the size of triangle in half, contracting it toward the “best” vertex (the one is the with lowest value). The contraction map is \(f(x) = (x+b)/2\) where \(b\) is the vertex toward which we contract. This reflection-contraction method needs a stopping criterion, so it does not run forever. We can stop when the triangle becomes very small, with vertices getting close to one another.
Example 33.2.1. Reflection-contraction Nelder-Mead method.
Minimize the function (33.1.2) using the reflection-contraction version of the Nelder-Mead method. Use a random starting point randn(2, 1)
as one vertex \(A\) of the initial triangle, and let other vertices be \(A + \mathbf e_1\text{,}\) \(A+\mathbf e_2\text{,}\) so that the initial triangle is not small anymore. Plot the path of the search.
The code is mostly the same as in Example 33.1.1, with additional lines noted in comments.
f = @(x) x(1)^4 + x(2)^2 - sin(x(1)+x(2)); % f = @(x) (x(1)-1)^2 + 100*(x(1)^2 - x(2))^2; A = randn(2, 1); B = A + [1; 0]; C = A + [0; 1]; T = [A B C]; max_tries = 10000; path = zeros(2, max_tries); for k = 1:max_tries path(:, k) = mean(T, 2); if max(abs(T - mean(T, 2))) < 1e-6 break % stop, the triangle is small enough end values = [f(T(:,1)) f(T(:,2)) f(T(:,3))]; [fmax, ind] = max(values); M = (sum(T, 2) - T(:, ind))/2; R = 2*M - T(:, ind); if f(R) < fmax T(:, ind) = R; else [fmin, ind] = min(values); % find the best vertex T = (T + T(:, ind))/2; % contract toward it end end plot(path(1, 1:k), path(2, 1:k), '-+') if k < max_tries x = mean(T, 2); fprintf('Found x = (%g, %g) with f(x) = %g after %d steps\n', x, f(x), k); else disp('Failed to converge') end
Example 33.2.1 works fine in terms of speed and accuracy. However, replacing the relatively simple function \(f\) in Example 33.1.1 with Rosenbrock's function (32.1.2)
f = @(x) (x(1)-1)^2 + 100*(x(1)^2 - x(2))^2
we find the search usually fails. The moving triangle has to become very small to fit into the “narrow valley” of this function, which makes subsequent movement along the valley floor very slow. As as the valley is long, it fails to reach the minimum within the allowed number of steps.