
Section 33.3 Reflection-contraction-expansion Nelder-Mead method

The reason for the lack of success with Rosenbrock's function is that the version of the Nelder-Mead method described in Section 33.2 does not allow the triangle to change its shape to fit the geometry of the graph. A long narrow valley calls for a long narrow triangle.

The mechanism for changing the shape of the triangle is expansion, which is introduced as an alternative to the reflection step of Section 33.1. Suppose that the current triangle qualifies for reflection: for example, \(C\) is the vertex with the largest value, and \(f(R) \lt f(C)\text{.}\) Consider the possibility of replacing \(C\) with a point \(E\) that lies beyond \(R\text{,}\) thus making the triangle longer. The point \(E\) is determined by the following equation, which shows that it is as far from \(R\) as \(R\) is from the center of reflection \(M\text{:}\)

\begin{equation*} E = R + (R-M) = 2R - M \end{equation*}

If \(f(E) \lt f(R)\text{,}\) then we replace \(C\) with \(E\) rather than \(R\text{.}\) In other words, the “highest” vertex of the triangle is replaced by whichever of \(E, R\) is lower.
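As a quick numeric illustration of this choice (written in Python rather than Matlab, and using the simple quadratic \(f(x,y) = x^2+y^2\) with made-up triangle vertices, not the section's Rosenbrock function):

```python
# One reflect-or-expand step on hypothetical example data.
# f here is the simple quadratic x^2 + y^2, chosen only for illustration.

def f(p):
    x, y = p
    return x**2 + y**2

A, B, C = (2.0, 0.0), (2.0, 0.5), (3.0, 0.25)   # C has the largest value of f

# M is the midpoint of the two other vertices; R is the reflection of C
M = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)       # M = (2.0, 0.25)
R = (2*M[0] - C[0], 2*M[1] - C[1])               # R = 2M - C = (1.0, 0.25)
E = (2*R[0] - M[0], 2*R[1] - M[1])               # E = 2R - M = (0.0, 0.25)

assert f(R) < f(C)                               # reflection qualifies
replacement = E if f(E) < f(R) else R            # keep whichever of E, R is lower
print(replacement)                               # here f(E) = 0.0625 < f(R) = 1.0625
```

In this example the expansion point wins, so \(C\) would be replaced by \(E\) and the triangle stretches toward the minimum at the origin.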

With both contraction and expansion available, the triangle can shrink and grow: it moves faster when the path is clear, and it can become very narrow to fit into a difficult-to-reach valley.

Minimize Rosenbrock's function (32.1.2) using the reflection-contraction-expansion version of the Nelder-Mead method. Use a random starting point randn(2, 1) as one vertex \(A\) of the initial triangle, and let the other two vertices be \(A + \mathbf e_1\text{,}\) \(A+\mathbf e_2\text{.}\) Plot the path of the search.

Solution

The code is mostly the same as in Example 33.2.1, with additional lines noted in comments.

f = @(x) (x(1)-1)^2 + 100*(x(1)^2 - x(2))^2;

A = randn(2, 1);
B = A + [1; 0];
C = A + [0; 1];
T = [A B C];
max_tries = 10000;
path = zeros(2, max_tries);

for k = 1:max_tries
    path(:, k) = mean(T, 2);               % record the centroid of the triangle
    if max(abs(T - mean(T, 2))) < 1e-6     % all vertices within 1e-6 of the centroid
        break
    end
    values = [f(T(:,1)) f(T(:,2)) f(T(:,3))];
    [fmax, ind] = max(values);
    M = (sum(T, 2) - T(:, ind))/2;      % midpoint of the two other vertices
    R = 2*M - T(:, ind);                % reflection of the worst vertex
    if f(R) < fmax
        E = 2*R - M;                    % consider expansion
        if f(E) < f(R)
            T(:, ind) = E;              % choose to expand
        else
            T(:, ind) = R;              % choose to reflect
        end
    else
        [fmin, ind] = min(values);
        T = (T + T(:, ind))/2;          % contract toward the lowest vertex
    end
end

plot(path(1, 1:k), path(2, 1:k), '-+')
if k < max_tries
    x = mean(T, 2);
    fprintf('Found x = (%g, %g) with f(x) = %g after %d steps\n', x, f(x), k);
else
    disp('Failed to converge')
end

The algorithm described above works quite well, considering we are minimizing a challenging function without using any derivative information. The version of the Nelder-Mead algorithm implemented in Matlab is a little different in that it chooses between three ways of contracting a simplex (called “contract inside”, “contract outside”, “shrink”) but the main ideas are the same. To observe the process of fminsearch optimization, run

f = @(x) (x(1)-1)^2 + 100*(x(1)^2 - x(2))^2;
options = optimset('Display', 'iter');
fminsearch(f, randn(2, 1), options)
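For comparison, SciPy exposes the same Nelder-Mead algorithm through scipy.optimize.minimize. A rough Python analogue of the call above might look like the following sketch (here the classic deterministic starting point \((-1.2, 1)\) is assumed in place of the random one, and SciPy's 'disp' option only prints a final summary rather than every iteration):

```python
# Rough SciPy analogue of the fminsearch call above.
# Assumed setup: the classic start (-1.2, 1) replaces randn(2, 1).
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1)**2 + 100*(x[0]**2 - x[1])**2

result = minimize(f, x0=[-1.2, 1.0], method='Nelder-Mead',
                  options={'disp': True})   # 'disp' loosely matches 'Display'
print(result.x)                             # should be close to the minimizer (1, 1)
```

As with fminsearch, no derivative information is used; the simplex operations alone drive the search.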