
Section 29.2 Using fminsearch for curve-fitting

The syntax of fminsearch is similar to that of fsolve (which searches for solutions of \(f=0\)): the first argument is the function to be minimized, and the second is the initial point from which to start the search. For example,

fminsearch(@(x) x^2 + x, 0)

returns -0.5, which is where the function attains its minimum. The choice of initial point does not matter here, since the minimum is unique. But in the second example

fminsearch(@(x) x^4 - x^2, 0)

the initial point matters: the search may find the minimum at 0.7071 or at -0.7071. Note that fminsearch is not misled by \(x=0\text{,}\) which is also a critical point: it can tell the difference between a minimum and a maximum.
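For example, starting on either side of the origin (the starting points 1 and -1 are chosen here just for illustration) should lead to the two different minima:

fminsearch(@(x) x^4 - x^2, 1)     % expected to return about 0.7071
fminsearch(@(x) x^4 - x^2, -1)    % expected to return about -0.7071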

Like most numerical methods, fminsearch may fail at its task:

fminsearch(@(x) exp(x), 0)  

returns “-786.4315” (even though the function has no minimum point, of course), while

fminsearch(@(x) x^3 - x, 4)

overlooks the local minimum at \(x = 1/\sqrt{3}\) and diverges to negative infinity. Success is much more likely if the initial point is chosen with some care. When fitting a curve to data, one should try to guess the signs and approximate sizes of the parameters.
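For instance, a starting point near the local minimum (0.5 is chosen here just for illustration) should allow the search to find it:

fminsearch(@(x) x^3 - x, 0.5)     % expected to return about 0.5774, i.e. 1/sqrt(3)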

Since our models generally have more than one parameter, we need multivariable fminsearch. For example,

fminsearch(@(b) b(1)^2 + b(2)^4 + 3*b(1) - b(2), [1; 2])

finds the minimum of \(b_1^2 + b_2^4 + 3b_1 - b_2\) (using the letter b to shorten the formula, since our parameters are usually called \(\beta_1, \beta_2,\dots\text{.}\)) Note that the function being minimized still has only one argument, but this argument is now a vector.
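As a quick sanity check of such a call (the expected values follow from setting the partial derivatives \(2b_1 + 3\) and \(4b_2^3 - 1\) to zero):

b = fminsearch(@(b) b(1)^2 + b(2)^4 + 3*b(1) - b(2), [1; 2]);
disp(b)     % expect approximately -1.5 and 0.63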

Since the choice of initial point is an important step, we need a way to choose reasonable values of the parameters. To do this, let us take a closer look at the logistic curve (29.1.2). The meaning of parameter \(A\) is not very intuitive, which motivates the following equivalent form of the function:

\begin{equation} y = \frac{M}{1 + e^{-(x-a)/s}}\label{eq-logistic-curve-shift}\tag{29.2.1} \end{equation}

Note that this curve is obtained from the standard logistic function

\begin{equation*} \sigma(x) = \frac{1}{1+e^{-x}} \end{equation*}

with

  • vertical stretching by factor \(M\text{;}\)
  • shift to the right by \(a\) units;
  • horizontal stretching by factor \(s\text{.}\)
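In code, this relationship could be expressed with a pair of anonymous functions (the names sigma and curve are introduced here just for illustration):

sigma = @(x) 1./(1 + exp(-x));             % standard logistic function
curve = @(x, M, a, s) M*sigma((x - a)/s);  % stretched and shifted as in (29.2.1)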


Since the standard logistic function rises from 0 to 1 and has maximal slope when \(x=0\text{,}\) we can relate the parameters \(a, M, s\) directly to the shape of the curve:

  • \(a\) is the x-value at which the curve has the steepest slope;
  • \(M\) is the amount by which the curve rises;
  • \(s\) measures its horizontal size; the curve is essentially flat outside of the interval \([a-5s, a+5s]\text{.}\)

Fit a general logistic curve to the data in Example 28.2.1 without variable transformations.

Solution

Once we have the data x, y, we can set up the sum-of-squares function S according to (29.1.3). The parameters are represented by the components b(1), b(2), b(3) in some order (I use the order \(M, a, s\)). To begin with, use a mindless guess for the initial point, [0; 0; 1]. The orientation of the data vectors x, y does not matter here, since we no longer build a linear system out of them.

y = [1 1 1 5 8 10 27 44 62 93 101 110 112 115 116 117 118 120];   % data from Example 28.2.1
x = 1:numel(y);                                                    % x-values are just the indices

f = @(x, b) b(1)./(1 + exp(-(x-b(2))/b(3)));     % the model equation
opt = fminsearch(@(b) sum((y - f(x, b)).^2), [0; 0; 1]); % optimal (?) b

t = linspace(min(x), max(x), 1000)';   % fine grid for plotting the fitted curve
plot(t, f(t, opt), 'b', x, y, 'r*')    % fitted curve together with the data points
disp(opt)                              % display the optimal parameter values

The two middle lines describe the logic of the method: set up a model as a function of the explanatory variable(s) x and the parameter(s) b, then minimize the sum of squares of the deviations y - f(x, b). Finally, the optimal (?) values of the parameters b are used to plot the best-fitting curve.

The above code fails, as is often the case with multivariable optimization. We should think of a better way to choose the initial point. For example, the first parameter \(M\) is the carrying capacity, which is clearly at least 120. The second parameter \(a\) is the \(x\)-coordinate of the curve's center, which looks to be about 10. And the third parameter \(s\) is about a tenth of the horizontal span of the non-flat part, so maybe 2. An initial point like [120; 10; 2] does the job. Note that the fit is much better than what we achieved in Example 28.2.1.
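Such a rough guess can also be computed from the data; the sketch below (the variable names M0, a0, s0 and the specific heuristics are my own, chosen to mirror the reasoning above) produces an initial point close to [120; 10; 2]:

M0 = max(y);                     % carrying capacity: at least the largest observation
[~, k] = min(abs(y - M0/2));     % data point closest to half the capacity
a0 = x(k);                       % rough location of the steepest slope
s0 = (max(x) - min(x))/10;       % about a tenth of the horizontal span
opt = fminsearch(@(b) sum((y - f(x, b)).^2), [M0; a0; s0]);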

The fminsearch optimization has limits on the number of function evaluations and on the number of iterations it can perform before stopping. If you get a warning that the process stopped because of these limits, try increasing them. For example,

options = optimset('MaxFunEvals', 10000, 'MaxIter', 10000);
opt = fminsearch(..., options);

The first line creates a “structure” (which would be called a “dictionary” in some languages) that records the options for the optimization algorithm; the second passes it to fminsearch as a third argument.
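To see whether the search actually converged, one can also request the exit flag that fminsearch returns (a value of 1 means it converged, 0 means it stopped at the limits); the sum-of-squares function below is the one from the example above:

[opt, fval, flag] = fminsearch(@(b) sum((y - f(x, b)).^2), [120; 10; 2], options);
if flag ~= 1
    disp('fminsearch stopped before converging; try a better initial point or larger limits')
end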