Idea of Broyden's method

Section 11.1 Idea of Broyden's method

The basic idea is: since we do not know the derivative, we are going to guess its value and then improve our guess at every step of the algorithm.

Let us return to one-variable situation first. Newton's method proceeds in steps computed as \(h = -f(x_0)/f'(x_0)\) which requires the knowledge of \(f'(x_0)\text{.}\) Suppose we do not know the derivative but are willing to guess. For notational convenience, let us say we are guessing the value of \(1/f'(x_0)\text{.}\) We want some non-zero number. Knowing nothing whatsoever, we can guess "1". Let us call this guess \(b_0 = 1\text{.}\) Armed with this number, we proceed as usual in Newton's method: make the step \(h = -b_0 f(x_0)\) arriving at the point \(x_1 = x_0\text{.}\)

Our next guess, for the value of \(1/f'(x_1)\) can be better educated: we can think of the secant line through the two points \((x_0, f(x_0))\) and \((x_1, f(x_1))\text{,}\) and decide it would be logical for \(b_1\) to be the reciprocal of the slope of this secant line. This means we pick \(b_1\) so that

\begin{equation} b_1(f(x_1)- f(x_0)) = x_1-x_0 \label{eq-broyden-1}\tag{11.1.1} \end{equation}

Then the process continues: we make the step \(h = -b_1 f(x_1)\text{,}\) arrive at \(x_2 = x_1+h\) and use the secant slope through the two newest points,

\begin{equation*} b_2(f(x_2)- f(x_1)) = x_2-x_1 \end{equation*}

Apart from the initial uneducated guess for \(b_0\) this is exactly the secant method, expressed in different notation.

In higher dimensions, we need a guess for the inverse of the Jacobian matrix, which is used in Newton's method \(\mathbf h = -J^{-1} \mathbf F(\mathbf x)\text{.}\) At first we take an uneducated guess for the value of \(J(x_0)^{-1}\text{,}\) namely \(B_0=I\text{,}\) the identity matrix. This enables us to make a step \(\mathbf h = -B_0 F(\mathbf x_0)\text{.}\) Now comes the important part we want our next guess to be better informed than the previous one. Looking back at (11.1.1), it seems we should choose a matrix \(B_1\) so that

\begin{equation*} B_1(\mathbf F(\mathbf x_1) - \mathbf F(\mathbf x_0)) = \mathbf x_1 - \mathbf x_0 \end{equation*}

How to do that? We cannot “divide a vector by another”, the formula

\begin{equation*} B_1 \overset{?}{=} \frac {\mathbf x_1 - \mathbf x_0}{\mathbf F(\mathbf x_1) - \mathbf F(\mathbf x_0)} \end{equation*}

makes no sense (this is the difference between one variable and several variables.) We need to use linear algebra.