Section 21.2 Lagrange interpolating polynomial

Thinking of the set of all polynomials of degree less than \(n\) as a vector space of dimension \(n\text{,}\) we should realize that this space may have a basis better suited to our task than \(x^0, x^1, \dots, x^{n-1}\text{.}\) Given interpolation data \((x_k, y_k)\) with \(k=1, \dots, n\text{,}\) we can construct one such basis as follows. The Lagrange basis polynomials are

\begin{equation} \ell_k (x) = \prod_{j\ne k} \frac{x-x_j}{x_k-x_j} \label{eq-lagrange-basis}\tag{21.2.1} \end{equation}

They are constructed so that \(\ell_k(x_j)=0\) when \(j\ne k\) and \(\ell_k(x_k)=1\text{.}\) Consequently, in this basis the interpolation problem is solved with the Lagrange interpolating polynomial

\begin{equation} p(x) = \sum_{k=1}^n y_k\ell_k(x) \label{eq-lagrange-poly}\tag{21.2.2} \end{equation}

which obviously satisfies \(p(x_k)=y_k\text{.}\) The coefficients of the interpolating polynomial in this basis are just the given \(y\)-values.
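For concreteness, here is a minimal sketch (not part of the text) that evaluates (21.2.2) directly from the definition (21.2.1) at a single point; the sample data are made up for illustration.

```matlab
% Naive evaluation of the Lagrange form (21.2.2) at a scalar point t.
x = [1 2 3];  y = [1 4 9];    % hypothetical data lying on y = x^2
t = 2.5;
n = numel(x);
p = 0;
for k = 1:n
    ell = 1;                  % build the basis polynomial ell_k(t) per (21.2.1)
    for j = [1:k-1, k+1:n]
        ell = ell*(t - x(j))/(x(k) - x(j));
    end
    p = p + y(k)*ell;         % accumulate y_k * ell_k(t), as in (21.2.2)
end
% Since the data lie on y = x^2 and deg p <= 2, p equals t^2 = 6.25 here.
```

This direct approach repeats the products \(\prod_{j\ne k}(t-x_j)\) for every new point \(t\text{,}\) which motivates the barycentric rearrangement below.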

Question. What is the sum of all Lagrange basis polynomials, that is, \(\ell_1+\ell_2+\cdots+\ell_n\text{?}\)

For the sake of efficiency and numerical stability, the evaluation of \(p\) can be arranged as follows. Since the denominators in (21.2.1) do not involve \(x\text{,}\) they can be computed in advance. Introduce “barycentric weights”

\begin{equation} w_k = \prod_{j\ne k} (x_k-x_j)^{-1} \label{eq-barycentric-weights}\tag{21.2.3} \end{equation}

which are related to the Lagrange basis polynomials by \(\ell_k(x) = w_k \prod_{j\ne k} (x-x_j)\text{.}\) Rewrite (21.2.2) as a quotient (the denominator below is identically equal to \(1\text{,}\) as the question above shows)

\begin{equation} p(x) = \frac{\sum_{k=1}^n y_k\ell_k(x)}{\sum_{k=1}^n \ell_k(x)} = \frac{\sum_{k=1}^n y_k w_k \prod_{j\ne k} (x-x_j) }{\sum_{k=1}^n w_k \prod_{j\ne k} (x-x_j)}\label{eq-lagrange-poly-as-quotient}\tag{21.2.4} \end{equation}

Finally, divide the numerator and denominator by the product of all \((x-x_j)\) to conclude that the Lagrange interpolating polynomial \(p\) can be computed as:

\begin{equation} p(x) = \frac{\sum_{k=1}^n y_k w_k(x-x_k)^{-1}}{ \sum_{k=1}^n w_k (x-x_k)^{-1} } \label{eq-lagrange-poly-barycentric}\tag{21.2.5} \end{equation}

The idea of computing a polynomial as the quotient of two rational functions is counterintuitive. But formula (21.2.5) wins over the original formula (21.2.2) in terms of numerical stability and computational speed, because it avoids recomputing the products \(\prod_{j\ne k}(x-x_j)\) at every evaluation point \(x\text{.}\) A slight disadvantage of (21.2.5) is that one cannot plug \(x=x_k\) into it directly, but we know that \(p(x_k)=y_k\) anyway.
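As a sanity check, one can verify numerically that (21.2.5) reproduces a polynomial of low degree exactly. The snippet below is a sketch with made-up data; it uses prod to compute the weights (21.2.3) in a single loop.

```matlab
% Check that the barycentric formula (21.2.5) recovers a known polynomial.
x = 0:4;  y = x.^3 - 2*x;          % hypothetical data from q(x) = x^3 - 2x
n = numel(x);
w = zeros(1, n);
for k = 1:n
    w(k) = 1/prod(x(k) - x([1:k-1, k+1:n]));   % product over j ~= k, as in (21.2.3)
end
t = 1.7;                            % a point distinct from every x_k
p = sum(y.*w./(t - x)) / sum(w./(t - x));      % barycentric formula (21.2.5)
% Since deg q = 3 < n, p matches q(1.7) = 1.513 up to rounding error.
```

Note that t must avoid the nodes \(x_k\text{,}\) where (21.2.5) would divide by zero; the known values \(p(x_k)=y_k\) cover those points.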

Plot the Lagrange interpolating polynomial for each of the following data sets:

  1. the points \((k, \sin k)\text{,}\) \(k=1, \dots, 10\text{;}\)
  2. the points with \(x_k\) as above, and \(y_k\) chosen randomly from the standard normal distribution.
Answer
n = 10;
x = 1:n;                       % interpolation nodes x_1, ..., x_n
y = sin(x);  % or y = randn(1, n);
w = ones(1, n);                % barycentric weights (21.2.3)
for k = 1:n
    for j = 1:n
        if j ~= k
            w(k) = w(k)/(x(k)-x(j));
        end
    end
end

% Barycentric evaluation (21.2.5), vectorized over a row vector t
p = @(t) (y.*w*(t - x').^(-1)) ./ (w*(t - x').^(-1));
t = linspace(min(x)-0.01, max(x)+0.01, 1000);
plot(t, p(t), x, y, 'r*')

The double loop computes the weights w, which do not depend on the argument of the interpolating polynomial p. This argument is called t because x is already used for the data points. The plotting interval contains the given x-values with a small margin on both sides: this helps both visualization and evaluation, because we avoid plugging in \(t = x_1\) or \(t = x_n\text{.}\) The barycentric evaluation of the polynomial is vectorized. When t is a scalar, t - x' is a column vector, and taking reciprocals element-wise gives another column vector. The dot product with w or y.*w implements sums such as \(\sum_{k=1}^n w_k/(t-x_k)\text{.}\)

How does p accept a row vector as t? The expression t - x' is the difference of a row vector and a column vector; in Matlab this creates a matrix whose \((i, j)\) entry is \(t_j - x_i\text{.}\) Subsequent operations proceed as above, and the result is a row vector of values \(p(t)\text{.}\)
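A minimal sketch of this broadcasting behavior (called implicit expansion, available since Matlab R2016b; earlier versions would use bsxfun), with made-up vectors:

```matlab
t = [10 20 30];     % 1-by-3 row vector of evaluation points
x = [1; 2];         % 2-by-1 column vector, like x' in the code above
d = t - x;          % 2-by-3 matrix with d(i,j) = t(j) - x(i):
% d = [ 9 19 29
%       8 18 28 ]
w = [1 1];          % sample weights, 1-by-2 row
s = w * d.^(-1);    % 1-by-3 row with s(j) = sum over i of w(i)/d(i,j)
```

The same pattern, with y.*w in place of w, produces the numerator of (21.2.5) for all entries of t at once.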