
Section 6.5 Examples and questions

These additional examples review the topics covered so far. When reading each example, try to find your own solution before looking at the answer. There are also questions for reviewing the concepts of this section.

Using the data of Example 6.2.2, find and plot the parabola \(y = cx^2+dx+e\) that fits the data best in the sense of least squares.

Answer
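x = [-1, 0, 2, 3, 4, 7]';
y = [9, 8, 5, 5, 3, 1]';
A = [x.^2, x, x.^0];                  % columns: x^2, x, and a column of ones
z = A\y;                              % least-squares coefficients [c; d; e]
xx = linspace(min(x), max(x), 500)';  % fine grid for a smooth curve
yy = [xx.^2, xx, xx.^0]*z;            % parabola evaluated on the grid
plot(xx, yy, 'b', x, y, 'r*')         % blue curve, red data points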

Explanation: the only modification that we need is to add a column with the squares of x-values. In general, this process allows us to fit a function of the form \(c_1 f_1(x) + c_2 f_2(x) + \cdots + c_n f_n(x) \) where \(f_1, f_2, \dots, f_n\) are given functions and \(c_1, c_2, \dots, c_n\) are coefficients (parameters) to be determined.
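As a minimal sketch of this general process, the following code fits \(c_1 f_1(x) + c_2 f_2(x) + c_3 f_3(x)\) to the same data. The choice \(f_1(x) = 1\text{,}\) \(f_2(x) = x\text{,}\) \(f_3(x) = e^{-x}\) and the variable names `basis`, `c`, `residual` are assumptions made only for illustration; any other list of given functions could be used instead.

```matlab
% Same data as in the answer above
x = [-1, 0, 2, 3, 4, 7]';
y = [9, 8, 5, 5, 3, 1]';

% Basis functions f_1, f_2, f_3 (an illustrative choice, not from the text)
basis = {@(t) ones(size(t)), @(t) t, @(t) exp(-t)};

% Column j of A holds f_j evaluated at the data points
A = zeros(numel(x), numel(basis));
for j = 1:numel(basis)
    A(:, j) = basis{j}(x);
end

c = A\y;                    % least-squares coefficients c_1, c_2, c_3
residual = norm(A*c - y);   % size of the misfit at the data points
disp(c')
disp(residual)
```

Only the list `basis` has to change in order to fit a different family of functions; the rest of the computation stays the same because the unknown coefficients enter linearly.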

Modify Example 6.3.2 by replacing the matrix entry 1 with 1.000001. How does the solution change? Repeat with the original matrix, this time replacing the entry 9 with 9.000001. How does the solution change now?

Solution

A = [1.000001 2 3; 4 5 6; 7 8 9];
b = [10; 11; 12];
leads to A\b being [0.0000; -9.0000; 9.3333]. In contrast, with
A = [1 2 3; 4 5 6; 7 8 9.000001];
we get the solution [-9.3333; 9.6667; 0.0000] which is quite different. Note that Matlab shows no warnings in either case. These systems do not have a free variable: the rank of A is 3. Yet, the solution is very sensitive to small changes of the system, because the matrix is close to being singular (or degenerate).
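One way to quantify "close to being singular" is the condition number, which Matlab computes with `cond`. The sketch below is an addition to the solution rather than part of the original example; the names `A0`, `A1`, `A2` are chosen here for illustration. It compares the two perturbed matrices and measures how far apart the two solutions are.

```matlab
A0 = [1 2 3; 4 5 6; 7 8 9];      % unperturbed matrix (singular)
b = [10; 11; 12];

A1 = A0;  A1(1,1) = 1.000001;    % first perturbation
A2 = A0;  A2(3,3) = 9.000001;    % second perturbation

% Large condition numbers signal sensitivity to small changes
fprintf('cond(A1) = %g\n', cond(A1));
fprintf('cond(A2) = %g\n', cond(A2));

% Distance between the two solutions, despite tiny changes in the matrix
disp(norm(A1\b - A2\b))
```

A large condition number does not trigger a warning by itself, which is consistent with Matlab staying silent in both cases above.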

The simplest example of an inconsistent linear system is a system of two equations with one variable: \(x = b_1\) and \(x = b_2\) where \(b_1, b_2\) are two unequal real numbers. What is the least-squares solution of this system, and why? (Try to answer without Matlab, and then check the conclusion with Matlab.)

Linear algebra says that an \(n\times n\) matrix \(A\) is singular (non-invertible) if and only if \(\det A = 0\text{,}\) if and only if \(\operatorname{rank} A \lt n\text{.}\) But in computational practice these two tests for invertibility can give different results. Consider the diagonal matrix with entries \(10^{16}\) and \(1\text{.}\)

A = [1e16 0; 0 1];
disp(det(A))
disp(rank(A))

Matlab says the determinant is \(10^{16}\text{,}\) which makes sense: it is the product of the diagonal entries. Yet it also says the rank is \(1\text{.}\) Why?

How different is the best-fit parabola in Example 6.5.1 from the line of best fit? If you have to choose between fitting a line or a parabola to this data set, what would you use?

Later we will consider the issue of model selection systematically: how to decide which model is more appropriate (for example, linear or quadratic).