Section 6.2 Systems without free variables

What do we get when solving a system with A\b in Matlab? Let us first consider the case when rank(A) equals the number of columns n, so every column of A is a basic column and there are no free variables. When such a system is consistent, it has a unique solution, and A\b finds it. When the system is inconsistent, A\b returns the least-squares solution: the vector \(x\) for which the norm of the residual vector \(Ax-b\) is as small as possible. Recall that the norm of a vector \(v = (v_1, \dots, v_m)\) is

\begin{equation*} \|v\| = \sqrt{v_1^2+\cdots +v_m^2} \end{equation*}

In Matlab this norm can be computed in several ways:

  • sqrt(sum(v.^2))
  • sqrt(v'*v) when v is a column vector
  • sqrt(v*v') when v is a row vector
  • sqrt(dot(v, v))
  • norm(v)

The last way is easiest.
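
For example, all of these should agree (up to rounding) on a vector whose norm we know in advance, such as (3, 4, 12), whose norm is 13:

v = [3; 4; 12];         % column vector with norm sqrt(9+16+144) = 13
disp(sqrt(sum(v.^2)))
disp(sqrt(v'*v))        % works because v is a column vector
disp(sqrt(dot(v, v)))
disp(norm(v))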

Find the least-squares solution \(x\) and the norm of the residual vector \(Ax-b\) for the system with the matrix

\begin{equation*} A = \begin{pmatrix} 1 \amp 2 \\ 4 \amp 3 \\ 5\amp 6 \end{pmatrix} \end{equation*}

and with

\begin{equation*} b = \begin{pmatrix} 7 \\ 8 \\ 9\end{pmatrix} \end{equation*}
Answer
A = [1 2; 4 3; 5 6];
b = [7 8 9]';
x = A\b;            % least-squares solution, since the system is inconsistent
disp(x)
disp(norm(A*x-b))   % norm of the residual vector

It may be informative to add the line disp(A*x) and compare this vector to b.
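
Another quick check, not required here, uses a standard fact about least squares: the residual of the least-squares solution is orthogonal to every column of A, so A'*(A*x - b) should be essentially zero, and the normal equations should reproduce the same solution:

disp(A'*(A*x - b))      % orthogonality of the residual: entries near zero
disp((A'*A)\(A'*b))     % normal equations give the same x (up to rounding)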

A practical use of least-squares solutions is linear regression. Suppose we are given some data points \((x_k, y_k)\) and want to find the line \(y = cx+d\) that best fits these points. The idea is to imagine we could have a perfect fit:

\begin{align*} c x_1 + d \amp = y_1 \\ c x_2 + d \amp = y_2 \\ \cdots \amp = \cdots \\ c x_m + d \amp = y_m \end{align*}

This system can be written as \(Az=b\) where

\begin{equation*} A = \begin{pmatrix} x_1 \amp 1 \\ x_2 \amp 1 \\ \vdots \amp \vdots \\ x_m \amp 1 \end{pmatrix}, \quad b = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}, \quad z = \begin{pmatrix} c \\ d \end{pmatrix} \end{equation*}

This is an overdetermined system when we have more than two data points. Unless all \(x\)-values are the same (which should not happen), the rank of A is 2, so the system has no free variables. This means that A\b will find its least-squares solution: the vector with components \((c, d)\) for which the sum of the squares of the residuals \(cx_k+d - y_k\) is as small as possible. The resulting line is the line of best least-squares fit, and this process is known in statistics as linear regression.

Find the line of best fit (in the sense of least squares) to the six points with x-values (-1, 0, 2, 3, 4, 7) and y-values (9, 8, 5, 5, 3, -2). Plot both the line and the points.

Answer
x = [-1 0 2 3 4 7]';
y = [9 8 5 5 3 -2]';
A = [x x.^0];                        % columns correspond to the coefficients c and d
z = A\y;                             % least-squares solution: z = [c; d]
xx = linspace(min(x), max(x), 500)';
yy = [xx xx.^0]*z;                   % y-values of the fitted line
plot(xx, yy, 'b', x, y, 'r*')

Explanation: think of \(cx+d\) as \(cx + dx^0\), because this naturally leads to the matrix of the linear system: its columns are x and x.^0. The plot of the given data points created by plot(x, y, 'r*') consists of red asterisks not joined by a line. To plot the line of best fit (or, in general, a curve of best fit) we use many more points on the x-axis. These points are in the vector xx: they are evenly distributed over the same interval where the given data points lie. The way in which the y-coordinates of this line are computed may look strange. The matrix product [xx xx.^0]*z is logically the same as z(1)*xx + z(2): multiply the x-values by the coefficient \(c\) and add the constant term \(d\). Writing it in matrix form allows for easier generalization later. Note that the line yy = [xx xx.^0]*z, which uses the solution of the linear system, is consistent with A = [x x.^0], which created that system.
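
As a cross-check (not required by the exercise), Matlab's built-in polyfit performs the same least-squares fit when asked for a degree-1 polynomial: p(1) is the slope \(c\) and p(2) is the intercept \(d\).

p = polyfit(x, y, 1);   % fit a degree-1 polynomial by least squares
disp(p' - z)            % difference from our solution: entries near zero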