6.7. Under-determined Systems¶

As discussed in Systems of Linear Equations, under-determined systems of linear equations do not have a unique solution. If additional equations can not be found to make the system full rank, then one might try to find the set of points that satisfy the system of equations. Using Elimination, we can put the matrix into what is called reduced row echelon form or RREF, which will reveal the set of points where the system of equations is satisfied.

The word echelon just means resembling stair steps. A matrix is in row echelon form when:

  • All nonzero values in each column are above all zeros.
  • The leading coefficient of each row is strictly to the right of the leading coefficient in the row above it.

So when we perform Gauss-Jordan Elimination on an augmented matrix to make an upper-triangular matrix, we are putting the matrix into row echelon form.

An additional requirement for reduced row echelon form is:

  • Every leading coefficient must be 1, and must be the only nonzero in its column. In RREF, the first m (number of rows) columns of the matrix should look as close as is possible to an identity matrix.

Thus a full rank matrix system in augmented form has the identity matrix in the first m columns when in RREF. The last column is the solution to the system of equations.

>> Ab = [A b] Ab =     8     1     6    27     3     5     7    38     4     9     2    55 >> rref(Ab) ans =     1     0     0     2     0     1     0     5     0     0     1     1          

Now for an under-determined system:

>> A = randi(10, 3, 4) - randi(5, 3, 4) A =     6     0     4     0     1     1     9     8     5     9     5     5 >> rank(A) ans =     3 >> x = randi(5, 4, 1) x =     4     2     5     1 >> b = A*x b =     44     59     68 >> C = rref([A b]) C =     1.0000         0         0   -0.6091    3.3909          0    1.0000         0    0.3864    2.3864          0         0    1.0000    0.9136    5.9136          

Notice that the first three columns, for x_1, x_2, and x_3, each have only a single one. But the fourth column, for x_4, has values in each row. MATLAB found the fourth column to be a linear combination of the first three columns by the weights indicated. So the values of the first three variables, which are pivot columns, are fixed by the equation. We call the variables associated with the pivot columns basic variables. The non-pivot column variable, x_4, is called a free variable, meaning that it can take any value.

We don't have an equation for x_4 since it is a free variable, so we can just say that x_4 = x_4. We can also replace it with an independent scalar variable, a.

The new system of equations is:

\begin{array}{rl}    x_1 - 0.6091\,x_4 &= 3.3909 \\    x_2 + 0.3864\,x_4 &= 2.3864 \\    x_3 + 0.9136\,x_4 &= 5.9136 \\    x_4 &= x_4  \end{array}

Then solving for the unknown \bf{x} vector,

\begin{array}{rl}       x_1 &= 3.3909 + 0.6091\,a \\       x_2 &= 2.3864 - 0.3864\,a \\       x_3 &= 5.9136 - 0.9136\,a \\       x_4 &= x_4.  \end{array}

The set of solutions is a line. We can see this if we write the line equation in what is called parametric form.

\vector{x_1; x_2; x_3; x_4} =  \vector{3.3909; 2.3864; 5.9136; 0}  + a\,\vector{0.6091; -0.3864; -0.9136; 1}

Note

The solution to \mathbf{A}\bm{x} = \bm{b} is a combination of two solutions, a general solution, \bm{u} = \text{Null}(\mathbf{A}), and a particular solution, which is any vector that solves \mathbf{A}\bm{v} = \bm{b}. Let \bm{x} = \bm{u} + \bm{v}.

\mathbf{A}\bm{x} = \mathbf{A}\,(\bm{u} + \bm{v})  = \bm{0} + \bm{b} = \bm{b}

The general solution are the vectors that can be multiplied by any scalar. These vectors also form the basis vectors for the null space. The particular solution is the constant vector in the solution.

See Null Space.

Here is an example showing the null space solution and one point from the set of particular solutions.

>> A A =     3     3     4    -2     6     4    -3    -2     3     0     2     5 >> rank(A) ans =     3 >> u = null(A) u =    -0.4783     0.8120    -0.0890     0.3226 >> A*u ans =    1.0e-14 *          0    -0.1998    -0.0222  >> b b =     10      2     10 >> Z = rref([A b]) ans =     1.0000         0         0    1.4828    2.0920          0    1.0000         0   -2.5172   -1.2414          0         0    1.0000    0.2759    1.8621  % For the particular solution, let a = 1. % We need another row in Z to show the particular solution. >> Z(4,:) = zeros(1, 5); >> Z(4,4) = -1 Z =     1.0000         0         0    1.4828    2.0920          0    1.0000         0   -2.5172   -1.2414          0         0    1.0000    0.2759    1.8621          0         0         0   -1.0000         0  >> v = Z(:,5) - Z(:,4) v =     0.6092     1.2759     1.5862     1.0000 >> A*v ans =    10.0000     2.0000    10.0000  % Now for A(u + v) >> A*(u + v) ans =    10.0000     2.0000    10.0000          

Here is an example with two free variables. Notice that in this case, the last row becomes all zeros, so it is not part of the solution. Here we begin with the augmented matrix.

>> Ab = [A b] Ab =     4     2     6    10     3    12     8    -9    -1     7     5   -13     2     3     5     7     4     7    -2     5     3     1     3     3 >> C = rref(Ab) C =     1     0     1     2     0     3     0     1     1     1     0     3     0     0     0     0     1    -2     0     0     0     0     0     0          

\begin{array}{rcl}       x_1 &= & 3 - x_3  - 2\,x_4 \\       x_2 &= & 3 - x_3  - x_4 \\       x_3 &= & x_3 \\       x_4 &= & x_4 \\       x_5 &= & -2  \end{array}

The solution is a hyperplane with two degrees of freedom in x_3 = a and x_4 = b, which expressed in a parametric equation is:

\vector{x_1; x_2; x_3; x_4; x_5} =  \vector{3; 3; 0; 0; -2}  + a\,\vector{-1; -1; 1; 0; 0}  + b\,\vector{-2; -1; 0; 1; 0}

We can see a plot of how the dependent variables relate to each other:

[a, b] = meshgrid(linspace(-20,20), linspace(-20,20)); X1 = 3 - a - 2*b; X2 = 3 - a - b; X5 = -2*ones(100);  subplot(2, 2, 1) surf(a, b, X1) xlabel('a') ylabel('b') zlabel('X_1')  subplot(2, 2, 2) surf(a, b, X2) xlabel('a') ylabel('b') zlabel('X_2')  subplot(2, 2, 3) surf(a, b, X5) xlabel('a') ylabel('b') zlabel('X_5')  subplot(2, 2, 4) surf(X5, X1, X2) xlabel('X_5') ylabel('X_1') zlabel('X_2')