Linear Algebra

Linear Combinations of Vectors

Here we scale vectors and add them together to create what are known as linear combinations.

\[x \vec u + y \vec v= \vec b\]
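For a concrete case in two dimensions, take $ \vec u = (1, 0) $, $ \vec v = (0, 1) $, $ x = 3 $, and $ y = 2 $ (values chosen purely for illustration):

\[3 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}\]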

We can generalize to an arbitrary number of terms by renaming the variables. So instead of x and y, we could use $ c_1 $ and $ c_2 $, and instead of $ \vec u $ and $ \vec v $ we can use $ \vec v_1 $ and $ \vec v_2 $. This gives:

\[c_1 \vec v_1 + c_2 \vec v_2 + \ldots + c_n \vec v_n= \vec b\]

Sure, it looks more jargony, but it allows us to use the more compact summation notation:

\[\vec b = \sum_{i=1}^n c_i \vec v_i\]
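Here's a minimal NumPy sketch of that sum, using the placeholder coefficients and vectors from the 2D example above:

```python
import numpy as np

# Coefficients c_1..c_n and vectors v_1..v_n (illustrative values only).
c = np.array([3.0, 2.0])
v = np.array([[1.0, 0.0],   # v_1
              [0.0, 1.0]])  # v_2

# b = sum_i c_i * v_i
b = sum(c_i * v_i for c_i, v_i in zip(c, v))
print(b)  # [3. 2.]
```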

Einstein, while working on the theory of general relativity, found himself writing $ \sum $ so often that he came up with what is now called the “Einstein summation convention”, which allows you to drop the summation sign whenever the same index appears twice in the same term.

\[\vec b = c_i \vec v_i\]

This can make what would otherwise be very large equations more manageable but we won’t be needing it here.
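That said, the convention does show up in code: NumPy exposes it through `np.einsum`, where the repeated index `i` below plays the same role as the repeated index in the equation above (same illustrative vectors as before):

```python
import numpy as np

c = np.array([3.0, 2.0])      # coefficients c_i
V = np.array([[1.0, 0.0],
              [0.0, 1.0]])    # rows are the vectors v_i

# The repeated index i is summed over, giving b_j = sum_i c_i * V_ij
b = np.einsum("i,ij->j", c, V)
print(b)  # [3. 2.]
```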

A Different Perspective on Linear Systems from Vectors

We’ve all seen systems of linear equations in high school algebra like this one, where we have 3 equations and have to solve for the 3 unknowns.

\[\begin{align} x + 2y + 3z &= 14 \\ 4x + 5y + 6z &= 32 \\ 7x + 8y + 9z &= 50 \end{align}\]

It may surprise you to learn that this linear combination:

\[x \vec a + y \vec b + z \vec c = \vec d\]

and this system of linear equations:

\[\begin{align} a_1 x + b_1 y + c_1 z &= d_1 \\ a_2 x + b_2 y + c_2 z &= d_2 \\ a_3 x + b_3 y + c_3 z &= d_3 \end{align}\]

… are two different ways of describing the same system! Linear Combinations are fundamentally what linear algebra is all about.

For example, assume we have vectors: \(\vec a = (1,4,7), \vec b = (2,5,8), \vec c = (3,6,9)\) and \(\vec d = (14, 32, 50)\), then we can write:

\[\begin{align} x \vec a + y \vec b + z \vec c &= \vec d \\[10pt] x \begin{bmatrix} 1 \\ 4 \\ 7 \end{bmatrix} +y \begin{bmatrix} 2 \\ 5 \\ 8 \end{bmatrix} +z \begin{bmatrix} 3 \\ 6 \\ 9 \end{bmatrix} &= \begin{bmatrix} 14 \\ 32 \\ 50 \end{bmatrix} \\[10pt] \begin{bmatrix} x \\ 4 x \\ 7 x \end{bmatrix} + \begin{bmatrix} 2 y \\ 5 y \\ 8 y \end{bmatrix} + \begin{bmatrix} 3 z \\ 6 z \\ 9 z \end{bmatrix} &= \begin{bmatrix} 14 \\ 32 \\ 50 \end{bmatrix} \\[10pt] \begin{bmatrix} x + 2 y + 3 z \\ 4 x + 5 y + 6 z \\ 7 x + 8 y + 9 z \end{bmatrix} &= \begin{bmatrix} 14 \\ 32 \\ 50 \end{bmatrix} \end{align}\]

This is equivalent to the 3 equations we started with:

\[\begin{align} x + 2y + 3z &= 14 \\ 4x + 5y + 6z &= 32 \\ 7x + 8y + 9z &= 50 \end{align}\]

So we have two different perspectives we can view this system from:

  • column perspective - it’s a series of column vectors that are being scaled and added together
  • row perspective - it’s a series of linear equations, lines/planes/hyperplanes, that (may) intersect
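Here's a minimal NumPy sketch contrasting the two perspectives, using the example vectors from above and trial values for x, y, z (which happen to satisfy the system, so both views reproduce $ \vec d $):

```python
import numpy as np

a = np.array([1.0, 4.0, 7.0])
b = np.array([2.0, 5.0, 8.0])
c = np.array([3.0, 6.0, 9.0])

x, y, z = 1.0, 2.0, 3.0  # trial values

# Column perspective: scale the column vectors and add them together.
col_view = x * a + y * b + z * c

# Row perspective: evaluate each equation's left-hand side separately.
row_view = np.array([
    1 * x + 2 * y + 3 * z,
    4 * x + 5 * y + 6 * z,
    7 * x + 8 * y + 9 * z,
])

print(col_view)  # [14. 32. 50.]
print(row_view)  # [14. 32. 50.]
```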

Matrix Form

A third form, the matrix form, allows us to describe a linear system even more succinctly while at the same time giving us new tools to reason about the system. In the same way that a vector is just a group of related scalars, a matrix is really just a group of related vectors. (Later we’ll find that a tensor is just a group of related matrices.) So, knowing this, we can concatenate the three column vectors $ \vec a $, $ \vec b $, and $ \vec c $ to produce a matrix,

\[M = \begin{bmatrix} \vec a & \vec b & \vec c \end{bmatrix} = \begin{bmatrix} \begin{pmatrix} 1 \\ 4 \\ 7 \end{pmatrix} \begin{pmatrix} 2 \\ 5 \\ 8 \end{pmatrix} \begin{pmatrix} 3 \\ 6 \\ 9 \end{pmatrix} \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}\]
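A sketch of this concatenation in NumPy (`np.column_stack` treats each 1-D array as a column):

```python
import numpy as np

a = np.array([1, 4, 7])
b = np.array([2, 5, 8])
c = np.array([3, 6, 9])

# Concatenate the column vectors side by side to form the matrix M.
M = np.column_stack([a, b, c])
print(M)
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]
```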

Now we can rewrite the linear system in a very simple form that exposes its essence (following convention, we’ll write the right-hand side as $ \vec b $ from here on; this is a new $ \vec b $, not the column vector $ \vec b $ from the example above):

\[M \vec x = \vec b\]

By expanding and performing the matrix/vector multiplication, we can recreate the form that we had above.

\[\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x+2y+3z\\ 4x+5y+6z \\ 7x+8y+9z \end{bmatrix} = \begin{bmatrix} 14 \\ 32 \\ 50 \end{bmatrix}\]
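As a quick numerical check, here is the matrix/vector multiplication in NumPy, using the same trial values as before:

```python
import numpy as np

M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

xyz = np.array([1.0, 2.0, 3.0])  # trial values for x, y, z

# Matrix/vector multiplication reproduces the left-hand sides of the equations.
print(M @ xyz)  # [14. 32. 50.]
```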

Solving the Matrix Form $ M \vec x = \vec b $

The notation $ M \vec x = \vec b $ suggests a way that we might solve for $ \vec x $. If these were simple scalar variables, we could just divide both sides by $ M $ to get $ \vec x = \frac {1} {M} \vec b $. Could we do something similar for matrices and vectors? Unfortunately, matrices can’t be divided; there is no divide operation for them. But we can do something similar … multiply by the inverse! So what if we could do this?

\[\vec x = M^{-1} \vec b\]

Hmmm… that seems almost like cheating - for ordinary scalar numbers, $ \frac {1} {x} $ and $ x^{-1} $ are just two ways to write the same thing! So if we can figure out how to compute the inverse of a matrix, then we can multiply both sides by it:

\[\begin{align} M \vec x &= \vec b \\ (M^{-1} M) \vec x &= M^{-1}\vec b \\ I \vec x &= M^{-1}\vec b \\ \vec x &= M^{-1}\vec b \end{align}\]

Here we’re relying on $ M^{-1}M = 1 $ as we can with scalars. Unfortunately, what works great for scalars doesn’t necessarily make sense for matrices: there would be a matrix on the left and a scalar 1 on the right. It’s apples and oranges. We need something for matrices that plays roughly the same role that $ 1 $ does for scalars. For scalars, we call “1” the “multiplicative identity”, which is to say $ 1x=x $. For matrices, we will call this the “identity matrix”, $ I $. Multiplying a matrix or a vector by this identity matrix leaves the value unchanged, just like multiplying by 1 does for scalars.

Identity matrix:

\[I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = M^{-1}M\]
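A small NumPy check of this behavior (`np.eye(3)` builds the 3×3 identity matrix):

```python
import numpy as np

I = np.eye(3)                    # 3x3 identity matrix
v = np.array([14.0, 32.0, 50.0])
M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

print(I @ v)                     # unchanged: [14. 32. 50.]
print(np.array_equal(I @ M, M))  # True: I M = M
```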

So now we know that to solve the system of equations, we just need to find the inverse of the matrix $ M $ and multiply it by the vector $ \vec b $.
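As a sketch of that recipe in NumPy, here is a hypothetical system whose coefficient matrix is invertible (the matrix `A` and right-hand side `b` below are made up purely for illustration; `np.linalg.inv` computes the inverse, and the approach only works when the matrix is non-singular):

```python
import numpy as np

# A made-up invertible matrix A and right-hand side b, purely for illustration.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([4.0, 5.0, 6.0])

A_inv = np.linalg.inv(A)  # requires A to be invertible (non-singular)
x = A_inv @ b             # x = A^{-1} b
print(x)                  # approximately [6. 15. -23.]
print(A @ x)              # reproduces b (up to rounding): [4. 5. 6.]
```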

Finding the Inverse of a Matrix

To be continued