
Matrix addition has the following properties:

o Commutativity of matrix addition. The order of addition does not matter. A + B = B + A.

o Associativity of matrix addition. The grouping of matrices during addition does not matter. (A + B) + C = A + (B + C).

o Existence of an identity element for matrix addition. There exists a zero matrix, having all 0 entries and simply denoted as 0, which is the identity element. For any matrix A, A + 0 = A.

o Existence of additive inverses for matrix addition. For every matrix A there is a matrix -A such that A + (-A) = 0. The entries of -A are -a_ij.

As with vectors, we can also multiply a matrix by a scalar.

A.2 Matrices 693

Definition A.4 (Scalar Multiplication of a Matrix). The product of a scalar α and a matrix A is the matrix B = αA, whose entries are given by the following equation.

b_ij = α a_ij   (A.7)

Scalar multiplication has properties that are very similar to those of multiplying a vector by a scalar.

o Associativity of scalar multiplication. The order of multiplication by two scalars does not matter. α(βA) = (αβ)A.

o Distributivity of scalar addition over multiplication of a scalar by a matrix. Adding two scalars and then multiplying the sum by a matrix is the same as multiplying each scalar times the matrix and then adding the two resultant matrices. (α + β)A = αA + βA.

o Distributivity of scalar multiplication over matrix addition. Adding two matrices and then multiplying the sum by a scalar is the same as multiplying each matrix by the scalar and then adding. α(A + B) = αA + αB.

o Existence of scalar identity. If α = 1, then for any matrix A, αA = A.

None of the previous properties should be surprising, since we can think of a matrix as being composed of row or column vectors, and hence, matrix addition or multiplication by a scalar amounts to adding corresponding row or column vectors or multiplying them by a scalar.
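These properties are easy to check numerically. The following sketch assumes NumPy and uses a pair of small, arbitrary matrices (the values themselves are illustrative, not from the text):

```python
# Numerical check of the matrix addition and scalar multiplication
# properties listed above, using NumPy.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.array([[5.0, 4.0], [2.0, 9.0]])
alpha, beta = 2.0, 3.0

# Commutativity and associativity of matrix addition.
assert np.allclose(A + B, B + A)
assert np.allclose((A + B) + A, A + (B + A))

# The zero matrix is the additive identity; -A is the additive inverse.
O = np.zeros_like(A)
assert np.allclose(A + O, A)
assert np.allclose(A + (-A), O)

# Scalar multiplication: associativity, the two distributive laws,
# and the scalar identity.
assert np.allclose(alpha * (beta * A), (alpha * beta) * A)
assert np.allclose((alpha + beta) * A, alpha * A + beta * A)
assert np.allclose(alpha * (A + B), alpha * A + alpha * B)
assert np.allclose(1.0 * A, A)
```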

A.2.3 Matrices: Multiplication

We can define a multiplication operation for matrices. We begin by defining multiplication between a matrix and a vector.

Definition A.5 (Multiplication of a Matrix by a Column Vector). The product of an m by n matrix A and an n by 1 column matrix u is the m by 1 column matrix v = Au, whose entries are given by the following equation.

v_i = a_i* · u^T   (A.8)

In other words, we take the dot product of the transpose of u with each row vector of A. In the following example, notice that the number of rows in u must be the same as the number of columns of A.

[ 3  1 ] [ 5 ]   [ 17 ]
[ 1  2 ] [ 2 ] = [  9 ]
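Definition A.5 can be sketched in NumPy (assumed here purely for illustration), forming each entry of v as the dot product of a row of A with u:

```python
# Matrix-by-column-vector multiplication, entry by entry, per
# Definition A.5: v_i is the dot product of row i of A with u.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])  # 2 by 2 matrix
u = np.array([5.0, 2.0])                # column vector of length 2

v = np.array([np.dot(A[i, :], u) for i in range(A.shape[0])])

assert np.allclose(v, A @ u)  # matches NumPy's built-in product
print(v)  # [17.  9.]
```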

We can similarly define the multiplication of a matrix by a row vector on the left side.

Definition A.6 (Multiplication of a Matrix by a Row Vector). The product of a 1 by m row matrix u and an m by n matrix A is the 1 by n row matrix v = uA, whose entries are given by the following equation.

v_i = u · (a_*i)^T   (A.9)

In other words, we take the dot product of the row vector with the transpose of each column vector of A.

We define the product of two matrices as an extension to the above idea.

Definition A.7. The product of an m by n matrix A and an n by p matrix B is the m by p matrix C = AB, whose entries are given by the equation

c_ij = a_i* · (b_*j)^T   (A.10)

In words, the ij-th entry of C is the dot product of the i-th row vector of A and the transpose of the j-th column vector of B.
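This entry formula can be sketched directly in NumPy (the library and the small example matrices are illustrative assumptions), with the nested loops mirroring Definition A.7:

```python
# General matrix product, entry by entry: c_ij is the dot product of
# row i of A with column j of B, as in Definition A.7.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # m by n
B = np.array([[5.0, 4.0], [2.0, 9.0]])   # n by p
m, p = A.shape[0], B.shape[1]

C = np.zeros((m, p))
for i in range(m):
    for j in range(p):
        C[i, j] = np.dot(A[i, :], B[:, j])   # c_ij = a_i* . b_*j

assert np.allclose(C, A @ B)  # agrees with the built-in product
```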

Matrix multiplication has the following properties:

o Associativity of matrix multiplication. The grouping of matrices during multiplication does not matter. (AB)C = A(BC).

o Distributivity of matrix multiplication. Matrix multiplication is distributive with respect to matrix addition. A(B + C) = AB + AC and (B + C)A = BA + CA.

For example,

[ 3  1 ] [ 5  4 ]   [ 17  21 ]
[ 1  2 ] [ 2  9 ] = [  9  22 ]


o Existence of an identity element for matrix multiplication. If I_p is the p by p matrix with 1's on the diagonal and 0's elsewhere, then for any m by n matrix A, AI_n = A and I_m A = A. (Note that the identity matrix is an example of a diagonal matrix, which is a matrix whose off-diagonal entries are all 0, i.e., a_ij = 0 if i ≠ j.)

In general, matrix multiplication is not commutative, i.e., AB ≠ BA.
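The following NumPy sketch (library and example values assumed for illustration) checks these properties, including a pair of matrices whose products AB and BA differ:

```python
# Check associativity, both distributive laws, the identity element,
# and non-commutativity of matrix multiplication.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.array([[5.0, 4.0], [2.0, 9.0]])
C = np.array([[1.0, 0.0], [4.0, 2.0]])
I = np.eye(2)

assert np.allclose((A @ B) @ C, A @ (B @ C))        # associativity
assert np.allclose(A @ (B + C), A @ B + A @ C)      # left distributivity
assert np.allclose((B + C) @ A, B @ A + C @ A)      # right distributivity
assert np.allclose(A @ I, A) and np.allclose(I @ A, A)  # identity element

# For these particular matrices, AB and BA are different, showing that
# matrix multiplication is not commutative in general.
assert not np.allclose(A @ B, B @ A)
```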

A.2.4 Linear Transformations and Inverse Matrices

If we have an n by 1 column vector u, then we can view the multiplication of an m by n matrix A by this vector on the right as a transformation of u into an m-dimensional column vector v = Au. Similarly, if we multiply A by a (row) vector u = [u1, ..., um] on the left, then we can view this as a transformation of u into an n-dimensional row vector v = uA. Thus, we can view any m by n matrix A as a function that maps one vector space onto another.

In many cases, the transformation (matrix) can be described in easily understood terms.

o A scaling matrix leaves the direction of the vector unchanged, but changes its length. This is equivalent to multiplying by a matrix that is the identity matrix multiplied by a scalar.

o A rotation matrix changes the direction of a vector, but leaves the magnitude of the vector unchanged. This amounts to a change of coordinate system.

o A reflection matrix reflects a vector across one or more coordinate axes. This would be equivalent to multiplying some of the entries of the vector by -1, while leaving the other entries unchanged.

o A projection matrix takes vectors into a lower-dimensional subspace. The simplest example is the modified identity matrix, where one or more of the 1's on the diagonal have been changed into 0's. Such a matrix eliminates the vector components corresponding to those zero entries, while preserving all others.
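The four transformation types can be sketched in NumPy with small 2-dimensional examples (the specific matrices and vector are illustrative assumptions):

```python
# Scaling, rotation, reflection, and projection matrices in 2 dimensions.
import numpy as np

u = np.array([1.0, 1.0])

scale = 3.0 * np.eye(2)                    # identity times a scalar
theta = np.pi / 2                          # a 90-degree rotation
rotate = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
reflect = np.diag([-1.0, 1.0])             # reflect across the y-axis
project = np.diag([1.0, 0.0])              # eliminate the second component

# Scaling changes length but not direction.
assert np.allclose(scale @ u, 3.0 * u)
# Rotation preserves the magnitude (2-norm) of the vector.
assert np.isclose(np.linalg.norm(rotate @ u), np.linalg.norm(u))
# Reflection flips the sign of selected entries.
assert np.allclose(reflect @ u, [-1.0, 1.0])
# Projection zeroes out components, mapping into a lower-dimensional subspace.
assert np.allclose(project @ u, [1.0, 0.0])
```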

Of course, a single matrix can do two kinds of transformations at once, e.g., scaling and rotation.

Following are a few properties of matrices when viewed as functions that map vectors from one vector space to another.


o Matrices are linear transformations, i.e., A(αu + βv) = αAu + βAv and (αu + βv)A = αuA + βvA.

o The set of all transformed row vectors of a matrix A is called the row space of A because the row vectors of the matrix, or some subset of them, form a basis for the subspace of transformed row vectors. This is evident from the following equation, which expresses the product of a 1 by m row vector u = [u1, ..., um] and an m by n matrix A as a linear combination of the rows of the matrix.

v = uA = Σ_{i=1}^{m} u_i a_i*   (A.11)

The dimension of the row space tells us the number of linearly independent rows of A.

o The set of all transformed column vectors is called the column space of A. The column vectors of the matrix, or some subset of them, form a basis for the subspace of transformed column vectors. This is clear from the following equation, which expresses the product of an n by 1 column vector u = [u1, ..., un]^T and an m by n matrix A as a linear combination of the columns of the matrix.

v = Au = Σ_{j=1}^{n} u_j a_*j   (A.12)

The dimension of the column space tells us the number of linearly independent columns of A.

o The left nullspace is the set of row vectors that the matrix maps to 0.

o The right nullspace (or more commonly, just nullspace) is the set of column vectors that the matrix maps to 0.

Note that the rank of a matrix is the minimum of the dimensionality of the row space and column space and is often used to characterize a matrix. For instance, if we take a single 1 by n row vector and duplicate it m times to create an m by n matrix, we would only have a matrix of rank 1.
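The rank-1 construction just described can be checked in NumPy (library and values assumed for illustration):

```python
# A matrix built by duplicating one row has rank 1; a matrix with
# linearly independent rows has full rank.
import numpy as np

row = np.array([1.0, 4.0, 2.0])        # a single 1 by n row vector
A = np.tile(row, (5, 1))               # duplicate it m = 5 times
assert np.linalg.matrix_rank(A) == 1   # all rows are dependent: rank 1

B = np.array([[3.0, 1.0], [1.0, 2.0]])
assert np.linalg.matrix_rank(B) == 2   # independent rows: full rank
```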

A question of practical and theoretical importance is whether matrices, like real numbers, have multiplicative inverses. First, we note that because of the nature of matrix multiplication (i.e., dimensions have to match), a matrix must be square if it is to have an inverse matrix. Thus, for an m by m matrix A, we are asking if we can find a matrix A^-1 such that AA^-1 = A^-1 A = I_m. The answer is that some square matrices have inverses and some do not.

More abstractly, an m by m matrix has an inverse only if both of its nullspaces contain only the 0 vector, or, equivalently, if the row and column spaces are both of dimension m. (This is equivalent to the rank of the matrix being m.) Conceptually, an m by m matrix has an inverse if and only if it maps every non-zero m-dimensional row (column) vector onto a unique, non-zero m-dimensional row (column) vector.

The existence of an inverse matrix is important when solving various matrix equations.
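A short NumPy sketch (the library and example values are assumptions) showing an invertible matrix, a solution of Au = b via the inverse, and a rank-deficient matrix that has no inverse:

```python
# Inverse matrices: a full-rank matrix is invertible; a rank-1 matrix
# has a non-trivial nullspace and is singular.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # rank 2, so invertible
A_inv = np.linalg.inv(A)
assert np.allclose(A @ A_inv, np.eye(2))
assert np.allclose(A_inv @ A, np.eye(2))

# Solving Au = b via the inverse (np.linalg.solve is preferred in practice).
b = np.array([17.0, 9.0])
x = A_inv @ b
assert np.allclose(A @ x, b)

# A rank-1 matrix maps some non-zero vectors to 0 and has no inverse.
S = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(S)
except np.linalg.LinAlgError:
    pass  # singular matrix, as expected
```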

A.2.5 Eigenvalue and Singular Value Decomposition

We now discuss a very important area of linear algebra: eigenvalues and eigenvectors. Eigenvalues and eigenvectors, along with the related concepts of singular values and singular vectors, capture the structure of matrices by allowing us to factor or decompose matrices and express them in a standard format. For that reason, these concepts are useful in the solution of mathematical equations and for dimensionality and noise reduction. We begin with the definition of eigenvalues and eigenvectors.

Definition A.8 (Eigenvectors and Eigenvalues). The eigenvalues and eigenvectors of an n by n matrix A are, respectively, the scalar values λ and the vectors u that are solutions to the following equation.

Au = λu   (A.13)

In other words, eigenvectors are the vectors that are unchanged, except for magnitude, when multiplied by A. The eigenvalues are the scaling factors. This equation can also be written as (A - λI)u = 0.
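A quick NumPy check (library and example matrix assumed) that each computed eigenpair satisfies Au = λu:

```python
# Verify the defining equation Au = lambda * u for every eigenpair.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # symmetric, so eigenvalues are real
eigvals, eigvecs = np.linalg.eig(A)      # columns of eigvecs are the u's

for k in range(len(eigvals)):
    u = eigvecs[:, k]
    assert np.allclose(A @ u, eigvals[k] * u)   # Au = lambda u
```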

For square matrices, it is possible to decompose the matrix using eigenvalues and eigenvectors.

Theorem A.1. Assume that A is an n by n matrix with n independent (orthogonal) eigenvectors, u1, ..., un, and n corresponding eigenvalues, λ1, ..., λn. Let U be the matrix whose columns are these eigenvectors, i.e., U = [u1, ..., un], and let Λ be a diagonal matrix whose diagonal entries are the λi, 1 ≤ i ≤ n. Then A can be expressed as

A = UΛU^-1   (A.14)


Thus, A can be decomposed into a product of three matrices. U is known as the eigenvector matrix and Λ as the eigenvalue matrix.
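The decomposition of equation (A.14) can be sketched in NumPy (assumed here) for a small symmetric matrix, whose eigenvectors are guaranteed to be independent:

```python
# Eigenvalue decomposition A = U Lambda U^{-1}, equation (A.14).
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
eigvals, U = np.linalg.eig(A)    # U is the eigenvector matrix
Lam = np.diag(eigvals)           # Lambda is the eigenvalue matrix

# Reconstructing A from its eigenvector and eigenvalue matrices.
assert np.allclose(A, U @ Lam @ np.linalg.inv(U))
```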

More generally, an arbitrary matrix can be decomposed in a similar way. Specifically, any m by n matrix A can be factored into the product of three matrices as described by the following theorem.

Theorem A.2. Assume that A is an m by n matrix. Then A can be expressed as follows:

A = UΣV^T   (A.15)

where U is m by m, Σ is m by n, and V is n by n. U and V are orthonormal matrices, i.e., their columns are of unit length and are mutually orthogonal. Thus, UU^T = I_m and VV^T = I_n. Σ is a diagonal matrix whose diagonal entries are non-negative and are sorted so that the larger entries appear first, i.e., σ_i,i ≥ σ_i+1,i+1.

The column vectors of V, v1, ..., vn, are the right singular vectors, while the columns of U are the left singular vectors. The diagonal elements of Σ, the singular value matrix, are typically written as σ1, ..., σn and are called the singular values of A. (This use of σ should not be confused with the use of σ to represent the standard deviation of a variable.) There are at most rank(A) ≤ min(m, n) non-zero singular values.

It can be shown that the eigenvectors of A^T A are the right singular vectors (i.e., the columns of V), while the eigenvectors of AA^T are the left singular vectors (i.e., the columns of U). The non-zero eigenvalues of A^T A and AA^T are the σ_i^2, i.e., the squares of the singular values. Indeed, the eigenvalue decomposition of a square matrix can be regarded as a special case of singular value decomposition.
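A NumPy sketch (library and example values assumed) of the singular value decomposition and its relationship to the eigenvalues of A^T A:

```python
# Singular value decomposition A = U Sigma V^T, equation (A.15), and the
# connection between squared singular values and eigenvalues of A^T A.
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0], [0.0, 4.0]])   # m = 3, n = 2
U, s, Vt = np.linalg.svd(A)   # s holds the singular values, largest first

# Rebuild the m by n diagonal matrix Sigma and reconstruct A.
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)
assert np.allclose(A, U @ Sigma @ Vt)

# The squared singular values equal the eigenvalues of A^T A.
eigvals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]   # descending order
assert np.allclose(eigvals[:len(s)], s ** 2)
```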