Math 1600A Lecture 31, Section 2, 22 Nov 2013

$ \newcommand{\bdmat}[1]{\left|\begin{array}{#1}} \newcommand{\edmat}{\end{array}\right|} \newcommand{\bmat}[1]{\left[\begin{array}{#1}} \newcommand{\emat}{\end{array}\right]} \newcommand{\coll}[2]{\bmat{r} #1 \\ #2 \emat} \newcommand{\ccoll}[2]{\bmat{c} #1 \\ #2 \emat} \newcommand{\colll}[3]{\bmat{r} #1 \\ #2 \\ #3 \emat} \newcommand{\ccolll}[3]{\bmat{c} #1 \\ #2 \\ #3 \emat} \newcommand{\collll}[4]{\bmat{r} #1 \\ #2 \\ #3 \\ #4 \emat} \newcommand{\ccollll}[4]{\bmat{c} #1 \\ #2 \\ #3 \\ #4 \emat} \newcommand{\colllll}[5]{\bmat{r} #1 \\ #2 \\ #3 \\ #4 \\ #5 \emat} \newcommand{\ccolllll}[5]{\bmat{c} #1 \\ #2 \\ #3 \\ #4 \\ #5 \emat} \newcommand{\red}[1]{{\color{red}#1}} \newcommand{\lra}[1]{\mbox{$\xrightarrow{#1}$}} \newcommand{\rank}{\textrm{rank}} \newcommand{\row}{\textrm{row}} \newcommand{\col}{\textrm{col}} \newcommand{\null}{\textrm{null}} \newcommand{\nullity}{\textrm{nullity}} \renewcommand{\Re}{\operatorname{Re}} \renewcommand{\Im}{\operatorname{Im}} \renewcommand{\Arg}{\operatorname{Arg}} \renewcommand{\arg}{\operatorname{arg}} \newcommand{\adj}{\textrm{adj}} \newcommand{\mystack}[2]{\genfrac{}{}{0}{0}{#1}{#2}} \newcommand{\mystackthree}[3]{\mystack{\mystack{#1}{#2}}{#3}} \newcommand{\qimplies}{\quad\implies\quad} \newcommand{\qtext}[1]{\quad\text{#1}\quad} \newcommand{\qqtext}[1]{\qquad\text{#1}\qquad} \newcommand{\svec}[1]{\,\vec{#1}} \newcommand{\query}[1]{\toggle{\text{?}\vphantom{#1}}{#1}\endtoggle} \newcommand{\smallquery}[1]{\toggle{\text{?}}{#1}\endtoggle} $

Announcements:

Today we finish 4.4 and cover the Markov chains part of Section 4.6. We are not covering Section 4.5 or the rest of 4.6 (which contain many interesting applications!). Read Section 5.1 for Monday. Work through the recommended homework questions; some have been updated!

Tutorials: Quiz 6 is next week. It covers Section 4.4 and the Markov chains part of 4.6.
Office hour: Monday, 1:30-2:30, MC103B.
Help Centers: Monday-Friday 2:30-6:30 in MC 106.

Review of Section 4.4

Definition: Let $A$ and $B$ be $n \times n$ matrices. We say that $A$ is similar to $B$ if there is an invertible matrix $P$ such that $P^{-1} A P = B$. When this is the case, we write $A \sim B$.

Theorem 4.22: Let $A$ and $B$ be similar matrices. Then:
a. $\det A = \det B$
b. $A$ is invertible iff $B$ is invertible.
c. $A$ and $B$ have the same rank.
d. $A$ and $B$ have the same characteristic polynomial.
e. $A$ and $B$ have the same eigenvalues.

Definition: $A$ is diagonalizable if it is similar to some diagonal matrix.

If $A$ is similar to a diagonal matrix $D$, then $D$ must have the eigenvalues of $A$ on the diagonal. But how to find $P$?

Theorem 4.23: Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors.

More precisely, there exist an invertible matrix $P$ and a diagonal matrix $D$ with $P^{-1}AP = D$ if and only if the columns of $P$ are $n$ linearly independent eigenvectors of $A$ and the diagonal entries of $D$ are the corresponding eigenvalues in the same order.

This theorem is one of the main reasons we want to be able to find eigenvectors of a matrix. Moreover, the more eigenvectors the better, so this motivates allowing complex eigenvectors.
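
Aside (not from the text): here is a minimal numerical sketch of Theorem 4.23 in Python/NumPy, using a small matrix I chose just for illustration. We put two linearly independent eigenvectors into the columns of $P$ and check that $P^{-1}AP$ comes out diagonal, with the eigenvalues in the matching order.

```python
import numpy as np

# Small example chosen just for illustration (not from the lecture):
# A has eigenvalues 5 and 2, with eigenvectors (1,1) and (1,-2).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Put the two linearly independent eigenvectors into the columns of P.
P = np.array([[1.0,  1.0],
              [1.0, -2.0]])

# Then P^{-1} A P should be diagonal, with 5 and 2 in the matching order.
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))   # [[5. 0.]
                         #  [0. 2.]]
```

Swapping the columns of $P$ swaps the diagonal entries of $D$, which is what "in the same order" means in the theorem.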

Theorem 4.24: If $\lambda_1, \ldots, \lambda_k$ are distinct eigenvalues of $A$ and, for each $i$, $\cB_i$ is a basis for the eigenspace $E_{\lambda_i}$, then the union of the $\cB_i$'s is a linearly independent set.

Combining Theorems 4.23 and 4.24 gives the following important consequence:

Theorem: An $n \times n$ matrix is diagonalizable if and only if the sum of the geometric multiplicities of the eigenvalues is $n$.

In particular:

Theorem 4.25: If $A$ is an $n \times n$ matrix with $n$ distinct eigenvalues, then $A$ is diagonalizable.

So it is important to understand the geometric multiplicities better. Here is a helpful result:

Lemma 4.26: If $\lambda_1$ is an eigenvalue of an $n \times n$ matrix $A$, then $$ \text{geometric multiplicity of } \lambda_1 \leqslant \text{algebraic multiplicity of } \lambda_1 $$

We'll prove this in a minute. First, let's look at what it implies:

Let $A$ be an $n \times n$ matrix with distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$. Let their geometric multiplicities be $g_1, g_2, \ldots, g_k$ and their algebraic multiplicities be $a_1, a_2, \ldots, a_k$. We know $$ g_i \leqslant a_i \qtext{for each $i$} $$ and so $$ g_1 + \cdots + g_k \leqslant a_1 + \cdots + a_k \leqslant n $$ So the only way to have $g_1 + \cdots + g_k = n$ is to have $g_i = a_i$ for each $i$ and $a_1 + \cdots + a_k = n$.

This gives the main theorem of the section:

Theorem 4.27 (The Diagonalization Theorem): Let $A$ be an $n \times n$ matrix with distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$. Let their geometric multiplicities be $g_1, g_2, \ldots, g_k$ and their algebraic multiplicities be $a_1, a_2, \ldots, a_k$. Then the following are equivalent:
a. $A$ is diagonalizable.
b. $g_1 + \cdots + g_k = n$.
c. $g_i = a_i$ for each $i$ $\red{\text{and } a_1 + \cdots + a_k = n}$.

Note: This is stated incorrectly in the text. The red part must be added unless you are working over $\C$, in which case it is automatic that $a_1 + \cdots + a_k = n$. With the way I have stated it, it is correct over $\R$ or over $\C$.
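
Aside: here is a quick computational illustration (my own example, not from the text) of how the conditions in the Diagonalization Theorem can fail. For the shear matrix below, $\lambda = 1$ has algebraic multiplicity 2 but geometric multiplicity 1, so condition (c) fails and the matrix is not diagonalizable.

```python
import numpy as np

# The shear matrix has characteristic polynomial (1 - lambda)^2,
# so lambda = 1 has algebraic multiplicity a_1 = 2.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam, n = 1.0, A.shape[0]

# Geometric multiplicity g_1 = nullity(A - lambda I) = n - rank(A - lambda I).
g = n - np.linalg.matrix_rank(A - lam * np.eye(n))
print(g)   # 1, which is less than 2, so A is not diagonalizable
```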

New material

We still need to prove:

Lemma 4.26: If $\lambda_1$ is an eigenvalue of an $n \times n$ matrix $A$, then $$ \text{geometric multiplicity of } \lambda_1 \leqslant \text{algebraic multiplicity of } \lambda_1 $$

Proof (more direct than in text): Suppose that $\lambda_1$ is an eigenvalue of $A$ with geometric multiplicity $g$, and let $\vv_1, \ldots, \vv_g$ be a basis for $E_{\lambda_1}$, so $$A \vv_i = \lambda_1 \vv_i \qtext{for each $i$}$$ Let $Q$ be an invertible matrix whose first $g$ columns are $\vv_1, \ldots, \vv_g$: $$ Q = [ \, \vv_1 \, \cdots \, \vv_g \qtext{other vectors} ] $$ Since $Q^{-1} Q = I$, we know that $Q^{-1} \vv_i = \ve_i$ for $1 \leqslant i \leqslant g$. Also, the first $g$ columns of $A Q$ are $\lambda_1 \vv_1, \ldots, \lambda_1 \vv_g$. So the first $g$ columns of $Q^{-1} A Q$ are $\lambda_1 \ve_1, \ldots, \lambda_1 \ve_g$. Therefore the matrix $Q^{-1} A Q$ has $\lambda_1$ as an eigenvalue with algebraic multiplicity at least $g$. But $Q^{-1} A Q$ has the same characteristic polynomial as $A$, so $\lambda_1$ must also have algebraic multiplicity at least $g$ for $A$.$\quad\Box$.
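
To spell out the last step a little more (this elaboration is mine, not in the text): since its first $g$ columns are $\lambda_1 \ve_1, \ldots, \lambda_1 \ve_g$, the matrix $Q^{-1} A Q$ has the block form $$ Q^{-1} A Q = \bmat{cc} \lambda_1 I_g & B \\ O & C \emat $$ for some matrices $B$ and $C$, and so its characteristic polynomial is $(\lambda_1 - \lambda)^g \det(C - \lambda I)$, which has $\lambda_1$ as a root of multiplicity at least $g$.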

Summary of diagonalization: Given an $n \times n$ matrix $A$, we would like to determine whether $A$ is diagonalizable, and if it is, find the invertible matrix $P$ and the diagonal matrix $D$ such that $P^{-1} A P = D$. The result may depend upon whether you are working over $\R$ or $\C$.

Steps:

1. Compute the characteristic polynomial $\det(A - \lambda I)$ of $A$.
2. Find the roots of the characteristic polynomial and their algebraic multiplicities by factoring.
3. If the algebraic multiplicities don't add up to $n$, then $A$ is not diagonalizable, and you can stop. (If you are working over $\C$, this can't happen.)
4. For each eigenvalue $\lambda$, compute the dimension of the eigenspace $E_\lambda$. This is the geometric multiplicity of $\lambda$, and if it is less than the algebraic multiplicity, then $A$ is not diagonalizable, and you can stop.
5. Compute a basis for the eigenspace $E_\lambda$.
6. If for each eigenvalue the geometric multiplicity equals the algebraic multiplicity, then you take the $n$ eigenvectors you found and put them in the columns of a matrix $P$. Put the eigenvalues in the same order on the diagonal of a matrix $D$.
7. Check that $AP=PD$.

Note that step 4 only requires you to find the row echelon form of $A - \lambda I$, as the number of free variables here is the geometric multiplicity. In step 5, you solve the system.
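
Aside: the steps above can be carried out exactly with a computer algebra system. Here is a rough sketch using SymPy (this code and the example matrix are mine, not part of the course); `eigenvects()` returns each eigenvalue together with its algebraic multiplicity and a basis for its eigenspace, which is exactly the data needed in steps 2, 4 and 5.

```python
from sympy import Matrix, symbols, factor

lam = symbols('lam')

# A 3x3 example chosen just for illustration (not from the lecture).
A = Matrix([[2, 0, 0],
            [1, 3, 0],
            [0, 0, 3]])

# Steps 1-2: characteristic polynomial and its factorization.
# (SymPy uses det(lam*I - A), which differs from det(A - lam*I) only by a sign.)
print(factor(A.charpoly(lam).as_expr()))   # (lam - 2)*(lam - 3)**2

# Steps 4-5: each eigenvalue with its algebraic multiplicity and a basis
# for its eigenspace; the geometric multiplicity is the size of that basis.
for value, alg_mult, basis in A.eigenvects():
    print(value, alg_mult, len(basis))     # 2 1 1  and  3 2 2

# Step 6: the multiplicities match, so we can build P and D, and check step 7.
P, D = A.diagonalize()
print(A * P == P * D)                      # True
```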

Powers

Suppose $P^{-1} A P = D$, where $D$ is diagonal. Then $A = P D P^{-1}$. We can use this to compute powers of $A$. For example, $$ \begin{aligned} A^5 &= (P D P^{-1}) (P D P^{-1}) (P D P^{-1}) (P D P^{-1}) (P D P^{-1}) \\ &= P D^5 P^{-1} \end{aligned} $$ and $D^5$ is easy to compute since $D$ is diagonal: you just raise the diagonal entries to the fifth power.

More generally, $A^k = P D^k P^{-1}$. This is clearly an efficient way to compute powers! Note that we need to know $P$, not just $D$, to do this.
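
Aside: a quick numerical check of this formula, reusing the illustrative $2 \times 2$ example from above (again my own, not from the text).

```python
import numpy as np

# Same illustrative 2x2 example as above: eigenvalues 5 and 2,
# eigenvectors (1,1) and (1,-2), so A = P D P^{-1} with D = diag(5, 2).
A = np.array([[4.0, 1.0], [2.0, 3.0]])
P = np.array([[1.0, 1.0], [1.0, -2.0]])

k = 5
Dk = np.diag([5.0**k, 2.0**k])       # D^k: raise the diagonal entries to the k-th power
Ak = P @ Dk @ np.linalg.inv(P)       # A^k = P D^k P^{-1}

print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
```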

See Example 4.29 for a sample calculation. We'll illustrate this result with an example from Markov Chains.

Review of Markov chains

A Markov chain has a finite set of states $1, 2, \ldots, n$ and there is an $n \times n$ matrix $P$ (called the transition matrix) with the property that the $ij$ entry $P_{ij}$ is the probability that you transition from state $j$ to state $i$ in one time step.

Since you must transition to some state, $P_{1j} + \cdots + P_{nj} = 1$. That is, the entries in each column sum to 1. Moreover, each entry $P_{ij} \geqslant 0$. Such a $P$ is called a stochastic matrix.

We can represent the current state of the system with a state vector $\vx \in \R^n$. The $i$th entry of $\vx$ may denote the number of people/objects in state $i$. Or we may divide by the total number, so the $i$th entry of $\vx$ gives the fraction of people/objects in state $i$. In this case, $\vx$ has non-negative entries that sum to 1 and is called a probability vector.

If $\vx_k$ denotes the state after $k$ time steps, then the state after one more time step is given by $$ \vx_{k+1} = P \vx_k . $$ It follows that $\vx_k = P^k \vx_0$. Therefore:

The $ij$ entry $(P^k)_{ij}$ of $P^k$ is the probability of going from state $\red{j}$ to state $\red{i}$ in $k$ steps.
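
Aside (my own sketch, not from the text): iterating $\vx_{k+1} = P \vx_k$ really does match multiplying by $P^k$; the transition matrix below is a made-up 2-state example.

```python
import numpy as np

# A made-up 2-state transition matrix (each column sums to 1).
P = np.array([[0.9, 0.5],
              [0.1, 0.5]])
x0 = np.array([1.0, 0.0])   # start with everyone in state 1

# Iterate x_{k+1} = P x_k ten times ...
x = x0
for _ in range(10):
    x = P @ x

# ... which agrees with x_k = P^k x_0.
print(np.allclose(x, np.linalg.matrix_power(P, 10) @ x0))   # True
```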

A state $\vx$ such that $P \vx = \vx$ is called a steady state vector. This is the same as an eigenvector with eigenvalue 1. In Lecture 22, we proved:

Theorem 4.30: Every stochastic matrix has a steady state vector, i.e. it has $\lambda = 1$ as an eigenvalue.

We proved this using the fact that $P$ and $P^T$ have the same eigenvalues, and then noticing that the vector with all $1$'s is an eigenvector of $P^T$ with eigenvalue 1.

Example: We studied toothpaste usage, and had transition matrix $$ P = \bmat{ll} 0.70 & 0.20 \\ 0.30 & 0.80 \emat . $$ We noticed experimentally that a given starting state tends to the state $\coll {0.4} {0.6}$ and that $$ \bmat{ll} 0.70 & 0.20 \\ 0.30 & 0.80 \emat \coll {0.4} {0.6} = \coll {0.4} {0.6} . $$ We then found this steady state vector algebraically by solving $(I-P)\vx = \vec 0$. [It is equivalent to solve $(P-I)\vx = \vec 0$.]
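
Aside: numerically, one way to find this steady state vector is to take the eigenvector with eigenvalue 1 and rescale it to be a probability vector. This is just an illustration in NumPy (my own code), not the algebraic method above.

```python
import numpy as np

P = np.array([[0.70, 0.20],
              [0.30, 0.80]])

# Take the eigenvector whose eigenvalue is (closest to) 1 and rescale it
# so its entries sum to 1, giving a probability vector.
values, vectors = np.linalg.eig(P)
i = np.argmin(np.abs(values - 1.0))
v = np.real(vectors[:, i])
print(v / v.sum())   # approximately [0.4  0.6]
```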

With our new tools, we can go further now.

Section 4.6: Markov chains

Let's compute powers of the matrix $P$ above. One can show that $P$ has characteristic polynomial $$ \det(P - \lambda I) = \lambda^2 - 1.5 \lambda + 0.5 = (\lambda - 1)(\lambda - 0.5) $$ and so has eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 0.5$. The eigenspaces are $$ E_1 = \span(\coll 2 3) \qtext{and} E_{0.5} = \span(\coll 1 {-1}) $$ So if we write $Q = \bmat{rr} 2 & 1 \\ 3 & -1 \emat$, we have that $Q^{-1} P Q = \bmat{ll} 1 & 0 \\ 0 & 0.5 \emat = D$. Therefore, $$ \kern-6ex P^k = Q D^k Q^{-1} = \bmat{rr} 2 & 1 \\ 3 & -1 \emat \bmat{ll} 1^k & 0 \\ 0 & 0.5^k \emat \bmat{rr} 2 & 1 \\ 3 & -1 \emat ^{-1} $$ As $k \to \infty$, $0.5^k \to 0$, so $$ \kern-6ex P^k \to \bmat{rr} 2 & 1 \\ 3 & -1 \emat \bmat{ll} 1 & 0 \\ 0 & 0 \emat \bmat{rr} 2 & 1 \\ 3 & -1 \emat ^{-1} = \bmat{cc} 0.4 & 0.4 \\ 0.6 & 0.6 \emat $$ It follows that if we start with any state $\vx_0 = \coll a b$ with $a+b=1$, we'll find that $$ \kern-8ex \vx_k = P^k \vx_0 \to \bmat{cc} 0.4 & 0.4 \\ 0.6 & 0.6 \emat \coll a b = \coll {0.4a+0.4b} {0.6a+0.6b} = \coll {0.4} {0.6} $$ This explains why every state tends to the steady state! (It also gives a fast way to compute $\vx_k$ for large $k$.)
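
Aside: we can check this limit numerically (my own sketch); since $0.5^{50}$ is tiny, $P^{50}$ already agrees with the limit to many decimal places.

```python
import numpy as np

P = np.array([[0.70, 0.20],
              [0.30, 0.80]])

# 0.5^50 is about 9e-16, so P^50 already matches the limit to many decimals.
print(np.round(np.linalg.matrix_power(P, 50), 6))
# [[0.4 0.4]
#  [0.6 0.6]]
```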

This is a very general phenomenon:

Theorem 4.31: Let $P$ be an $n \times n$ stochastic matrix. Then every eigenvalue $\lambda$ has $|\lambda| \leqslant 1$.

If in addition the entries of $P$ are all positive, then all eigenvalues besides $\lambda = 1$ have $|\lambda| < 1$.

The general proof just involves some inequalities, but the notation is confusing. Let's see how the argument goes in the special case of $$ P = \bmat{ll} 0.70 & 0.20 \\ 0.30 & 0.80 \emat . $$ The key idea is to study the eigenvalues of $P^T$, which are the same as those of $P$. Suppose $\coll a b$ is an eigenvector of $P^T$ with $0 \leqslant a \leqslant b$. Then $P^T \coll a b = \lambda \coll a b$ which means that $$ \coll {0.7a + 0.3b} {0.2a+0.8b} = \lambda \coll a b $$ The second component gives $$ \lambda b = 0.2a + 0.8b \leqslant 0.2b + 0.8b = b $$ and so $\lambda \leqslant 1$. If we allow $a$ and $b$ to be negative or complex, we need to use absolute values, and we can conclude that $|\lambda| \leqslant 1$.

The other part of the Theorem is similar.
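
Aside: here is a quick numerical spot check of Theorem 4.31 (my own code, using a randomly generated stochastic matrix): one eigenvalue has modulus 1, and the others have modulus strictly less than 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random 4x4 stochastic matrix with positive entries:
# take a matrix of positive numbers and divide each column by its sum.
M = rng.random((4, 4)) + 0.1
P = M / M.sum(axis=0)

moduli = np.abs(np.linalg.eigvals(P))
print(np.round(np.sort(moduli), 4))
# One modulus is 1 (up to rounding) and the rest are strictly less than 1.
```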

This theorem helps us understand the long-term behaviour:

Theorem 4.33: Let $P$ be an $n \times n$ stochastic matrix all of whose entries are positive. Then as $k \to \infty$, $P^k \to L$, a matrix all of whose columns are equal to the same vector $\vx$ which is a steady state probability vector for $P$.

Proof: We'll assume $P$ is diagonalizable: $Q^{-1} P Q = D$. So $P^k = Q D^k Q^{-1}$. As $k \to \infty$, $D^k$ approaches a matrix $D^*$ with $1$'s and $0$'s on the diagonal (by Theorem 4.31), which means that $P^k$ approaches $L = Q D^* Q^{-1}$.

Now that we know that $P^k$ has some limit $L$, we can deduce something about it. Since $\lim_{k \to \infty} P^k = L$, we have $$ PL = P\lim_{k \to \infty} P^k = \lim_{k \to \infty} P^{k+1} = L $$ This means that the columns of $L$ must be steady-state vectors for $P$. Since the columns of $P^k$ are probability vectors, the same must be true of the columns of $L$. It's not hard to show that $P$ has a unique steady-state probability vector $\vx$, so $L = [\, \vx \, \vx \, \cdots \, \vx \, ]$, as required.$\quad\Box$

Finally, we can deduce that Markov chains tend to their steady states:

Theorem 4.34: Let $P$ be an $n \times n$ stochastic matrix all of whose entries are positive, and let $\vx_0$ be any initial probability vector. Then as $k \to \infty$, $\vx_k \to \vx$, where $\vx$ is the steady state probability vector for $P$.

Proof: Suppose that $\vx_0$ has components $x_1, x_2, \ldots, x_n$. Then $$ \begin{aligned} \lim_{k \to \infty} \vx_k &= \lim_{k \to \infty} P^k \vx_0 = L \vx_0 \\ &= [\, \vx \, \vx \, \cdots \, \vx \, ] \vx_0 \\ &= x_1 \vx + x_2 \vx + \cdots + x_n \vx \\ &= (x_1 + x_2 + \cdots + x_n ) \vx = \vx \quad\Box \end{aligned} $$

This result works both ways: if you compute the eigenvector with eigenvalue 1, that tells you the steady-state vector that other states go to as $k \to \infty$. But it also means that if you don't know the steady-state vector, you can approximate it by starting with any vector $\vx_0$ and computing $P^k \vx_0$ for large $k$!

The latter is what Google does to compute the PageRank eigenvector.
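
Aside: the approximation idea in the last paragraph is just repeated multiplication by $P$ (often called power iteration). Here is a minimal sketch (my own code, not from the course) using the toothpaste matrix from above; `approximate_steady_state` is a name I made up.

```python
import numpy as np

def approximate_steady_state(P, steps=100):
    """Approximate the steady-state vector of a stochastic matrix P
    by repeatedly applying P to an arbitrary starting probability vector."""
    n = P.shape[0]
    x = np.full(n, 1.0 / n)      # any probability vector works as a start
    for _ in range(steps):
        x = P @ x
    return x

P = np.array([[0.70, 0.20],
              [0.30, 0.80]])
print(approximate_steady_state(P))   # approximately [0.4  0.6]
```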