
More on Matrices

Every theorem in Lesson 3PM treated a matrix as a single rectangle of entries. That is often what we want, but it throws away a second piece of information that is usually just as useful. A matrix arising from a problem often already carries internal structure: a corner to quarantine, a column that deserves to be singled out, a diagonal strip worth handling on its own. Drawing a few lines across the rectangle lets us name those pieces, reason about them one at a time, and recombine them through the addition and multiplication rules already in place.

Submatrices and Block Partitions

Definition 1 (Submatrix)

Let A \in \mathbb{F}^{m \times n}. If some (possibly empty) collection of complete rows and some (possibly empty) collection of complete columns of A are deleted, the rectangle that remains is called a submatrix of A. Equivalently, a submatrix is determined by an ordered index set I \subseteq \{1, 2, \ldots, m\} of rows to keep and J \subseteq \{1, 2, \ldots, n\} of columns to keep.

Example 1 (Submatrices of a Small Rectangle)

Take

M = \begin{bmatrix} 1 & -2 & 3 & 0 \\ 4 & 1 & -1 & 2 \\ 0 & 3 & 2 & -5 \end{bmatrix} \in \mathbb{R}^{3 \times 4}.

Keeping rows \{1, 2\} with columns \{1, 2\} returns the 2 \times 2 submatrix \begin{bmatrix} 1 & -2 \\ 4 & 1 \end{bmatrix}. Keeping only row 2 and column 3 returns the 1 \times 1 submatrix [-1]. Keeping row 1 with every column returns the row matrix [1 \; -2 \; 3 \; 0]. Keeping rows \{1, 3\} with columns \{1, 4\} returns \begin{bmatrix} 1 & 0 \\ 0 & -5 \end{bmatrix}. Each is a submatrix of M obtained from a different choice of surviving rows and columns.
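
For readers who like to check the index bookkeeping numerically, the short NumPy sketch below (NumPy, its 0-based indexing, and the variable names are assumptions of the sketch, not part of the lesson) reproduces the four submatrices of Example 1 with np.ix_.

```python
import numpy as np

# The matrix M of Example 1.
M = np.array([[1, -2,  3,  0],
              [4,  1, -1,  2],
              [0,  3,  2, -5]])

# np.ix_ turns a row index set I and a column index set J into an open mesh,
# so M[np.ix_(I, J)] keeps exactly the chosen rows and columns.
# NumPy indices are 0-based, so "row 1" of the lesson is index 0 here.
print(M[np.ix_([0, 1], [0, 1])])      # rows {1,2}, columns {1,2}
print(M[np.ix_([1], [2])])            # row 2, column 3
print(M[np.ix_([0], [0, 1, 2, 3])])   # row 1, every column
print(M[np.ix_([0, 2], [0, 3])])      # rows {1,3}, columns {1,4}
```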

A submatrix on its own is only a selection. The construction below is what makes the idea productive in computation: rather than singling out one piece, cut the whole rectangle along several row and column boundaries and keep all of the resulting tiles.

Definition 2 (Partition and Block-Matrix)

A partition of A \in \mathbb{F}^{m \times n} is the choice of dividing lines between specified rows and between specified columns, each running the full width or height of the array. The partition splits A into an ordered rectangular array of submatrices called blocks, and A written in terms of these blocks is called a block-matrix. We record a partition as

A = [A_{\alpha\beta}]_{\alpha,\, \beta = 1}^{r,\, s},

where A_{\alpha\beta} is the block at block-position (\alpha, \beta). Any block may itself be as small as a 1 \times 1 scalar.

Example 2 (Two Partitions of the Same Matrix)

The matrix M of Example 1 admits many partitions. One natural cut places a horizontal line between rows 2 and 3 and a vertical line between columns 2 and 3:

M = \left[ \begin{array}{cc|cc} 1 & -2 & 3 & 0 \\ 4 & 1 & -1 & 2 \\ \hline 0 & 3 & 2 & -5 \end{array} \right] = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix},

with

M_{11} = \begin{bmatrix} 1 & -2 \\ 4 & 1 \end{bmatrix}, \quad M_{12} = \begin{bmatrix} 3 & 0 \\ -1 & 2 \end{bmatrix}, \quad M_{21} = [0 \; 3], \quad M_{22} = [2 \; -5].

A different cut isolates the first column, groups columns 2 and 3, and isolates the last column, while splitting rows only between rows 1 and 2:

M = \left[ \begin{array}{c|cc|c} 1 & -2 & 3 & 0 \\ \hline 4 & 1 & -1 & 2 \\ 0 & 3 & 2 & -5 \end{array} \right] = \begin{bmatrix} M_{11} & M_{12} & M_{13} \\ M_{21} & M_{22} & M_{23} \end{bmatrix},

with

M_{11} = [1], \quad M_{12} = [-2 \; 3], \quad M_{13} = [0], \qquad M_{21} = \begin{bmatrix} 4 \\ 0 \end{bmatrix}, \quad M_{22} = \begin{bmatrix} 1 & -1 \\ 3 & 2 \end{bmatrix}, \quad M_{23} = \begin{bmatrix} 2 \\ -5 \end{bmatrix}.

Neither layout is canonical; each simply exposes a different internal geometry of the same rectangle.
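
As a quick sanity check (again a sketch assuming NumPy; the slicing boundaries simply transcribe the two cuts of Example 2), np.block reassembles either partition back into M:

```python
import numpy as np

M = np.array([[1, -2,  3,  0],
              [4,  1, -1,  2],
              [0,  3,  2, -5]])

# First partition: rows {1,2}|{3}, columns {1,2}|{3,4}.
first = [[M[:2, :2], M[:2, 2:]],
         [M[2:, :2], M[2:, 2:]]]

# Second partition: rows {1}|{2,3}, columns {1}|{2,3}|{4}.
second = [[M[:1, :1], M[:1, 1:3], M[:1, 3:]],
          [M[1:, :1], M[1:, 1:3], M[1:, 3:]]]

# np.block stitches the tiles back together; both layouts recover M exactly.
assert np.array_equal(np.block(first), M)
assert np.array_equal(np.block(second), M)
```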

Every shape name from the opening classification of matrix types in Lesson 3PM transfers to the block-structured picture without alteration. A square block-matrix has equal numbers of row-blocks and column-blocks; a diagonal block-matrix has zero blocks off the main block-diagonal; an upper-triangular block-matrix has zero blocks below it, and so on. The blocks themselves are submatrices rather than scalars, but the label means what it did before.

Example 3 (Triangular and Diagonal Block-Matrices)

The partitioned matrices

\begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}, \qquad \begin{bmatrix} B_{11} & 0 \\ 0 & B_{22} \end{bmatrix}

are an upper-triangular and a diagonal block-matrix respectively, exactly as in the opening classification of matrix types in Lesson 3PM, except that the scalar zeros of that section are now zero blocks of the sizes forced by the partition.

A square matrix admits a finer distinction. When the dividing lines are placed so that every diagonal block ends up square, the partition is well-suited to iteration, since the diagonal blocks can then be treated as square matrices in their own right.

Definition 3 (Symmetric Partition)

A partition of a square matrix A \in \mathbb{F}^{n \times n} is symmetric when every diagonal block A_{\alpha\alpha} is itself square. Equivalently, the sequence of row-cut sizes matches the sequence of column-cut sizes.

Example 4 (A Symmetric Partition of a 3 \times 3 Matrix)

Cutting both the rows and the columns at the boundary between index 2 and index 3 partitions

N = \left[ \begin{array}{cc|c} 3 & 2 & -1 \\ 0 & 4 & 1 \\ \hline 0 & 0 & 2 \end{array} \right]

symmetrically: the top-left block is the 2 \times 2 matrix \begin{bmatrix} 3 & 2 \\ 0 & 4 \end{bmatrix}, the bottom-right block is the 1 \times 1 scalar [2], and both diagonal blocks are square, as the preceding definition requires.
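
A symmetric partition is easy to check mechanically: the row-cut sizes and the column-cut sizes must agree, so each diagonal block is sliced with the same index range on both axes. A minimal sketch, assuming NumPy and the cut sizes [2, 1] of Example 4:

```python
import numpy as np

N = np.array([[3, 2, -1],
              [0, 4,  1],
              [0, 0,  2]])

# Symmetric partition: the same cut sizes are used for rows and for columns.
sizes = [2, 1]
starts = np.cumsum([0] + sizes[:-1])

for start, size in zip(starts, sizes):
    block = N[start:start + size, start:start + size]
    print(block.shape, block)   # every diagonal block comes out square
```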

Problem 1

Exhibit three different submatrices of

A = \begin{bmatrix} 1 & 0 & 4 & -2 & 5 \\ 3 & -1 & 0 & 1 & 0 \\ 2 & 2 & -1 & 0 & 3 \\ 0 & 1 & 2 & 1 & -1 \end{bmatrix}

of sizes 2 \times 3, 3 \times 2, and 1 \times 4, each obtained from a different choice of surviving rows and columns.

Problem 2

Give two different symmetric partitions of

A = \begin{bmatrix} 3 & 0 & 0 & 1 \\ 0 & 2 & 4 & 0 \\ 0 & 4 & 2 & 0 \\ 1 & 0 & 0 & 3 \end{bmatrix}

and, for each partition, identify the diagonal blocks. For which of your partitions does the block-matrix become block-diagonal?

Addition and Multiplication in Block Form

The partition has so far only been a way to see a matrix. The pay-off is that addition and multiplication of matrices, once the block sizes line up, can be carried out block by block using the same formulas as at the scalar level. The entrywise addition and row-into-column multiplication introduced in Lesson 3PM are upgraded without any new idea.

Theorem 1 (Block Addition)

Let A, B \in \mathbb{F}^{m \times n} be partitioned into the same block pattern,

A = [A_{\alpha\beta}]_{\alpha,\, \beta = 1}^{r,\, s}, \qquad B = [B_{\alpha\beta}]_{\alpha,\, \beta = 1}^{r,\, s},

with corresponding blocks A_{\alpha\beta} and B_{\alpha\beta} of the same size for every (\alpha, \beta). Then

A + B = [A_{\alpha\beta} + B_{\alpha\beta}]_{\alpha,\, \beta = 1}^{r,\, s}, \qquad A - B = [A_{\alpha\beta} - B_{\alpha\beta}]_{\alpha,\, \beta = 1}^{r,\, s}.
Proof

Matrix addition in Lesson 3PM is entrywise, and the two partitions share a single grid of dividing lines, so the block of A + B at block position (\alpha, \beta) consists of the entries a_{ij} + b_{ij} with (i, j) ranging over that block. Reading those entries off as a submatrix gives (A + B)_{\alpha\beta} = A_{\alpha\beta} + B_{\alpha\beta}. The subtraction statement is identical with + replaced by -.

The hypothesis that the two partitions match is load-bearing: if BB were cut differently, corresponding blocks would not even have the same size, and the block formula would no longer make sense, even though the underlying entrywise sum still does.

Example 5 (Adding Two Matrices Block by Block)

Let

A = \left[ \begin{array}{cc|c} 1 & 2 & -1 \\ 0 & 3 & 4 \\ \hline 2 & -2 & 0 \end{array} \right], \qquad B = \left[ \begin{array}{cc|c} 0 & 1 & 2 \\ -1 & 1 & 0 \\ \hline 3 & 1 & -4 \end{array} \right],

both in \mathbb{R}^{3 \times 3} and partitioned identically. Block by block,

A_{11} + B_{11} = \begin{bmatrix} 1 & 3 \\ -1 & 4 \end{bmatrix}, \quad A_{12} + B_{12} = \begin{bmatrix} 1 \\ 4 \end{bmatrix}, \quad A_{21} + B_{21} = [5 \; -1], \quad A_{22} + B_{22} = [-4],

and assembling returns

A + B = \left[ \begin{array}{cc|c} 1 & 3 & 1 \\ -1 & 4 & 4 \\ \hline 5 & -1 & -4 \end{array} \right].

An entrywise sum produces the same result, confirming the block-addition theorem on this instance.
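
The same confirmation can be run in a few lines of NumPy (a sketch; the slicing reproduces the shared partition of Example 5): adding the blocks and reassembling with np.block must agree with the entrywise sum.

```python
import numpy as np

A = np.array([[1,  2, -1], [0, 3, 4], [2, -2,  0]])
B = np.array([[0,  1,  2], [-1, 1, 0], [3,  1, -4]])

# Both matrices carry the same partition: rows {1,2}|{3}, columns {1,2}|{3}.
def blocks(X):
    return [[X[:2, :2], X[:2, 2:]],
            [X[2:, :2], X[2:, 2:]]]

# Add corresponding blocks, reassemble, and compare with the entrywise sum.
block_sum = np.block([[a + b for a, b in zip(row_a, row_b)]
                      for row_a, row_b in zip(blocks(A), blocks(B))])
assert np.array_equal(block_sum, A + B)
```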

Block multiplication is slightly more delicate, because multiplication of ordinary matrices already required a conformability check. For block-matrices there are two such checks, one on the block grid itself and one inside the blocks.

Theorem 2 (Block Multiplication)

Let A \in \mathbb{F}^{m \times l} be partitioned into blocks A_{\alpha k} of size m_\alpha \times l_k with 1 \le \alpha \le r and 1 \le k \le t, and let B \in \mathbb{F}^{l \times n} be partitioned into blocks B_{k\beta} of size l_k \times n_\beta with 1 \le k \le t and 1 \le \beta \le s. Thus the column-cut of A matches the row-cut of B. Then

AB = \left[ \sum_{k = 1}^{t} A_{\alpha k} B_{k\beta} \right]_{\alpha,\, \beta = 1}^{r,\, s}.
Proof

Fix a block position (\alpha, \beta) and a scalar position (i, j) inside that block. The value of (AB)_{ij} is the row-into-column product from Lesson 3PM,

(AB)_{ij} = \sum_{h = 1}^{l} a_{ih} b_{hj}.

Break the summation index h at the boundaries of the column-cut of A: letting L_k be the set of scalar indices lying in the kth column-block of A,

(AB)_{ij} = \sum_{k = 1}^{t} \sum_{h \in L_k} a_{ih} b_{hj}.

Within the kth inner sum, a_{ih} runs over row i of the block A_{\alpha k} and b_{hj} runs over column j of the block B_{k\beta}, because the column-cut of A at L_k was chosen to match the row-cut of B. That inner sum is therefore (A_{\alpha k} B_{k\beta})_{ij}, and reading off every (i, j) in the block returns (AB)_{\alpha\beta} = \sum_k A_{\alpha k} B_{k\beta}.

Two conformability conditions are in play. The number of column-blocks of A must equal the number of row-blocks of B, so the index k ranges over the same set on both sides. Additionally, each paired product A_{\alpha k} B_{k\beta} must itself be defined, meaning the column-count of A_{\alpha k} equals the row-count of B_{k\beta}. If either condition fails the block formula cannot be written, although the underlying product AB may still exist and can be computed entrywise in the ordinary way.

Example 6 (Computing a Product by Blocks)

Partition

A = \left[ \begin{array}{c|cc} 1 & 2 & 0 \\ 3 & 0 & -1 \end{array} \right] = [A_{11} \; A_{12}], \qquad B = \left[ \begin{array}{cc} 2 & 1 \\ \hline 1 & -1 \\ 0 & 2 \end{array} \right] = \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix},

with A_{11} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}, A_{12} = \begin{bmatrix} 2 & 0 \\ 0 & -1 \end{bmatrix}, B_{11} = [2 \; 1], B_{21} = \begin{bmatrix} 1 & -1 \\ 0 & 2 \end{bmatrix}. The column-cut of A sits at position 1 and the row-cut of B sits at position 1, so the two partitions are compatible. The block-multiplication theorem gives

AB = A_{11} B_{11} + A_{12} B_{21} = \begin{bmatrix} 2 & 1 \\ 6 & 3 \end{bmatrix} + \begin{bmatrix} 2 & -2 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 6 & 1 \end{bmatrix}.

A direct row-into-column computation of AB returns the same 2 \times 2 matrix.
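
The block computation of Example 6 can be replayed numerically (a NumPy sketch; the two slices encode the single cut in each matrix), and the assertion compares it against the ordinary product:

```python
import numpy as np

A = np.array([[1, 2,  0],
              [3, 0, -1]])
B = np.array([[2,  1],
              [1, -1],
              [0,  2]])

# Column-cut of A after column 1, row-cut of B after row 1.
A11, A12 = A[:, :1], A[:, 1:]
B11, B21 = B[:1, :], B[1:, :]

# Block formula with block index k = 1, 2.
block_product = A11 @ B11 + A12 @ B21
assert np.array_equal(block_product, A @ B)
print(block_product)   # [[ 4 -1] [ 6  1]]
```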

A partition that cuts A into its individual columns and B into its individual rows sets t = l, and the block formula collapses to

AB = \sum_{k = 1}^{l} \mathbf{a}_k \mathbf{r}_k,

exactly the column-row decomposition proved earlier in Lesson 3PM. That is the extreme case of the block-multiplication theorem above; the general statement lets us pick any cut that is convenient for the problem, not only the finest one.
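
The finest cut is equally easy to test: summing the outer products of each column of A with the matching row of B reproduces AB. A small sketch, assuming NumPy and a randomly generated pair of conformable matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(2, 3))
B = rng.integers(-3, 4, size=(3, 2))

# Column-row decomposition: each term is the outer product of column k of A
# with row k of B, and the sum over k recovers the full product.
outer_sum = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))
assert np.array_equal(outer_sum, A @ B)
```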

Problem 3

Let

A = \left[ \begin{array}{cc|cc} 1 & 0 & 2 & 1 \\ 0 & 1 & 1 & -1 \\ \hline 2 & -1 & 0 & 0 \end{array} \right], \qquad B = \left[ \begin{array}{cc|cc} 3 & 1 & 0 & 0 \\ 1 & 2 & 0 & 0 \\ \hline 0 & 0 & -1 & 2 \\ 0 & 0 & 2 & 1 \end{array} \right].

Write out the blocks of A and B implied by these partitions, and decide whether they are compatible for block multiplication in the sense of the block-multiplication theorem above. If they are, compute AB by the block formula. If not, say which conformability condition fails and compute AB directly.

Problem 4

Let A \in \mathbb{F}^{n \times n} be partitioned as the block-diagonal matrix

A = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}

with each A_{ii} square. Using the block-multiplication theorem above, show that A is idempotent (respectively nilpotent, respectively involutory) in the sense of Lesson 3PM if and only if each diagonal block A_{ii} is. Give a small example of two 2 \times 2 diagonal blocks whose block assembly is a nilpotent 4 \times 4 matrix of index 2.

Polynomials in a Matrix

Building on the sections on matrix addition and multiplication in Lesson 3PM, the next natural step is to combine integer powers of a fixed square matrix into a scalar-weighted sum, mirroring ordinary polynomial evaluation at a number.

Definition 4 (Polynomial in a Matrix)

Let A \in \mathbb{F}^{n \times n} and let

p(\lambda) = a_0 + a_1 \lambda + a_2 \lambda^2 + \cdots + a_l \lambda^l, \qquad a_l \ne 0,

be a scalar polynomial of degree l over \mathbb{F}. The polynomial in A associated to p is

p(A) = a_0 I_n + a_1 A + a_2 A^2 + \cdots + a_l A^l \in \mathbb{F}^{n \times n},

with powers interpreted as in the definition of matrix powers from Lesson 3PM. The degree-zero term a_0 is promoted to a_0 I_n so that every summand is an n \times n matrix.

Example 7 (Evaluating a Quadratic at a Matrix)

Take p(\lambda) = 2 - \lambda + \lambda^2 and

A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \in \mathbb{R}^{3 \times 3}.

The required powers are

A^2 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}, \quad -A = \begin{bmatrix} 0 & -1 & 0 \\ 0 & -1 & 0 \\ -1 & 0 & -1 \end{bmatrix}, \quad 2 I_3 = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix},

and entrywise addition gives

p(A) = 2 I_3 - A + A^2 = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 1 & 2 \end{bmatrix}.
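
The arithmetic of Example 7 is mechanical enough to hand to NumPy (a sketch; matrix_power and the integer identity matrix are the only ingredients beyond the lesson's definition):

```python
import numpy as np

A = np.array([[0, 1, 0],
              [0, 1, 0],
              [1, 0, 1]])

# p(lambda) = 2 - lambda + lambda^2, with the constant term promoted to 2*I_3.
I3 = np.eye(3, dtype=int)
pA = 2 * I3 - A + np.linalg.matrix_power(A, 2)
print(pA)   # [[2 0 0] [0 2 0] [0 1 2]], matching Example 7
```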

Every power of A commutes with every other power of A by the commuting-powers theorem from Lesson 3PM, so the polynomials p(A), with p ranging over scalar polynomials over \mathbb{F} and A fixed, all commute with one another. That one observation is what lets the scalar identities of polynomial arithmetic transfer to the matrix side without adjustment.

Theorem 3 (Polynomial Evaluation Respects Sum, Product, and Division)

Let A \in \mathbb{F}^{n \times n} and let p, q, d be scalar polynomials over \mathbb{F}.

  1. If p(\lambda) + q(\lambda) = h(\lambda), then p(A) + q(A) = h(A).
  2. If p(\lambda) q(\lambda) = t(\lambda), then p(A) q(A) = t(A).
  3. If p(\lambda) = q(\lambda) d(\lambda) + r(\lambda) with r either zero or of degree strictly less than the degree of d, then p(A) = q(A) d(A) + r(A).
Proof

For (1), addition of scalar polynomials collects coefficients of each power of \lambda, and the same collection performed on the matrix powers A^k returns h(A), using the algebraic laws of matrix addition from Lesson 3PM.

For (2), expand using bilinearity of matrix multiplication from Lesson 3PM:

p(A) q(A) = \left(\sum_{i = 0}^{l} a_i A^i\right) \left(\sum_{j = 0}^{m} b_j A^j\right) = \sum_{i, j} a_i b_j A^i A^j = \sum_{i, j} a_i b_j A^{i + j},

where the final equality uses the exponent law for matrix powers from Lesson 3PM. The coefficient of A^k on the right is \sum_{i + j = k} a_i b_j, which is exactly the coefficient of \lambda^k in p(\lambda) q(\lambda) = t(\lambda). Hence p(A) q(A) = t(A).

For (3), evaluate both sides of p = qd + r at A, applying (1) to the outer sum and (2) to the product qd.

If d(\lambda) divides p(\lambda) at the scalar level, then r = 0 and (3) collapses to p(A) = q(A) d(A), so d(A) divides p(A) with matrix factor q(A). Because p(A) and q(A) commute, left and right division give the same answer, and scalar-style factorisation arguments push through unchanged.
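
Part (2) of the theorem is also easy to spot-check numerically. The sketch below (assuming NumPy; the Horner-style evaluator and the coefficient lists are illustrative names, not lesson notation) multiplies two scalar polynomials, evaluates everything at a matrix, and compares both sides:

```python
import numpy as np

def evaluate(coeffs, A):
    """Evaluate a0 + a1*A + ... + al*A^l at a square matrix A (Horner form)."""
    result = np.zeros_like(A, dtype=float)
    identity = np.eye(A.shape[0])
    for a in reversed(coeffs):
        result = result @ A + a * identity
    return result

p = [2, -1, 1]   # 2 - lambda + lambda^2
q = [1, 1]       # 1 + lambda
t = np.polynomial.polynomial.polymul(p, q)   # scalar product p(lambda)q(lambda)

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

# Theorem 3(2): p(A) q(A) equals the evaluation of the scalar product at A.
assert np.allclose(evaluate(p, A) @ evaluate(q, A), evaluate(t, A))
```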

What does not carry over from scalar polynomials is the count of zeros. A scalar polynomial of degree l has at most l zeros in \mathbb{F}, but the matrix equation p(A) = 0 can be satisfied by whole parametrised families of matrices. The reason is structural: matrix multiplication permits a product of two non-zero factors to vanish, a phenomenon already recorded in the matrix-multiplication section of Lesson 3PM.

Example 8 (A Quadratic with Infinitely Many Matrix Zeros)

Let p(\lambda) = (\lambda - 1)(\lambda + 2) = \lambda^2 + \lambda - 2. By the polynomial-evaluation theorem above,

p(A) = A^2 + A - 2 I = (A - I)(A + 2 I)

for every A \in \mathbb{F}^{2 \times 2}. The scalar equation p(\lambda) = 0 has exactly the two roots 1 and -2, so the two scalar matrices A = I and A = -2 I are obvious zeros of p. They are not the only ones. For every b \in \mathbb{F} take

A_b = \begin{bmatrix} 1 & b \\ 0 & -2 \end{bmatrix}.

Then

A_b - I = \begin{bmatrix} 0 & b \\ 0 & -3 \end{bmatrix}, \qquad A_b + 2 I = \begin{bmatrix} 3 & b \\ 0 & 0 \end{bmatrix},

and a row-into-column computation gives (A_b - I)(A_b + 2 I) = 0. The whole one-parameter family \{A_b : b \in \mathbb{F}\} therefore consists of zeros of the degree-two polynomial p. Among them, only A_0 = \operatorname{diag}(1, -2) is built directly from the scalar roots 1 and -2, confirming that the scalar count of roots is genuinely lost once A is allowed to be a matrix.
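
The whole family can be swept numerically in a few lines (a sketch assuming NumPy; the sampled values of b are arbitrary):

```python
import numpy as np

I = np.eye(2)

# A handful of members of the one-parameter family A_b of Example 8.
for b in [-3.0, 0.0, 1.5, 7.0]:
    A_b = np.array([[1.0, b],
                    [0.0, -2.0]])
    # p(A_b) = A_b^2 + A_b - 2I vanishes for every b, as does the factored form.
    assert np.allclose(A_b @ A_b + A_b - 2 * I, 0)
    assert np.allclose((A_b - I) @ (A_b + 2 * I), 0)
```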

Problem 5

Let p(\lambda) = \lambda^2 - 3\lambda + 2, let q(\lambda) = \lambda + 1, and take

A = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}.

Compute p(A), q(A), the sum p(A) + q(A), and the product p(A) q(A). Compute (p + q)(\lambda) and (pq)(\lambda) at the scalar level, evaluate the resulting polynomials at A, and verify directly that

p(A) + q(A) = (p + q)(A), \qquad p(A) q(A) = (p q)(A),

in accordance with the polynomial-evaluation theorem above.

Problem 6

Exhibit a one-parameter family of 2 \times 2 real matrices A with A^2 = I_2 that contains neither I_2 nor -I_2. Using the polynomial-evaluation theorem above applied to p(\lambda) = \lambda^2 - 1, explain why each matrix in your family is a zero of p, and connect the result to the involutory class from Lesson 3PM.

Problem 7

Let A \in \mathbb{F}^{n \times n} be idempotent, so A^2 = A. Prove by induction on k that A^k = A for every k \ge 1, and deduce that for every scalar polynomial p(\lambda) = a_0 + a_1 \lambda + \cdots + a_l \lambda^l,

p(A) = p(0) \, I_n + \bigl(p(1) - p(0)\bigr) A.

Hence every polynomial in an idempotent matrix is of the form

\alpha I_n + \beta A

for suitable scalars \alpha, \beta \in \mathbb{F}.

Exercises

Exercise 1 (A Plane of Equidistant Points)

Let P and Q be different points in \mathbb{R}^3 with position vectors \mathbf{p} and \mathbf{q}, and let E be the collection of all vectors \mathbf{v} satisfying

\|\mathbf{p}-\mathbf{v}\|=\|\mathbf{q}-\mathbf{v}\|.
  1. Show, by squaring both sides and using \|\mathbf{w}\|^2=\mathbf{w}\cdot\mathbf{w}, that E consists exactly of those \mathbf{v} satisfying
\mathbf{v}\cdot(\mathbf{p}-\mathbf{q})=\tfrac12\bigl(\|\mathbf{p}\|^2-\|\mathbf{q}\|^2\bigr).
  2. For
\mathbf{p}=\begin{bmatrix}3\\4\\3\end{bmatrix}, \qquad \mathbf{q}=\begin{bmatrix}-1\\5\\-2\end{bmatrix},

give an explicit equation for the plane E.

Exercise 2 (Verifying Block Multiplication by Hand)

Let

A = \left[ \begin{array}{c|cc} 1 & 2 & -1 \\ 3 & 0 & 4 \end{array} \right] = [A_{11} \; A_{12}], \qquad B = \left[ \begin{array}{cc} 2 & -1 \\ \hline 1 & 3 \\ -1 & 2 \end{array} \right] = \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix}.

Identify the size of each block, confirm that the conformability conditions of the block-multiplication theorem above are satisfied, and verify by direct computation that

AB = A_{11} B_{11} + A_{12} B_{21}.
Exercise 3 (Block Addition in General)

Let A, B \in \mathbb{F}^{m \times n} be partitioned into the same block pattern, with corresponding blocks A_{\alpha\beta} and B_{\alpha\beta} of matching size for every (\alpha, \beta). Prove that

A + B = [A_{\alpha\beta} + B_{\alpha\beta}]_{\alpha,\, \beta = 1}^{r,\, s}, \qquad A - B = [A_{\alpha\beta} - B_{\alpha\beta}]_{\alpha,\, \beta = 1}^{r,\, s}.
Exercise 4 (Four Coplanar Points)

Consider the four points

P=(0,0,0), \qquad Q=(0,1,2), \qquad R=(1,-2,3), \qquad S=(1,0,7).
  1. Using difference vectors, show that no three of these points are collinear.

  2. Show that the four points all lie in the same plane in \mathbb{R}^3 by finding an equation for such a plane.

Exercise 5 (Block Multiplication in General)

Let A \in \mathbb{F}^{m \times l} carry blocks A_{\alpha k} of size m_\alpha \times l_k for 1 \le \alpha \le r and 1 \le k \le t, and let B \in \mathbb{F}^{l \times n} carry blocks B_{k\beta} of size l_k \times n_\beta for 1 \le k \le t and 1 \le \beta \le s, so that the column-cut of A matches the row-cut of B. Prove that

AB = \left[ \sum_{k = 1}^{t} A_{\alpha k} B_{k\beta} \right]_{\alpha,\, \beta = 1}^{r,\, s}.
Exercise 6 (Polynomials in an Involutory Matrix)

Let A \in \mathbb{F}^{n \times n} be involutory in the sense of Lesson 3PM, so that A^2 = I_n, and assume \tfrac{1}{2} \in \mathbb{F}. Using the even and odd power formulas for an involutory matrix from Lesson 3PM, prove that for every scalar polynomial

p(\lambda) = a_0 + a_1 \lambda + a_2 \lambda^2 + \cdots + a_l \lambda^l,

the matrix polynomial collapses to

p(A) = \tfrac{1}{2}\bigl(p(1) + p(-1)\bigr) I_n + \tfrac{1}{2}\bigl(p(1) - p(-1)\bigr) A.

Compare the closed form with the earlier idempotent problem above, and describe explicitly the family of matrices of the form

\alpha I_n + \beta A

in which p(A) lies.

Exercise 7 (Angles Between Planes)

Let P be the plane

2x-2y-z=3,

and let Q be the plane containing the three non-collinear points

(1,0,2), \qquad (3,0,-1), \qquad (3,1,2).
  1. Find the cosine of the angle between the xy-plane and the plane P.

  2. Find the cosine of the angle between the xy-plane and the plane Q.

  3. Explain in terms of normal vectors why P and Q are not parallel, and find a parametric form for their line of intersection.

Exercise 8 (A Polynomial on a Block-Diagonal Matrix with Bidiagonal Blocks)

Let

p(\lambda) = \lambda^3 + \lambda^2 + 2\lambda + 2

and suppose

A \in \mathbb{C}^{6 \times 6}

has the block-diagonal form

A = \begin{bmatrix} \lambda_0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \mu & 1 & 0 & 0 & 0 \\ 0 & 0 & \mu & 0 & 0 & 0 \\ 0 & 0 & 0 & \nu & 1 & 0 \\ 0 & 0 & 0 & 0 & \nu & 1 \\ 0 & 0 & 0 & 0 & 0 & \nu \end{bmatrix},

with one 1 \times 1 block, one 2 \times 2 upper-bidiagonal block at \mu, and one 3 \times 3 upper-bidiagonal block at \nu. Compute the relevant powers of each block directly, and using block multiplication and block addition from the theorems above prove that

p(A) = \begin{bmatrix} p(\lambda_0) & 0 & 0 & 0 & 0 & 0 \\ 0 & p(\mu) & p'(\mu) & 0 & 0 & 0 \\ 0 & 0 & p(\mu) & 0 & 0 & 0 \\ 0 & 0 & 0 & p(\nu) & p'(\nu) & \tfrac{1}{2} p''(\nu) \\ 0 & 0 & 0 & 0 & p(\nu) & p'(\nu) \\ 0 & 0 & 0 & 0 & 0 & p(\nu) \end{bmatrix}.

Here

p'(\lambda) = 3\lambda^2 + 2\lambda + 2, \qquad p''(\lambda) = 6\lambda + 2,

the first and second ordinary derivatives of the scalar polynomial pp. If you are taking MA1B alongside this course, these formulas should already look familiar; if not, you may simply use them here as given.

Exercise 9 (Powers of a Block-Triangular Matrix)

Let

A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & I \end{bmatrix}

be symmetrically partitioned, with A_{11} square and I the identity block of matching order. Using the block-multiplication theorem above together with induction on n, prove that for every positive integer n,

A^n = \begin{bmatrix} A_{11}^n & p_n(A_{11}) A_{12} \\ 0 & I \end{bmatrix},

where p_n(\lambda) = \dfrac{\lambda^n - 1}{\lambda - 1} = 1 + \lambda + \lambda^2 + \cdots + \lambda^{n - 1} is the polynomial coming from the geometric-series identity, evaluated at A_{11} in the sense of the definition of a polynomial in a matrix above.