modules
Module theory follows on the heels of ring theory, in the sense that modules are exactly vector spaces defined over rings. Module theory makes up a large portion of representation theory and character theory, and touches a bit of category theory (which we might go over.)
Here’s my notes on module theory.
-
December 27, 2023.
Exploration 1: Modules and vector spaces
Questions:
- How do we use matrices to represent mathematical objects?
This entire exploration is actually going to be about matrices. Specifically, we’re going to talk about R-matrices, which are matrices with entries in a ring R. It turns out the study of R-matrices subsumes the study of modules, which we will see by the end of this exploration.
What are modules? It is easiest to start with free modules Rn, which are additive abelian groups of R-vectors, which are row or column vectors whose n entries are in R, where addition and scalar multiplication are defined in the usual way:
$$\begin{pmatrix}a_1\\\vdots\\a_n\end{pmatrix}+\begin{pmatrix}b_1\\\vdots\\b_n\end{pmatrix}=\begin{pmatrix}a_1+b_1\\\vdots\\a_n+b_n\end{pmatrix}\qquad\text{and}\qquad r\begin{pmatrix}a_1\\\vdots\\a_n\end{pmatrix}=\begin{pmatrix}ra_1\\\vdots\\ra_n\end{pmatrix}\quad\text{for }r\in R$$
These should be familiar from linear algebra, only that the entries of R-vectors are elements in the ring R. In this context, elements of R are called scalars and are not in Rn themselves — only R-vectors are in Rn. Since Rn is composed of R-vectors, it is straightforward to describe n×m R-matrices as maps Rm→Rn, like in linear algebra:
$$\begin{pmatrix}r_{11}&r_{12}&\cdots&r_{1m}\\r_{21}&r_{22}&\cdots&r_{2m}\\\vdots&\vdots&\ddots&\vdots\\r_{n1}&r_{n2}&\cdots&r_{nm}\end{pmatrix}\begin{pmatrix}v_1\\v_2\\\vdots\\v_m\end{pmatrix}=\begin{pmatrix}\sum_{i=1}^m r_{1i}v_i\\\sum_{i=1}^m r_{2i}v_i\\\vdots\\\sum_{i=1}^m r_{ni}v_i\end{pmatrix}$$
where the LHS vector is in Rm and the RHS vector is in Rn.
Theorem: Every ring R is a free module over itself.Every element in the ring can be treated as a scalar or a (one-dimensional) R-vector, so scalar multiplication is given by multiplication in R.
This theorem introduces some ambiguity: writing R could either mean the ring R, or the one-dimensional module over R. For clarity, here we write R when we mean the ring, and we always write R-modules using the letters M or N or V.
In this section, we describe R-modules beyond free modules.
R-modules are more general than free modules Rn, however. They are additive abelian groups of elements v (not necessarily R-vectors!), where elements r∈R come into play by defining a scalar multiplication r⋅v governed by the following laws:
- Scalar distributivity: r⋅(v+w)=rv+rw
- Vector distributivity: (r1+r2)⋅v=r1⋅v+r2⋅v
- Scalar associativity: r1⋅(r2⋅v)=(r1⋅r2)v
- Scalar unit law: 1⋅v=v
Theorem: The Z-modules are exactly the abelian groups.- (→) Every R-module is an additive abelian group by definition.
- (←) If you have an additive abelian group, you can define scalar multiplication n⋅a as the sum of n copies of a. This satisfies the laws above, so we obtain a Z-module.
Like with groups, rings, and fields, we can define a couple concepts for R-modules:
- A submodule W of an R-module M is the same as subgroup/subring/subfield – it’s just an R-module that is a subset of another R-module. The zero module {0} is always a submodule of M, and so is M itself.
- The quotient module M/N is always defined – there is no need for N to be a special kind of module the same way you need ideals for quotient rings and normal subgroups for quotient groups. This is because any submodule N is closed under addition and scalar multiplication, so the coset operations (m+N)+(m′+N)=(m+m′)+N and r⋅(m+N)=rm+N are always well-defined.
- The direct product of two R-modules is called the direct sum N⊕M. It works identically to the direct product (elements are pairs (n,m) with n∈N and m∈M) except we write ⊕ instead of ×, referring to the additive nature of R-modules.
In this section, we describe how to construct R-modules.
Just like how groups and rings can be finitely generated, we can construct an R-module M out of a generating set of elements {v1,v2,…,vn}, known as a spanning set for M. We can write M=span({v1,v2,…,vn}), and say that M consists of R-linear combinations of its spanning set {v1,v2,…,vn}, i.e. M is the set $$M=\operatorname{span}(\{v_1,v_2,\dots,v_n\})=\left\{\textstyle\sum_i r_iv_i\right\}$$ where the coefficients ri are elements of R, and each sum is finite (though the spanning set can be infinite).
No matter the spanning set, it is always possible to generate the zero element of the R-module — just take the R-linear combination where all the coefficients ri are zero. Of interest is whether this is the only way to generate the zero element. If there is another way to generate the zero element, i.e. using a nonzero coefficient somewhere, then we say that the spanning set is linearly dependent. Otherwise, there is only one way to generate the zero vector, and we have a linearly independent spanning set, also known as a basis.
If we consider spanning sets as a map Rn→M from a n-tuple of coefficients in R to elements of the R-module, then a spanning set that is a basis is interesting. This is because all spanning sets are trivially surjective (the spanning set generates the module), and if the spanning set is a basis, then there is only one way to generate the zero vector — thus the kernel is trivial, thus the map is injective and bijective. This bijection between n-tuples of coefficients and elements of an R-module means that elements of the R-module are essentially n-tuples of coefficients, also known as R-vectors. Sound familiar?
Theorem: A free module Rn is exactly an R-module generated by a basis.- Every spanning set gives rise to a surjective map Rn→M from n-tuples of coefficients to elements of the module.
- If the spanning set is a basis, then only the zero tuple maps to the zero element, i.e. the map has a trivial kernel and is injective. Therefore a basis represents a bijection between n-tuples of coefficients, i.e. R-vectors, and elements of the module: $$\begin{pmatrix}a_1\\\vdots\\a_n\end{pmatrix}\ \text{represents}\ a_1v_1+a_2v_2+\dots+a_nv_n$$
- Thus the elements of the R-module can be expressed as R-vectors, which is the definition of a free module Rn.
For a free module Rn, the standard basis ei is defined as the columns of the n×n identity matrix In: $$e_1=\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix},\quad e_2=\begin{pmatrix}0\\1\\\vdots\\0\end{pmatrix},\quad\dots,\quad e_n=\begin{pmatrix}0\\\vdots\\0\\1\end{pmatrix}$$
Linear algebra is exactly when R is a field F. In fact, F-modules are exactly F-vector spaces. We will draw parallels to linear algebra extensively throughout this exploration.
Theorem: Every spanning set of a F-module V contains a basis, if F is a field.- Let {v1,v2,…,vn} be an arbitrary spanning set for V. If this spanning set is linearly independent, it is a basis and we are done. Otherwise, assume that {v1,v2,…,vn} is linearly dependent.
- If {v1,v2,…,vn} is linearly dependent, then zero can be represented by some F-linear combination ∑irivi=0 with a nonzero coefficient. WLOG let r1 be one of the nonzero coefficients.
- Since we're working in a field F, r1 has a multiplicative inverse r1⁻¹. Then the corresponding v1 can be written as a F-linear combination of the other elements: $$0=r_1v_1+r_2v_2+\dots+r_nv_n\implies v_1=-r_1^{-1}r_2v_2-\dots-r_1^{-1}r_nv_n$$
- Since v1 can be represented as a F-linear combination of the others, its contribution to the spanning set is redundant. We can remove v1 from the spanning set because {v1,v2,…,vn} and {v2,…,vn} generate the same module.
- If the resulting spanning set {v2,…,vn} is not a basis, we can repeat this process to further decrease the size of the spanning set. This process terminates either at a basis or at the empty set, which is trivially a basis (for the zero module). Thus the original spanning set {v1,v2,…,vn} contains a basis for V.
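As a quick sanity check (my own added example, not part of the original argument), note that the field hypothesis really is needed. Over the ring Z:
$$\{2,3\}\subseteq\mathbb{Z}\ \text{spans}\ \mathbb{Z}\ \text{(since }3-2=1\text{)},\qquad\text{but}\qquad 3\cdot 2+(-2)\cdot 3=0$$
so {2,3} is a linearly dependent spanning set of the Z-module Z, yet it contains no basis: {2} only spans 2Z, {3} only spans 3Z, and {2,3} itself is dependent. So over a general ring, a spanning set need not contain a basis.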
In this section, we start exploring R-matrices for real.
At the beginning we mentioned that this entire exploration is going to be about studying R-matrices. One big reason is because R-matrices are homomorphisms of free R-modules. For instance:
$$A=\begin{pmatrix}3&2&-4\\-1&4&0\\1&1&-1\end{pmatrix}$$
is a 3×3 R-matrix. Left multiplication by A corresponds to a homomorphism φ:R3→R3.
$$B=\begin{pmatrix}2&4&1&3&8\\5&8&8&1&2\\1&0&3&4&2\end{pmatrix}$$
is a 3×5 R-matrix. Left multiplication by B corresponds to a homomorphism φ:R5→R3. Typically we just refer to the R-matrix itself as the homomorphism, falling back to φ when we're talking about homomorphisms in the abstract.
Our first question is, which R-matrices correspond to isomorphisms?
This translates to the question: when are R-matrices invertible? This requires the concept of a determinant, which should be familiar from linear algebra. For n×n (square) R-matrices, the determinant det(A) is the unique function from n×n R-matrices to R that is:
- Multilinear: linear in each row and column.
- Alternating: Swapping two rows or columns swaps the sign, which also implies that two identical rows/columns makes the function zero (since only 0 is invariant under swapping those rows/columns).
- Equal to 1 for the identity matrix I.
This unique function is $$\det(A)=\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n a_{i,\sigma(i)}$$ which:
- Is multilinear: each term of the sum is a product $\prod_{i=1}^n a_{i,\sigma(i)}$ containing exactly one entry from each row i and each column σ(i), so the sum is linear in each row and column.
- Is alternating: since a property of sgn(σ) is that it switches sign when you swap two elements in the permutation (i.e. you swap two rows or columns).
- Is 1 for identity: because the summand ∏i=1nai,σ(i) is only nonzero for the identity permutation σ=1 where sgn(σ)=1.
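To make the formula concrete, here is a minimal Python sketch (my own illustration, with illustrative names like `perm_sign` and `matrix_A`) that evaluates the permutation-sum formula directly; `matrix_A` is the 3×3 example matrix A introduced above.

```python
from itertools import permutations
from math import prod

def perm_sign(p):
    """Sign of a permutation p given as a tuple of 0-indexed images: (-1)^(number of inversions)."""
    return (-1) ** sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])

def det(A):
    """Leibniz formula: det(A) = sum over sigma in S_n of sgn(sigma) * prod_i A[i][sigma(i)]."""
    n = len(A)
    return sum(perm_sign(p) * prod(A[i][p[i]] for i in range(n)) for p in permutations(range(n)))

matrix_A = [[3, 2, -4],
            [-1, 4, 0],
            [1, 1, -1]]
print(det(matrix_A))  # 6
```

The same code works for entries in any ring whose arithmetic Python can model (here, integers).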
Although the above shows that the given function is a determinant, the proof that it is the unique such function requires some more advanced tools we'll introduce later. Let's first determine when a square R-matrix is invertible.
Therefore: We can write the determinant recursively as $\det(A)=\sum_{j=1}^n a_{ij}C_{ij}$, where $C_{ij}=(-1)^{i+j}\det(M_{ij})$, for an arbitrary row index i. This is known as the cofactor expansion of the determinant along the row i.
Recall the definition of the determinant from above:
$$\det(A)=\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n a_{i,\sigma(i)}$$
The determinant det(A) can be defined recursively. First, fix a row index i and factor the single entry $a_{i,\sigma(i)}$ out of each product:
$$\det(A)=\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\,a_{i,\sigma(i)}\prod_{k=1,\,k\neq i}^n a_{k,\sigma(k)}$$
Now note that since the sum goes over every permutation σ∈Sn, the index i gets sent to every value j from 1 to n, exactly (n−1)! times each (there are (n−1)! permutations of the remaining n−1 values, hence (n−1)! permutations in Sn with σ(i)=j). Grouping the sum by the value j=σ(i) lets us pull $a_{i,\sigma(i)}$ out as $a_{ij}$:
$$\det(A)=\sum_{j=1}^n a_{ij}\sum_{\substack{\sigma\in S_n\\\sigma(i)=j}}\operatorname{sgn}(\sigma)\prod_{k=1,\,k\neq i}^n a_{k,\sigma(k)}$$
Note that the inner sum
$$\sum_{\substack{\sigma\in S_n\\\sigma(i)=j}}\operatorname{sgn}(\sigma)\prod_{k=1,\,k\neq i}^n a_{k,\sigma(k)}$$
is very close to the formula for the determinant of the matrix with row i removed (since $a_{i,\sigma(i)}$ is excluded from the product) and column j removed (since σ sends i to j, the column j only appears as $a_{i,\sigma(i)}=a_{ij}$, which is excluded along with row i).
Define the minor $M_{ij}$ as the matrix A with row i and column j removed. The inner sum is almost $\det(M_{ij})$ as defined above, except it ranges over $S_n$ instead of $S_{n-1}$, the actual size of the minor. To correct this, take permutations τ∈Sn−1 acting on the entries $b_{kl}$ of the minor, where each τ corresponds to the σ∈Sn with σ(i)=j that permutes the remaining rows and columns the same way. To convert sgn(σ) to sgn(τ), move row i to the top of the matrix, which takes i−1 transpositions and multiplies the determinant by (−1)^(i−1) by the alternating property; similarly, moving column j to the left multiplies it by (−1)^(j−1). This makes the rows and columns of the minor contiguous, so the permutations in Sn−1 use the correct indices. Overall, sgn(σ) and sgn(τ) differ by a factor of (−1)^(i+j−2)=(−1)^(i+j), and thus:
$$\sum_{\substack{\sigma\in S_n\\\sigma(i)=j}}\operatorname{sgn}(\sigma)\prod_{k=1,\,k\neq i}^n a_{k,\sigma(k)}=(-1)^{i+j}\sum_{\tau\in S_{n-1}}\operatorname{sgn}(\tau)\prod_{k=1}^{n-1} b_{k,\tau(k)}=(-1)^{i+j}\det(M_{ij})$$
The value $(-1)^{i+j}\det(M_{ij})$ is known as the cofactor $C_{ij}$ of the matrix A.
Finally, we can plug this back into our original expression for det(A) to get: $$\det(A)=\sum_{j=1}^n a_{ij}C_{ij}$$ for some fixed row index i.
The cofactor expansion can be used to directly define the inverse of an R-matrix.
Therefore: $A^{-1}=\frac{\operatorname{adj}(A)}{\det(A)}$, which only exists when det(A) is a unit in the ring.
Recall the cofactor expansion:
$$\det(A)=\sum_{j=1}^n a_{ij}C_{ij}$$
Notice how similar it is to matrix multiplication:
$$[A\cdot B]_{ik}=\sum_{j=1}^n a_{ij}b_{jk}$$
In fact, let adj(A) be the adjugate matrix of A, where each entry $b_{ij}$ is equal to $C_{ji}$ (note the swapped indices). Then we have exactly
$$[A\cdot\operatorname{adj}(A)]_{ik}=\sum_{j=1}^n a_{ij}b_{jk}=\sum_{j=1}^n a_{ij}C_{kj}=\begin{cases}\det(A)&\text{if }i=k\\0&\text{if }i\neq k\end{cases}=[\det(A)\cdot I]_{ik}$$
The reason that $\sum_{j=1}^n a_{ij}C_{kj}$ is zero when i≠k is because if you copy the values in row i to row k, resulting in a matrix A′, the sum becomes
$$\sum_{j=1}^n a'_{kj}C_{kj}=\det(A')$$
where each $C_{kj}$ is unchanged since it is calculated on a minor with the row k removed. But since A′ has two identical rows, then by the alternating property of determinants, det(A′)=0. Thus the sum is zero.
So we have proved the following:
$$A\cdot\operatorname{adj}(A)=\det(A)\cdot I$$
Note that if det(A) is a unit in the ring R, then we have
$$A\cdot\frac{\operatorname{adj}(A)}{\det(A)}=I$$
implying that $\frac{\operatorname{adj}(A)}{\det(A)}$ is a right inverse of A. A similar argument shows that $\frac{\operatorname{adj}(A)}{\det(A)}$ is a left inverse of A as well.
Corollary: An R-matrix is invertible iff its determinant is a unit in R.If det(A) is a unit, the formula above produces the inverse matrix (the adjugate divided by the determinant), and you can only divide by units in a ring. Conversely, if A is invertible, then det(A)det(A⁻¹)=det(I)=1 (using the product rule for determinants proved below), so det(A) must be a unit.
This should be familiar from linear algebra, where an F-matrix is invertible iff its determinant is nonzero. Of course, this is exactly because the only nonunit in a field F is zero. For an example that isn't a field, note that the units of Z are ±1. Thus a Z-matrix is invertible iff its determinant is ±1.
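For a quick concrete check over Z (a small example of my own):
$$\det\begin{pmatrix}2&1\\1&1\end{pmatrix}=1,\qquad\begin{pmatrix}2&1\\1&1\end{pmatrix}^{-1}=\operatorname{adj}\begin{pmatrix}2&1\\1&1\end{pmatrix}=\begin{pmatrix}1&-1\\-1&2\end{pmatrix}$$
which is again a Z-matrix, while $\begin{pmatrix}2&0\\0&1\end{pmatrix}$ has determinant 2, not a unit in Z, and its would-be inverse $\begin{pmatrix}1/2&0\\0&1\end{pmatrix}$ fails to have entries in Z.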
For us, invertible matrices are significant because, being reversible, their corresponding homomorphism is an isomorphism. More importantly, this means multiplying an R-matrix A by an invertible matrix P is the same as composing an isomorphism τ to the R-module homomorphism σ corresponding to A. Right-multiplying by P is precomposition by τ, and left-multiplying by P is postcomposition by τ. As we know, isomorphisms are essentially renamings, so composing a homomorphism by isomorphisms gives you essentially the same homomorphism.
So we have proved the existence of, and a formula for, the inverse of an R-matrix. To build on this, let's see how invertible matrices let us manipulate R-matrices for the rest of this exploration.
We introduce the three row and column operations.
First, we may swap two rows of a matrix by left-multiplying by a suitable elementary matrix. $$\begin{pmatrix}1&0&0\\0&0&1\\0&1&0\end{pmatrix}\begin{pmatrix}1&1&1\\2&2&2\\3&3&3\end{pmatrix}=\begin{pmatrix}1&1&1\\3&3&3\\2&2&2\end{pmatrix}$$
Second, we may multiply any row by a unit (here, assume 10 is a unit) by left-multiplying by another kind of elementary matrix. $$\begin{pmatrix}10&0&0\\0&1&0\\0&0&1\end{pmatrix}\begin{pmatrix}1&1&1\\2&2&2\\3&3&3\end{pmatrix}=\begin{pmatrix}10&10&10\\2&2&2\\3&3&3\end{pmatrix}$$
Third, we may add a multiple of any row to another row (here, adding 10 times row 1 to row 3), again by left-multiplying by a third kind of elementary matrix. $$\begin{pmatrix}1&0&0\\0&1&0\\10&0&1\end{pmatrix}\begin{pmatrix}1&1&1\\2&2&2\\3&3&3\end{pmatrix}=\begin{pmatrix}1&1&1\\2&2&2\\13&13&13\end{pmatrix}$$
Column operations use the same (transposed) elementary matrices, except you right-multiply instead of left-multiply.
Obviously these operations can be undone by an operation of the same kind – you can un-swap by swapping again, un-scale by scaling by the inverse unit, and un-add the row cri by adding −cri. This implies that the elementary matrices are invertible. In fact, the inverse of an elementary matrix represents the inverse operation.
When you apply row and column operations to a matrix, you’re essentially doing a clever renaming of the domain (column operations) and codomain (row operations). The resulting matrix represents the same homomorphism but on the renamed domain/codomain.
How do the row and column operations affect the determinant? We can give a short proof of each:
Theorem: Swapping two rows flips the sign of the determinant.This is exactly the alternating property of determinants.
Theorem: Scaling a row by a unit will scale the determinant by the same unit.This follows immediately from the multilinear property of determinants.
Theorem: Adding a multiple of a row to another row does not change the determinant.If row ri becomes row ri+crj, then by multilinearity, det(matrix where ri=ri+crj)=det(original matrix where ri=ri)+c⋅det(matrix where ri=rj) But the last matrix has two identical rows ri and rj, so by the alternating property its determinant is zero. Therefore the resulting matrix has determinant equal to that of the original matrix.
Corollary: Row and column operations multiply the determinant by a unit.Follows immediately from the above three theorems, which show that the three row and column operations multiply the determinant by −1,c,1 respectively (c a unit).
Theorem: The determinant of a product of matrices det(AB) is the product of their individual determinants det(A)det(B).Using the formula for the determinant of AB, we have
$$\det(AB)=\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n (AB)_{i,\sigma(i)}=\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n\sum_{k=1}^n a_{ik}b_{k,\sigma(i)}$$
Since we're multiplying together a product of sums $\prod_i\sum_k$, by the distributive property the result is the sum over all ways of picking one term from each factor and multiplying them together, just like how (a+b)(c+d)=ac+ad+bc+bd. Each way of picking corresponds to a choice of k for each i, i.e. a function κ:i↦k:
$$\det(AB)=\sum_{\kappa}\left(\prod_{i=1}^n a_{i,\kappa(i)}\right)\left(\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n b_{\kappa(i),\sigma(i)}\right)$$
If κ is not injective, the inner sum is the Leibniz formula applied to a matrix with two identical rows, which is zero by the alternating property; so only the bijective choices κ=τ∈Sn survive:
$$\det(AB)=\sum_{\tau\in S_n}\left(\prod_{i=1}^n a_{i,\tau(i)}\right)\left(\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n b_{\tau(i),\sigma(i)}\right)$$
We can reorder the last product by applying τ⁻¹ to the index i:
$$\det(AB)=\sum_{\tau\in S_n}\left(\prod_{i=1}^n a_{i,\tau(i)}\right)\left(\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n b_{i,\sigma(\tau^{-1}(i))}\right)$$
Such a reindexing (replacing σ by σ∘τ⁻¹) multiplies each sgn(σ) by sgn(τ):
$$\det(AB)=\sum_{\tau\in S_n}\operatorname{sgn}(\tau)\left(\prod_{i=1}^n a_{i,\tau(i)}\right)\left(\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n b_{i,\sigma(i)}\right)=\left(\sum_{\tau\in S_n}\operatorname{sgn}(\tau)\prod_{i=1}^n a_{i,\tau(i)}\right)\left(\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n b_{i,\sigma(i)}\right)=\det(A)\det(B)$$
In this section, we show how row and column operations can be used to simplify a matrix.
Therefore: If R is a Bézout domain, then any R-matrix A can be reduced to Smith normal form, a diagonal matrix where each nonzero entry divides the next.Using row and column operations, you can always reduce an R-matrix down into the block form $\begin{pmatrix}D&0\\0&0\end{pmatrix}$, where D is a diagonal R-matrix, i.e. all its nonzero entries are on the main diagonal.
There is a way to get D into the form such that the diagonal elements d1,d2,… divide each other: d1∣d2∣d3∣…. This form is called Smith normal form. Any R-matrix can be reduced to Smith normal form, provided R is a Bézout domain, which is an integral domain in which every sum of principal ideals is principal. For instance, in PIDs every ideal is principal, so PIDs are Bézout domains.
Theorem: Bézout domains define a GCD d=gcd(a,b,…) and have Bézout’s identity: for every {a,b,…}, there exist {x,y,…} where ax+by+…=gcd(a,b,…).- The fact that every sum of principal ideals is principal
(a)+(b)+…=(d)
implies two things:
- First, since (d) contains each of {(a),(b),…}, we know that d is a common divisor of {a,b,…}.
- Second, (a)+(b)+…=(d) consists of all the linear combinations of {a,b,…}. Since a common divisor of {a,b,…} must divide every linear combination of those elements, i.e. every element of (d), every common divisor must divide d in particular.
- From this, we can conclude two things:
- Since every common divisor divides d, it is in fact the greatest common divisor gcd(a,b,…).
- gcd(a,b,…) is some linear combination of {a,b,…}, i.e. we have Bézout’s identity ax+by+…=gcd(a,b,…) for some {x,y,…}.
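For a concrete instance in Z (a quick example of my own): the sum of principal ideals (6)+(10) is the principal ideal (2), and correspondingly
$$\gcd(6,10)=2=6\cdot 2+10\cdot(-1)$$
is a Z-linear combination of 6 and 10, exactly as Bézout's identity promises.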
Method: To put a matrix into Smith normal form, choose d1 to be the GCD of all its entries. Then use row/column operations to isolate d1 in the upper left such that its row and column are zero except for d1, getting a block matrix
$$\begin{pmatrix}d_1&0&\cdots&0\\0&&&\\\vdots&&B&\\0&&&\end{pmatrix}$$
where B is the minor M11, every entry of which is divisible by d1 because of how we picked d1. Because of Bézout’s identity, it’s always possible to isolate the GCD using row and column operations this way. Repeating the process for the minor B gives you Smith normal form. Here’s a short example:
$$\begin{aligned}
\begin{pmatrix}2&-1\\1&2\end{pmatrix}
&=\begin{pmatrix}1&-1\\3&2\end{pmatrix}\begin{pmatrix}1&0\\1&1\end{pmatrix}^{-1}
&&\text{add col 2 to col 1}\\
&=\begin{pmatrix}1&0\\3&5\end{pmatrix}\left(\begin{pmatrix}1&0\\1&1\end{pmatrix}\begin{pmatrix}1&1\\0&1\end{pmatrix}\right)^{-1}
&&\text{add col 1 to col 2}\\
&=\begin{pmatrix}1&0\\3&5\end{pmatrix}\begin{pmatrix}1&1\\1&2\end{pmatrix}^{-1}\\
&=\begin{pmatrix}1&0\\-3&1\end{pmatrix}^{-1}\begin{pmatrix}1&0\\0&5\end{pmatrix}\begin{pmatrix}1&1\\1&2\end{pmatrix}^{-1}
&&\text{subtract 3(row 1) from row 2}
\end{aligned}$$
Taking the Smith normal form of a matrix A can be written A=UA′V, where U,V are the elementary matrices used to reduce A to the diagonal matrix A′. The entries di along the diagonal of A′ are called the invariant factors of A, because you will always arrive at these factors in the Smith normal form regardless of how you do the row and column operations.
Theorem: The invariant factors of an R-matrix A are unique up to units.At each step, the invariant factor di is chosen as the GCD of the remaining elements, and since GCDs are unique up to units, the invariant factors are unique up to units.
Theorem: The determinant of a diagonal R-matrix is equal to the product of its entries.In the definition of the determinant,
$$\det(A)=\sum_{\sigma\in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n a_{i,\sigma(i)}$$
you can see that unless σ(i)=i for each i, the product will include a zero factor that makes the whole product zero, because aij=0 for i≠j in a diagonal matrix. Thus the only σ that results in a nonzero term is the identity permutation, and so for diagonal matrices the formula simplifies down to
$$\det(A)=\prod_{i=1}^n a_{ii}$$
Corollary: The determinant of an R-matrix is equal to the product of its invariant factors, up to units.This follows immediately from the fact that the determinant of a matrix equals (up to units) the determinant of its Smith normal form, which is equal to the product of the diagonal entries, which are the invariant factors.
Theorem: The invariant factors of a matrix A are all ones iff A is invertible.If A is invertible, then its determinant is a unit, which can only be a product of units. But since the determinant is the product of invariant factors up to units, that’s the same as saying every invariant factor is a unit, and therefore associate to 1.
Corollary: The Smith normal form of an invertible matrix is the identity matrix.
Corollary: Every invertible matrix over a Bézout domain R factors into elementary matrices.For matrices over a Bézout domain, the Smith normal form is obtained via row and column operations, a process that only factors out corresponding elementary matrices. By the previous theorem, the resulting Smith normal form of an invertible matrix is the identity matrix, which is also an elementary matrix. Thus the original matrix factors into elementary matrices.
In this section, we abstractly represent relationships between R-modules via R-matrices.
We already know that an n×m R-matrix M defines a homomorphism Rm→Rn. Let’s bring back our example matrix A and explore some properties of this homomorphism:
$$A=\begin{pmatrix}3&2&-4\\-1&4&0\\1&1&-1\end{pmatrix}$$
First, the columns of an R-matrix M generate its image (also known as column space), which is written
$$\operatorname{im}M=MR^m=\operatorname{span}(\text{columns of }M)$$
For instance with A above,
$$\operatorname{im}A=AR^3=\operatorname{span}\left(\begin{pmatrix}3\\-1\\1\end{pmatrix},\begin{pmatrix}2\\4\\1\end{pmatrix},\begin{pmatrix}-4\\0\\-1\end{pmatrix}\right)$$
Second, the solutions to AX=0 comprise the kernel (also known as null space) of A. The kernel of a diagonal matrix D is relatively straightforward. The equation
$$\begin{pmatrix}d_1&&\\&d_2&\\&&\ddots\end{pmatrix}\begin{pmatrix}x_1\\x_2\\\vdots\end{pmatrix}=0$$
is equivalent to the system of equations
$$d_1x_1=0,\quad d_2x_2=0,\quad\dots$$
In an integral domain, xi must be zero when di≠0, and xi can take on any value when di=0. This means that the kernel of a diagonal matrix is span(ei,ej,…) where {i,j,…} are the indices i where di=0. In particular, if there are no zeroes along the main diagonal of a diagonal matrix, then its kernel is trivial.
Theorem: An R-matrix defines an injective R-module homomorphism iff its kernel is trivial.- (→) If A:Rm→Rn is injective, then AX=AY implies X=Y. In particular, elements X of the kernel, where AX=0, must be zero because AX=0⟹AX=A0⟹X=0.
- (←) If A:Rm→Rn has a trivial kernel, then consider AX=AY, which implies AX−AY=0 and A(X−Y)=0. Since the kernel is trivial, we have X−Y=0 and thus X=Y, showing that A is injective.
Theorem: An R-matrix defines a surjective R-module homomorphism iff its image is the codomain.By definition. Since surjective means the whole codomain is mapped to, it’s the same as saying that the image is equal to the codomain.
Corollary: Since the columns of A:Rm→Rn generate im A by definition, if A is surjective, then its columns generate Rn.
Theorem: The kernel of an R-matrix A over a Bézout domain R is the same as the kernel of its Smith normal form QA′P−1.For example, let’s take the Smith normal form of A:
$$A=QA'P^{-1}=\begin{pmatrix}-3&-2&2\\-1&-1&2\\-2&-1&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&2&0\\0&0&3\end{pmatrix}\begin{pmatrix}1&0&-2\\-3&1&4\\-1&1&1\end{pmatrix}$$
and then solve QA′P−1X=0 with these steps:
$$\begin{aligned}
\begin{pmatrix}-3&-2&2\\-1&-1&2\\-2&-1&1\end{pmatrix}\begin{pmatrix}1&0&0\\0&2&0\\0&0&3\end{pmatrix}\begin{pmatrix}1&0&-2\\-3&1&4\\-1&1&1\end{pmatrix}X&=\begin{pmatrix}0\\0\\0\end{pmatrix}\\
\begin{pmatrix}1&0&0\\0&2&0\\0&0&3\end{pmatrix}\begin{pmatrix}1&0&-2\\-3&1&4\\-1&1&1\end{pmatrix}X&=\begin{pmatrix}0\\0\\0\end{pmatrix}&&\text{left-multiply by }Q^{-1}\\
\begin{pmatrix}1&0&-2\\-3&1&4\\-1&1&1\end{pmatrix}X&=\begin{pmatrix}0\\0\\0\end{pmatrix}&&\text{solve for }P^{-1}X\text{ (the diagonal matrix has trivial kernel)}\\
X&=\begin{pmatrix}0\\0\\0\end{pmatrix}&&\text{left-multiply by }P
\end{aligned}$$
In this case, A has a trivial kernel {0}. You can see that this is precisely because its Smith normal form has no zero entries along the diagonal. We’ll be talking about the kernel in the abstract, defined as the solutions to AX=0. We can see that the kernel is easily calculable when R is a Bézout domain, but it may not be as easy for other rings.
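As a quick numeric sanity check of the factorization written out above (a minimal Python sketch of my own; `matmul`, `Q`, `D`, `P_inv` are just illustrative names):

```python
def matmul(X, Y):
    """Multiply two matrices given as nested lists of ring elements (here, integers)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

Q     = [[-3, -2, 2], [-1, -1, 2], [-2, -1, 1]]   # invertible over Z (determinant 1)
D     = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]         # Smith normal form: 1 | 2 | 3
P_inv = [[1, 0, -2], [-3, 1, 4], [-1, 1, 1]]      # invertible over Z (determinant 1)

print(matmul(matmul(Q, D), P_inv))
# [[3, 2, -4], [-1, 4, 0], [1, 1, -1]]  -- recovers the example matrix A
```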
Theorem: The kernel and image of a homomorphism σ:M→N is a submodule of the domain M and codomain N respectively.- Contains 0: σ must send 0 to 0.
- Closed under addition: because σ(m)+σ(n)=σ(m+n), if both m and n are in the kernel/image so is m+n.
- Closed under scalar multiplication: because rσ(m)=σ(rm), if m is in the kernel/image so is rm.
Like with groups, an R-module is simple if it has no proper non-trivial submodules.
Theorem: Any homomorphism σ:M→N between simple modules M and N is either trivial or an isomorphism.Since kernel and image must both be submodules of M and N respectively, and simple modules have only two possible choices for submodules (trivial submodule and the module itself), the only possible homomorphisms are when the kernel is trivial (implying that the image is N, therefore an isomorphism) and when the kernel is M (implying that the image is trivial, therefore the trivial homomorphism).
In this section, we describe properties of R-modules with only diagrams.
Here we’re going to represent R-matrices abstractly as homomorphisms σ on R-modules.
Consider two homomorphisms σ:A→B, τ:B→C where the image of σ is the kernel of τ. In other words, everything σ maps to will get mapped to 0 by τ. This property is called exactness, and we say this sequence is exact at B. Any sequence of homomorphisms $A\xrightarrow{\sigma}B\xrightarrow{\tau}C\xrightarrow{\upsilon}\cdots$ with the property that “the image of one homomorphism is the kernel of the next” is called an exact sequence.
In fact, we have discussed the concept of exact sequences before for groups. Exact sequences for groups and exact sequences for R-modules are precisely the same concept, but since R-modules have a richer structure, we may use exact sequences in ways that we cannot do for groups.
To recap, we showed the following for exact sequences for groups. These are true for R-modules as well:
- Theorem: The exact sequence $\{0\}\xrightarrow{\sigma}B\xrightarrow{\tau}C$ implies τ is injective.
- Theorem: The exact sequence $A\xrightarrow{\sigma}B\xrightarrow{\tau}\{0\}$ implies σ is surjective.
- First Isomorphism Theorem: Given σ:M→N, M/ker σ≅im σ.
- An exact sequence of the form $\{0\}\to A\xrightarrow{\sigma}B\xrightarrow{\tau}C\to\{0\}$ is known as a short exact sequence, with the following properties:
- Theorem: A is isomorphic to a submodule of B.
- Theorem: If σ is an inclusion map, then C≅B/A.
- Splitting lemma: If there is a left inverse homomorphism σ−1:B→A, or if there is a right inverse homomorphism τ−1:C→B, then the sequence splits and we have B≅A⊕C.
- A short exact sequence describes an embedding of A into B, and then C≅B/A captures the structure that A doesn’t account for. When σ or τ has an inverse, then the splitting lemma says you can directly piece together the structures A and C to obtain B, via A⊕C≅B.
For an example of how exact sequences are used, we can look at projections. A projection is an idempotent endomorphism, i.e. a homomorphism π:M→M such that π2=π.
Theorem: A module M can be decomposed into the direct sum M=ker π⊕im π iff a projection π:M→M exists.- (→) Any direct sum A⊕B has a canonical projection onto each factor (e.g. (a,b)↦(a,0)), so the forward direction is trivial.
- (←) For any homomorphism like π, we can always write the short exact sequence $\{0\}\to\ker\pi\xrightarrow{\iota}M\xrightarrow{\pi}\operatorname{im}\pi\to\{0\}$ because the kernel of π is exactly the image of the injective inclusion map ι:ker π→M, and π is surjective onto its image im π.
- Since π is idempotent, π(π(x))=π(x) for all x∈M, i.e. π(m)=m for every m∈im π. So π acts as the identity on its image, and the inclusion π′:im π→M is a right inverse of π:M→im π.
- By the splitting lemma, since π:M→im π has a right inverse π′:im π→M, the sequence splits and therefore M=ker π⊕im π.
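A tiny example of my own to make this concrete: take M=R² and the projection π(x,y)=(x,0). Then π²=π, and
$$\ker\pi=\{(0,y)\}\cong R,\qquad\operatorname{im}\pi=\{(x,0)\}\cong R,\qquad R^2=\ker\pi\oplus\operatorname{im}\pi$$
exactly as the theorem predicts.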
In this section, we utilize exact sequences in the context of R-modules specifically.
Take this exact sequence, for instance: $$R^m\xrightarrow{A}R^n\xrightarrow{\sigma}M\to\{0\}$$ This exact sequence is known as a presentation of the R-module M. The idea is that M is entirely described by the presentation, and the presentation is entirely described by the R-matrix A, called the presentation matrix. In other words, it only takes a single R-matrix, A, to define all you need to know about an R-module M. Let’s see why this is.
Theorem: In the above exact sequence, M is isomorphic to Rn/im A, i.e. Rn where every element of the form AX (for X∈Rm) is set to zero.- By the first isomorphism theorem on σ, we have Rn/ker σ≅im σ. However:
- The exact sequence Rn→M→{0} implies that the homomorphism σ:Rn→M is surjective, and thus M is isomorphic to the image of σ. Thus im σ≅M.
- By exactness, im A=ker σ.
- Then we have Rn/im A≅M. This is the same as saying M is Rn, except we send all elements that are linear combinations of the columns of A to 0 (AX=0).
In the context of presentations, the basis of Rn (which is the codomain of A) are known as generators of M, and the columns of A describe relations AX=0 on those generators. So the R-matrix A, by virtue of defining both the generators of M and the relations on those generators, completely determines the R-module M. Because every R-matrix A defines an R-module this way, we say that M≅Rn/im A is the cokernel of A. It is so named because there is a duality between the kernel and the cokernel. Where the kernel is comprised of the elements in the domain sent to zero by A, and a trivial kernel implies injectivity of A, the cokernel is what remains of the codomain after collapsing everything A maps onto (the codomain modulo im A), and a trivial cokernel implies surjectivity of A.
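For example (a quick illustration of my own), the 1×1 Z-matrix (2) gives the presentation
$$\mathbb{Z}\xrightarrow{(2)}\mathbb{Z}\xrightarrow{\sigma}\mathbb{Z}/2\mathbb{Z}\to\{0\}$$
with one generator and the single relation 2v₁=0, so the cokernel is Z/im(2)=Z/2Z. More generally, a diagonal presentation matrix diag(d₁,…,dₙ) presents R/(d₁)⊕⋯⊕R/(dₙ).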
Theorem: An R-matrix defines a surjective homomorphism iff its cokernel is trivial.Surjectivity of A:Rm→Rn means im A=Rn. But then the cokernel Rn/im A is trivial if and only if im A=Rn.
Hopefully this makes clear the reason we only study R-matrices in this exploration — because every R-module can be defined as the cokernel of a suitable R-matrix, studying R-matrices completely subsumes the need to study R-modules directly!
In this section, we describe how to derive the R-matrix that presents a given R-module.
Let’s do the reverse. How do you construct the A-matrix (the presentation matrix) that presents a given (finitely generated) R-module? Here is a step-by-step:
- Step 1: Find its generators vi.
- Step 2: Find the relations among those generators.
- Step 3: Express the relation vectors as columns of the presentation matrix.
- Step 4: Reduce the matrix.
(Note that a non-finitely generated R-module will ask for infinitely many vi, so describing that class of R-modules would require a different process.)
Example:
Let M be a Z-module generated by (v1,v2,v3) under the relations
$$\begin{aligned}3v_1+2v_2+v_3&=0\\8v_1+4v_2+2v_3&=0\\7v_1+6v_2+2v_3&=0\\9v_1+6v_2+v_3&=0\end{aligned}$$
Express each relation as a column of the presentation matrix A:
$$A=\begin{pmatrix}3&8&7&9\\2&4&6&6\\1&2&2&1\end{pmatrix}$$
so that the above system of relations can be summarized as
$$\begin{pmatrix}v_1&v_2&v_3\end{pmatrix}A=\begin{pmatrix}0&0&0&0\end{pmatrix}$$
And theoretically we are done, since A presents M.
However, we can reduce this matrix down a bit. The following simplifying operations also do not change the isomorphism class of the module presented by A:
- Row or column operations.
- Removing a column of zeroes (since that represents the zero relation 0v1+0v2+0v3=0)
- Removing the ith row and jth column whenever the jth column consists of zeros except for one 1 (since that represents the relation vi=0, and so we can remove vi from consideration)
Reducing our matrix above:
$$\begin{aligned}
\begin{pmatrix}3&8&7&9\\2&4&6&6\\1&2&2&1\end{pmatrix}
&\to\begin{pmatrix}0&2&1&6\\0&0&2&4\\1&2&2&1\end{pmatrix}&&r_1-3r_3\to r_1,\quad r_2-2r_3\to r_2\\
&\to\begin{pmatrix}2&1&6\\0&2&4\end{pmatrix}&&\text{remove }c_1,r_3\\
&\to\begin{pmatrix}2&1&6\\-4&0&-8\end{pmatrix}&&r_2-2r_1\to r_2\\
&\to\begin{pmatrix}-4&-8\end{pmatrix}&&\text{remove }c_2,r_1\\
&\to\begin{pmatrix}-4&0\end{pmatrix}&&-2c_1+c_2\to c_2\\
&\to\begin{pmatrix}-4\end{pmatrix}&&\text{remove the zero column}\\
&\to\begin{pmatrix}4\end{pmatrix}&&-1\cdot c_1\to c_1
\end{aligned}$$
Since the image of [4]:Z→Z is 4Z, the big 3×4 matrix above actually just presents the module Z/4Z! Thus M=Z/4Z.
We’ve seen how it’s done for a specific finitely generated R-module, and this approach generalizes for all finitely generated R-modules. Thus we can say that every finitely generated R-module M can be presented by an R-matrix. But how do you tell when M is finitely generated in the first place?
In this section, we learn the conditions under which an R-module is finitely generated.
To see whether an R-module M is finitely generated (and thus presentable), let’s try building one.
Start with the zero submodule W0={0} of M. At every step, add an element vi∈M outside of Wi−1 to get Wi, generated by vi and the old Wi−1.
The fact of the matter is, if M is finitely generated, then eventually there are no more elements v∈M outside of Wi to add, so the process stops. Otherwise, the process keeps going. One way to characterize this is known as the ascending chain condition (ACC) on submodules: “There is no infinite strictly increasing chain W1<W2<… of submodules of M.” Satisfying the ACC is the same as saying that the process stops.
Theorem: Every submodule W≤M is finitely generated iff it satisfies the ACC.- (→) Towards contradiction, say you have an infinite strictly increasing chain W1<W2<… of submodules of M. Let U be the union of these submodules, so U is a submodule of M too, and by assumption U is finitely generated. Each of its finitely many generators lies in some Wi, so all of them lie in a single WN (take N to be the largest such index). Then U⊆WN, which forces WN=WN+1=…, so the chain is not strictly increasing, contradiction.
- (←) Let W≤M be an arbitrary submodule. Inductively construct a set of generators wi: start with W0={0} generated by the empty set, and while Wi≠W, pick some v∈W not in Wi and let Wi+1 be generated by the existing generators together with v. This builds a strictly increasing chain of submodules, and since there is no infinite strictly increasing chain of submodules by the ACC, the process ends after finitely many steps with Wi=W, so W is generated by the finitely many wi.
Recall that integral domains that satisfy the ascending chain condition on principal ideals (ACCP) are exactly the factorization domains. We can say something similar here. A noetherian R-module is one where every submodule is finitely generated, i.e. satisfies the ACC (by the above proof). Every Noetherian R-module can be described by a presentation matrix.
In this section, we explore the properties of Noetherian rings.
More generally, a noetherian ring is one where every ideal is finitely generated, i.e. satisfies the ACC on ideals. Note that a ring is always finitely generated as a module over itself (by the element 1), yet it can still have ideals that are not finitely generated, so proving noetherianity requires proving facts about ideals, not the whole ring. We’re going to prove a number of theorems that let us work with such rings:
Theorem: Surjective homomorphisms preserve noetherianity. (If the domain is Noetherian, so is the codomain.)Because ideals of the codomain pull back to ideals of the domain (containing the kernel), and this pullback preserves strict inclusions, an infinite strictly increasing chain in the codomain would give one in the domain, violating the ACC.
Theorem: Quotients preserve noetherianity. (If R is Noetherian, so are its quotients R/I.)Because the canonical map π:R→R/I is surjective, and surjective homomorphisms preserve noetherianity.
Theorem: Finite direct sums preserve noetherianity. (If R,S are Noetherian, so is the direct sum R⊕S.)- Ideals in R⊕S are in the form IR⊕IS, where IR is an ideal in R and IS is an ideal in S. This is precisely because there is no way for elements of R to influence elements of S and vice versa.
- Because R,S are Noetherian, IR and IS are finitely generated, from which you can construct a finite set of generators for IR⊕IS.
Corollary: Free modules over Noetherian rings are Noetherian. (If R is a Noetherian ring, then Rn is a Noetherian R-module.)You can take the direct sum of n copies of R to get Rn. Since finite direct sum preserves noetherianity, when R is Noetherian, Rn is Noetherian.
Theorem: Every ideal of a Noetherian ring R is contained in a maximal ideal, except R itself.Given a proper ideal I, if I is not maximal then it is strictly contained in some proper ideal I1; if I1 is not maximal, it is strictly contained in some proper ideal I2; and so on. By the ACC this strictly increasing chain must stop, and it can only stop at a maximal ideal containing I.
Hilbert Basis Theorem: If R is Noetherian, so is R[x].- The goal is to prove that an arbitrary ideal I for R[x] is finitely generated, given the ACC for R.
- Lemma: The leading coefficients of degree n polynomials in any ideal I of R[x] form an ideal Cn of R.
If ci∈Cn is the leading coefficient of fi∈I, then ci+cj is the leading coefficient of fi+fj (or the sum of coefficients is zero), and rci (for arbitrary r∈R) is the leading coefficient of rfi (or is zero). fi+fj and rfi are both in I since it’s an ideal, and therefore ci+cj and rci are both in Cn (taking Cn to also contain 0), making it an ideal.
- We have Cn⊆Cn+1 because for every ci∈Cn and corresponding degree n polynomial fi, xfi exists (due to I being an ideal) as a degree n+1 polynomial with the same leading coefficient ci∈Cn+1.
- Thus the Cn form a chain of ideals in R, which stops strictly increasing at some point due to R being Noetherian and therefore satisfying the ACC. Let CN be the last strictly increasing ideal in this chain (the one where Cn=CN for all n≥N).
- Take J to be a finite set of polynomials fi∈I whose leading coefficients generate each Cn for n≤N. Each fi has degree at most N. This is a finite number of polynomials since each Cn is finitely generated, due to R being Noetherian. Clearly J⊆I, as the fi are taken from I.
- We can prove any polynomial f∈I can be expressed in terms of these fi, i.e. I⊆J.
- First, if deg f>N, then pick an R-linear combination g of degree-N elements of J whose leading coefficient equals that of f, which exists since Cdeg f=CN is generated by the leading coefficients of the degree-N elements of J. Then $f'=f-x^{\deg f-\deg g}g$ is a lower-degree polynomial in I due to I being an ideal. Repeat this process until you obtain a polynomial whose degree is ≤N, reducing it to the second case:
- Then if deg f≤N, the leading coefficient of f is in Cdeg f, and therefore you can generate f using the elements in J. To see this, subtract an appropriate R-linear combination of the degree-(deg f) elements of J to cancel the leading coefficient (their leading coefficients generate Cdeg f), and repeat the same process on the lower-degree remainder.
- Since I=J, and J is finitely generated, any arbitrary ideal I of R[x] is finitely generated, thus R[x] is Noetherian.
Corollary: If R is a Noetherian ring, so is anything in the form R[x]/(f) because quotients preserve noetherianity.
Theorem: The Noetherian Bézout domains are exactly the PIDs.- (→) Every ideal is finitely generated in a Noetherian ring, and every finitely generated ideal is principal in a Bézout domain. Thus every ideal is principal in a Noetherian Bézout domain.
- (←) In a PID, every ideal is principal making it trivially Bézout, and every ideal is generated by one element making it trivially Noetherian.
Structure Theorem for Modules over a PID: if M is a finitely generated R-module where R is a PID, then it can be uniquely decomposed as a direct sum of quotients of R. To be specific, M≅R/(f1)⊕…⊕R/(fn) where fi are elements of R that divide each other: f1∣…∣fn.- We can always find the R-presentation matrix A for any finitely generated Noetherian R-module M (so that M≅Rn/im A). In this case, M is given as finitely generated over a PID, and is Noetherian because PIDs are Noetherian.
- The Smith normal form is defined for any matrix defined over a Bézout domain. PIDs are also Bézout domains, so taking the Smith normal form of A gives A=PDQ−1, obtaining a diagonal matrix D of invariant factors fi.
- In particular, the resulting diagonal matrix $D=\begin{pmatrix}f_1&&&\\&f_2&&\\&&\ddots&\\&&&f_n\end{pmatrix}$ has the property that f1∣…∣fn, which is a property of the Smith normal form of any matrix.
- Then, since P and Q are invertible (i.e. isomorphisms), we have im A≅im PDQ−1≅im D
- The image of a matrix is the span of its columns. For the diagonal matrix D, the i-th column is fi times the i-th standard basis vector, so the image is the direct sum ⨁ifiR, with one submodule fiR for each coordinate of Rn.
- Then we have M≅Rn/im A≅Rn/(f1R⊕f2R⊕…⊕fnR) which by the Third Isomorphism Theorem is isomorphic to M≅R/(f1)⊕R/(f2)⊕…⊕R/(fn)
- For the proof of uniqueness, any other decomposition into some R/(g1)⊕R/(g2)⊕…⊕R/(gn) would imply that the gi are invariant factors of A appearing on the diagonal of its Smith normal form, which must be the same invariant factors as fi since the Smith normal form is unique.
This theorem is a generalization of the structure theorem for finitely generated abelian groups, which is the specific case where R=Z, using the fact that the Z-modules are isomorphic to the abelian groups.
In this section, we examine the components of finitely generated R-modules.
In the above theorem, we found that any finitely generated R-module (over a PID R) can be expressed as a direct sum of components in the form R/(f). R-modules in the form R/(f) are called cyclic R-modules.
In general, a cyclic R-module M is one in which every element of the module can be expressed as scalar multiples of a single element m∈M. Because of this, cyclic R-modules can be written as Rm. If m∈R, then Rm is isomorphic to the principal ideal (m).
Theorem: R/(f) is a cyclic R-module.Every element of R/(f) is a coset of the form [r]=r+(f), and [r]=r⋅[1]. So the coset [1] generates all elements of R/(f), thus R/(f) is a cyclic R-module.
So finitely generated R-modules are composed of a direct sum of cyclic R-submodules R/(f).
Recall that quotienting a ring/module essentially sends all quotiented elements to zero. In other words, R/(f) means “R, where f=0”. Any relation can be expressed this way – if you want to apply the relation a=b, then you quotient by a−b to send it to zero.
Cyclic R-modules Rm make this even simpler. Since every possible element can be expressed in the form am for some element a, every relation on Rm can be expressed in the form am=0 instead of a−b=0.
In fact, the elements a that make am=0 true are given a special name: the annihilator of m.
Theorem: Given a cyclic R-module Rm, the elements a∈R such that am=0 form an ideal of R.- It is enough to prove that these elements a are closed under subtraction and by multiplication with elements r∈R.
- If am=0 and bm=0, then (a−b)m=am−bm=0−0=0, thus elements a are closed under subtraction.
- If am=0, then for arbitrary r∈R we have (ra)m=r(am)=r0=0.
- Thus these elements a∈R where am=0 form an ideal of R.
So this ideal of R, containing all elements a∈R where am=0, is called the annihilator of m∈Rm, and is written AnnR(m). The idea is that the annihilator consists of all elements that send the generator m (and therefore all elements) to zero. In fact:
Theorem: Every cyclic module Rm is isomorphic to R/AnnR(m).- Let σ:R→Rm be the map r↦rm.
- By definition of cyclic module, the image of σ is all of Rm, thus σ is surjective.
- The kernel of σ consists of elements a where am=0, which is exactly AnnR(m).
- Then by the First Isomorphism Theorem we have R/ker σ≅im σ, which is R/AnnR(m)≅Rm.
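A small sanity check of my own in the Z-module Z/6Z:
$$\operatorname{Ann}_{\mathbb{Z}}([1])=6\mathbb{Z},\quad\mathbb{Z}[1]=\mathbb{Z}/6\mathbb{Z}\cong\mathbb{Z}/6\mathbb{Z};\qquad\operatorname{Ann}_{\mathbb{Z}}([2])=3\mathbb{Z},\quad\mathbb{Z}[2]=\{[0],[2],[4]\}\cong\mathbb{Z}/3\mathbb{Z}$$
In each case the cyclic module generated by m is isomorphic to Z modulo the annihilator of m, as the theorem says.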
Thus we have that every quotient by a principal ideal R/(f) is cyclic (with m=[1]), and every cyclic module Rm is isomorphic to the quotient R/AnnR(m). If R is a PID, then AnnR(m) is a principal ideal as well, and we get an isomorphism between quotients by principal ideals R/(f) and cyclic modules Rm. The important part is: R/(f) is either the free module R (when f=0) or a torsion module R/(f) (when f≠0).
Recall the structure theorem of modules over a PID. Now that we know more about cyclic modules, we know that any module M defined over a PID R is isomorphic to a direct sum of free submodules R/(0)≅R, and torsion cyclic submodules R/(fi), which are ordered by divisibility: f1∣…∣fr for some order of the fi.
$$M\cong\underbrace{R^n}_{\text{free}}\oplus\underbrace{R/(f_1)\oplus R/(f_2)\oplus\cdots}_{\text{torsion}}$$
We can go a bit further:
Theorem: Over a PID R, every cyclic module Rm is a direct sum of cyclic modules, each of which is either a free cyclic module R/(0)≅R, or a torsion cyclic module R/(pk) where p is irreducible in R and k>0.- Since R is a PID, AnnR(m) (being an ideal) is a principal ideal (a).
- Recall that Rm≅R/AnnR(m).
- If a=0 then Rm≅R/(0)≅R, meaning Rm is a free cyclic module.
- Otherwise, if a is nonzero, note that R is a PID and therefore a UFD. Then every nonzero a has a unique factorization into primes p1k1p2k2…pnkn.
- So we have Rm≅R/AnnR(m)≅R/(p1k1p2k2…pnkn), which factors by the Chinese Remainder Theorem into R/(p1k1)⊕R/(p2k2)⊕…⊕R/(pnkn).
- Therefore Rm is either a free cyclic module, or factors into torsion cyclic modules where the annihilator is a prime power pk.
Thus another way to decompose a finitely generated module over a PID is $$M\cong\underbrace{R^n}_{\text{free}}\oplus\underbrace{R/(p_1^{k_1})\oplus R/(p_2^{k_2})\oplus\cdots}_{\text{torsion}}$$ where these prime power factors $p_i^{k_i}$ are known as elementary divisors.
Thus there are two ways to decompose a module according to the structure theorem:
- A hierarchical decomposition: $$M\cong\underbrace{R^n}_{\text{free}}\oplus\underbrace{R/(f_1)\oplus R/(f_2)\oplus\cdots}_{\text{torsion}}$$ where the invariant factors fi are ordered by divisibility f1∣…∣fr, and are obtained by the Smith normal form.
- Into irreducibles: $$M\cong\underbrace{R^n}_{\text{free}}\oplus\underbrace{R/(p_1^{k_1})\oplus R/(p_2^{k_2})\oplus\cdots}_{\text{torsion}}$$ where the elementary divisors $p_i^{k_i}$ are powers of irreducibles pi in the ring R, obtained by factoring each R/(fi)=R/(p1k1p2k2…pnkn) via the Chinese Remainder Theorem.
The product of invariant factors must be equal to the product of the elementary divisors, since these describe the same module.
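For example, over R=Z (again a quick example of my own), the same module can be written both ways:
$$M\cong\mathbb{Z}/2\oplus\mathbb{Z}/12\quad(\text{invariant factors }2\mid 12)\qquad\cong\qquad\mathbb{Z}/2\oplus\mathbb{Z}/4\oplus\mathbb{Z}/3\quad(\text{elementary divisors }2,\ 2^2,\ 3)$$
since Z/12≅Z/4⊕Z/3 by the Chinese Remainder Theorem, and indeed 2·12=24=2·4·3.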
In this exploration, we discovered how R-matrices can be used to fully describe all the homomorphisms between free R-modules. As a bonus, we also learned how presentation matrices can be used to fully describe any R-module, and further, broke down the structure of finitely generated modules over a PID.
Next time we’ll be applying this structure theorem to characterize completely all the homomorphisms between non-free modules.
-
January 2, 2024.
Exploration 2: Module actions
Questions:
- How can we represent algebraic objects as actions?
Recall that R-matrices are homomorphisms between free modules Rn. How do we visualize homomorphisms between non-free modules M?
Let’s first focus on endomorphisms. Every endomorphism T:M→M must satisfy the homomorphism laws for R-modules M:
- T(m+n)=T(m)+T(n) for all m,n∈M
- T(km)=kT(m) for all k∈R,m∈M
In other words, T is R-linear.
R-linearity allows us to define module actions. Since T is R-linear, it respects addition and scalar multiplication, and can therefore be described as acting on M. We can express this by extending the R-module M to an R[t]-module: an R-module together with an action of the indeterminate t. When the action of t is ordinary scalar multiplication, an R[t]-module is just a module over the polynomial ring R[t].
Instead of scalar multiplication by t, define t to be an application of T, and define the action of constant polynomials r∈R to be scalar multiplication by r. Thus each endomorphism T:M→M is associated with an R[t]-module M′ whose action is to apply some linear combination of T, represented by a polynomial in R[t].
By this construction, every endomorphism is associated with a unique R[t]-module, and therefore studying these R[t]-modules is enough to classify the endomorphisms.
In this section, we explore properties of R[t]-modules when R[t] is a PID.
When R[t] is a PID, it’s equivalent to saying that R is a field F. So we’ll refer to R[t] as F[t] in this section.
Importantly, it turns out these F[t]-modules are torsion whenever M is finite-dimensional over F, i.e. they have a nontrivial annihilator AnnF[t](M).
Theorem: Given an F[t]-module M that is finite-dimensional over F, where t acts as an endomorphism M→M, there is some polynomial f∈F[t] whose corresponding endomorphism ∈End(M) is the zero map. In other words, AnnF[t](M) is nontrivial.- Since t acts as T∈End(M), we can construct a ring homomorphism φ:F[t]→End(M) that maps linear combinations of t (∑iaiti) to their corresponding endomorphism (∑iaiTi).
- The kernel of φ represents all polynomials ∈F[t] that map to the zero map in End(M). Thus AnnF[t](M)=ker φ, and it is enough to show that φ has a non-trivial kernel.
- Since M is finite-dimensional over F, End(M) is also a finite-dimensional F-vector space, while F[t] is infinite-dimensional over F (spanned by {1,t,t2,…}). An F-linear map φ from an infinite-dimensional space to a finite-dimensional one cannot be injective.
- Since injectivity corresponds with having a trivial kernel, φ has a non-trivial kernel.
Corollary: Every F[t]-module M that is finite-dimensional over F (where t acts as an endomorphism M→M) is torsion.By definition, a torsion module is one with a non-trivial annihilator, which we proved true above.
Note that a related decomposition can be obtained via the Chinese remainder theorem, which says F[t]/(f⋅g)≅F[t]/(f)⊕F[t]/(g) when f,g are coprime. Observe that two distinct linear factors t−α and t−β are always coprime in F[t], so whenever f splits into linear factors, F[t]/(f) decomposes into a direct sum of factors F[t]/((t−α)^k), one for each distinct root α of f.
Theorem: Torsion R-modules M have no nontrivial free submodules.If M is torsion, then there is a nonzero a∈R such that am=0 for all m∈M, and the same a annihilates every submodule of M. But a nontrivial free submodule would contain a basis element e, and ae=0 forces a=0 (coordinates with respect to a basis are unique), contradicting a≠0.
Because we’re working over a PID, we can apply the structure theorem to decompose such a F[t]-module M into a direct sum of free submodules ≅F[t] and cyclic torsion submodules ≅F[t]/(fk) (for an irreducible polynomial f).
$$M\cong\underbrace{(F[t])^n}_{\text{free}}\oplus\underbrace{F[t]/(f_1^{k_1})\oplus F[t]/(f_2^{k_2})\oplus\cdots\oplus F[t]/(f_n^{k_n})}_{\text{torsion}}$$
But M, being torsion, has no nontrivial free submodules, so there is no free part:
$$M\cong F[t]/(f_1^{k_1})\oplus F[t]/(f_2^{k_2})\oplus\cdots\oplus F[t]/(f_n^{k_n})$$
In summary, we’ve reduced the problem of classifying endomorphisms to studying the torsion cyclic F[t]-submodules F[t]/(fk) of their corresponding F[t]-module.
In this section, we derive the notion of eigenvalues in module theory.
Given a F-vector space V, consider the endomorphism T:V→V. As before, define a F[t]-module VT as V where the action of t is to apply T. Since VT is a torsion module over a PID, apply the structure theorem to break it into a direct sum of torsion cyclic modules ⨁iF[t]/(fi) where f1∣f2∣…∣fn. Note that each fi generates the annihilator for their corresponding cyclic module F[t]/(fi). In this context where the module is defined over a polynomial ring F[t], we say that a polynomial is the minimal polynomial for a module M if it is the lowest-degree monic polynomial that annihilates M.
Theorem: f is the minimal polynomial for a cyclic module F[t]/(f).Every element in the annihilator of F[t]/(f) annihilates F[t]/(f) by definition. Since f is the generator of the annihilator of F[t]/(f), it must divide every element of this annihilator, and therefore is the least-degree element in the annihilator. Thus f is the lowest degree polynomial that annihilates F[t]/(f).
Theorem: Every F[t]-module M (where t acts as an endomorphism M→M) has a minimal polynomial f∈F[t] that generates the annihilator of M.We know that the annihilator of M is the kernel of the homomorphism φ:F[t]→End(M). Since the kernel is an ideal of F[t], a PID, the kernel must be generated by a single element f∈F[t]. Since F is a field, we can choose f to be monic, since if it’s not monic we can multiply f by the inverse of its leading coefficient to get a monic f. This makes f the lowest degree monic polynomial in the annihilator of M.
For an R-module M, an eigenvalue λ∈R of an R-module endomorphism T:M→M is a scalar for which there exists a nonzero element m∈M such that Tm=λm.
An endomorphism T:M→M is nilpotent if some power of T is equal to the zero map.
Lemma: An R-module endomorphism T:M→M has an eigenvalue λ if T−λI is nilpotent.- If T−λI is nilpotent, then (T−λI)k=0 for some k≥1.
- If k=1, then we have T−λI=0 implying T=λI, which directly shows that λ is an eigenvalue.
- Otherwise if k≥2, take k to be the smallest power with (T−λI)k=0, so that (T−λI)k−1≠0. Then there is some element m∈M such that n=(T−λI)k−1m≠0. But then we have (T−λI)n=(T−λI)[(T−λI)k−1m]=(T−λI)km=0. Since there exists a nonzero element n∈M where (T−λI)n=0, this proves Tn=λn, thus λ is an eigenvalue of T.
A counterexample for the converse is the identity I, which has eigenvalue λ=1 but is not nilpotent.
Lemma: The eigenvalues of an endomorphism of a direct sum T:M⊕N→M⊕N are exactly the eigenvalues of TM and TN combined (where TM denotes T restricted to M.)It is enough to show that the eigenvalues of TM are eigenvalues of T, since the N case has an identical proof. If TM has an eigenvalue λ, then TMm=λm for some m∈M. Thus in M⊕N, we have T(m,0)=(TMm,TN0)=(λm,0)=λ(m,0) proving that the eigenvalue λ of TM is an eigenvalue of T on M⊕N.
Theorem: For a torsion component F[t]/(f) of a F[t]-module M, the roots λ of its minimal polynomial f correspond to eigenvalues of T (the action of t).- If λ is a root of f, then by the factor theorem, (t−λ) is an irreducible factor of f. Note that in a UFD like F[t], irreducibles and primes are the same thing.
- Since the annihilator is generated by a prime power, and we know that the prime is (t−λ), the annihilator is exactly (t−λ)k for some k≥1. So we’re working with F[t]/(t−λ)k.
- Recall that polynomials in F[t] act as endomorphisms on M. The endomorphism corresponding to this polynomial (t−λ)k should look like (T−λI)k, since we defined the action of F[t] to treat constant polynomials like λ as scalar multiplication.
- Since (t−λ)k annihilates this component, the corresponding endomorphism satisfies (T−λI)k=0 on it — that is, T−λI is nilpotent on the component F[t]/(t−λ)k, implying (by the lemma above) that λ is an eigenvalue of T restricted to that component.
- But M is a direct sum that includes F[t]/(t−λ)k as a component, so by the direct sum lemma above, λ is also an eigenvalue of T on all of M.
Theorem: The minimal polynomial for a F[t]-module M is exactly the largest invariant factor fn of M.If the decomposition is M≅⨁iF[t]/(fi), then the minimal polynomial of M must annihilate every component, so every fi (which generates the annihilator of its component) must divide it; hence the minimal polynomial is a multiple of the LCM of all fi. Since the decomposition has f1∣f2∣…∣fn, that LCM is fn, and fn does annihilate every component (each fi divides fn), making fn the minimal polynomial of M.
Since the roots of f for each F[t]/(f) correspond to the eigenvalues of T:M→M (the action of t), it is useful to encapsulate all of the eigenvalues (including repeats) as the characteristic polynomial χT∈F[t]: a polynomial with a root λ every time λ appears as an eigenvalue of T. Therefore:
Theorem: The characteristic polynomial for a F[t]-module M is exactly the product of all invariant factors fi of M.This follows directly from knowing that the roots of each fi are eigenvalues of M. Then the product of each fi captures all eigenvalues of M.
Corollary: The characteristic polynomial of an endomorphism T:M→M for a finitely generated F[t]-module M is the product of the minimal polynomials fi of its cyclic components.
There is another way to compute this characteristic polynomial:
Theorem: Over a finitely generated F[t]-module M, the characteristic polynomial of an endomorphism T:M→M is exactly det(T−tI).- Since M is finitely generated, we can construct its presentation matrix A so that M≅F[t]n/im A. The fact that M is equivalent to a quotient of a free module F[t]n means all endomorphisms of M (e.g. T) can be expressed as a F[t]-matrix modulo the quotient im A.
- Now consider the equation Tm=tm, where we treat t∈F[t] as a scalar. If this is true for some nonzero m∈M, then t is an eigenvalue of T by definition.
- Rearranging the above to (T−tI)m=0 reduces the problem of finding eigenvalues of T to finding values of t such that T−tI zeroes some nonzero m∈M.
- Recall that we can define a Smith normal form PDQ−1 for every matrix over a Bézout domain like F[t], such as for T−tI. This results in a diagonal matrix $D=\begin{pmatrix}f_1&&&\\&f_2&&\\&&\ddots&\\&&&f_n\end{pmatrix}$ where fi are the invariant factors of T−tI (which may differ from the invariant factors of M.)
- Thus our equation becomes (PDQ−1)m=0, which we can simplify to D(Q−1m)=0. Since Q is invertible, Q−1m is also an arbitrary nonzero element of M, so WLOG we may write Dm=0. Because D is diagonal, we may rewrite this as a system of equations: $$f_1m_1=0,\quad f_2m_2=0,\quad\dots,\quad f_nm_n=0$$ where the mi are the components of m. Note that a nonzero m only requires at least one of the mi to be nonzero. That means we can assume mi=0 for all but one of the equations.
- When mi is nonzero, then for the equation fimi=0 to be true, fi must be equal to zero. This is because F[t], being an integral domain, has no zero divisors. In summary, if we can find values λ for t that make any fi zero (i.e. the roots of fi), then Dm=0 is true for some nonzero m∈M, so (T−λI)m=0 is true. So the roots of fi are eigenvalues of T.
- This means one can obtain every eigenvalue of T by finding the roots of the product ∏ifi. So the characteristic polynomial χT is ∏ifi, which is exactly the determinant of T−tI.
The companion matrix of a polynomial f∈F[t] is a matrix constructed so that its characteristic polynomial is exactly f. Given a polynomial $t^n+a_{n-1}t^{n-1}+\dots+a_1t+a_0$, its companion matrix is
$$t^n+a_{n-1}t^{n-1}+\dots+a_1t+a_0\iff\begin{pmatrix}0&0&\cdots&0&-a_0\\1&0&\cdots&0&-a_1\\0&1&\cdots&0&-a_2\\\vdots&\vdots&\ddots&\vdots&\vdots\\0&0&\cdots&1&-a_{n-1}\end{pmatrix}$$
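For instance (a quick example of my own, using the convention above), for f=t²+1 we have a₀=1, a₁=0, so
$$t^2+1\iff\begin{pmatrix}0&-1\\1&0\end{pmatrix},\qquad\det\left(\begin{pmatrix}0&-1\\1&0\end{pmatrix}-tI\right)=t^2+1$$
recovering f as the characteristic polynomial.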
Corollary: All square F-matrices M (of size ≥1) have at least one eigenvalue iff F is an algebraically closed field.- For the forward direction: since the companion matrix of every polynomial ∈F[t] (of degree ≥1) has an eigenvalue, every polynomial ∈F[t] (of degree ≥1) has a root in F, therefore F is algebraically closed.
- The backward direction is because in an algebraically closed field, all degree ≥1 polynomials in F[t] have a root in F, including det(M−tI), whose roots correspond to eigenvalues of M.
Theorem: Given a F[t]-module M where t acts as an endomorphism M→M, M is cyclic iff the minimal and characteristic polynomials coincide.- Let M≅⨁iF[t]/(fi) by the structure theorem. Since the minimal polynomial is exactly fn and the characteristic polynomial is exactly ∏ifi, they can only coincide when there is only one invariant factor. But that indicates that the decomposition includes only one cyclic submodule F[t]/(fi), meaning the original module M must be cyclic as well.
- The proof in the other direction is trivial: if M is cyclic M≅F[t]/(f), then both fn and ∏ifi are equal to f.
If a matrix T has eigenvalue λ, then the kernel of T−λI is its corresponding eigenspace. If the image of T is equal to one of its eigenspaces, the map is a multiplication by a scalar λ.
In this section, we describe the module action of polynomial rings.
We can express such a linear transformation as the action of a polynomial ring F[t] on an F-module.
Let's explore F[t]-modules, where F[t] is the polynomial ring over a field F.
For instance, take the F[t]-module F[t]/(t−2)3. Since it's quotiented by a degree 3 polynomial (t−2)3=t3−6t2+12t−8, the obvious basis is {1,t,t2}. We use the fact that 0=t3−6t2+12t−8 in the quotient to get the constraint T(t2)=t3=6t2−12t+8. This means the F[t]-matrix T (with respect to the basis {1,t,t2}) is one where T(t2)=6t2−12t+8, i.e. we have
$$T\begin{bmatrix}1\\0\\0\end{bmatrix}=\begin{bmatrix}0\\1\\0\end{bmatrix},\quad T\begin{bmatrix}0\\1\\0\end{bmatrix}=\begin{bmatrix}0\\0\\1\end{bmatrix},\quad T\begin{bmatrix}0\\0\\1\end{bmatrix}=\begin{bmatrix}8\\-12\\6\end{bmatrix}$$
which corresponds to the presentation matrix
$$\begin{bmatrix}0&0&8\\1&0&-12\\0&1&6\end{bmatrix}$$
So we have found T for the F[t]-module F[t]/(t−2)3.
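As a sanity check (a sketch assuming sympy), this T has characteristic polynomial (t−2)3 and satisfies (T−2I)3=0, as it should, since (t−2)3 is zero in the quotient.

```python
from sympy import Matrix, symbols, eye, factor

t = symbols('t')

# Matrix of multiplication by t on F[t]/(t-2)^3 in the basis {1, t, t^2}.
T = Matrix([[0, 0, 8],
            [1, 0, -12],
            [0, 1, 6]])

print(factor(T.charpoly(t).as_expr()))   # (t - 2)**3
print((T - 2 * eye(3))**3)               # the zero matrix: (t - 2)^3 acts as zero
```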
From the structure theorem, we know that direct sums of F[t]-modules describe all F[t]-modules. Let’s do this again where M is a direct sum.
Take the F[t]-module M≅F[t]/(t−2)⊕F[t]/(t−2)2. The obvious basis is {(1,0),(0,1),(0,t)}. We do the same thing as before, realizing 0=t−2 on the left and 0=(t−2)2=t2−4t+4 on the right, and obtaining the constraints T(1)=t=2 on the left and T(t)=t2=4t−4 on the right. We have:
$$T\begin{bmatrix}1\end{bmatrix}=\begin{bmatrix}2\end{bmatrix},\quad T\begin{bmatrix}1\\0\end{bmatrix}=\begin{bmatrix}0\\1\end{bmatrix},\quad T\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}-4\\4\end{bmatrix}$$
corresponding to the presentation matrix
$$\begin{bmatrix}2&&\\&0&-4\\&1&4\end{bmatrix}=[2]\oplus\begin{bmatrix}0&-4\\1&4\end{bmatrix}$$
where blank entries are zeroes.
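A quick check (a sketch assuming sympy) ties this example back to the earlier theorems: the characteristic polynomial of this block matrix is the product of the invariant factors (t−2)(t−2)2=(t−2)3, while the minimal polynomial is the largest one, (t−2)2.

```python
from sympy import Matrix, symbols, eye, zeros, factor

t = symbols('t')

# Block matrix for F[t]/(t-2) ⊕ F[t]/(t-2)^2 in the basis {(1,0), (0,1), (0,t)}.
T = Matrix([[2, 0,  0],
            [0, 0, -4],
            [0, 1,  4]])

print(factor(T.charpoly(t).as_expr()))   # (t - 2)**3, the product of the invariant factors
print((T - 2 * eye(3))**2)               # zero matrix, so the minimal polynomial is (t - 2)**2
print(T - 2 * eye(3) == zeros(3, 3))     # False: (t - 2) alone does not annihilate M
```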
So we've described how to find a presentation matrix for any F[t]-module. We could reduce this matrix using the operations allowed on presentation matrices, but in the case of F[t]-modules there is a standard form for the presentation matrix.
The Chinese remainder theorem says F[t]/(f⋅g)≅F[t]/(f)⊕F[t]/(g) when f,g are coprime. Consider F[t]/(f). Since powers of distinct linear factors are coprime, whenever f splits into linear factors we can split F[t]/(f) into a direct sum of quotients by powers of linear factors, one for each repeated root. For instance, F[t]/((t−α)2(t−β)3)≅F[t]/(t−α)2⊕F[t]/(t−β)3.
For an easy example, consider the F[t]-module M≅F[t]/(t−α)4. To find T, first find some basis for M (as a vector space). A clever basis is {(t−α)3,(t−α)2,(t−α),1}. This conveniently gives us the constraints
$$\begin{aligned} T((t-\alpha)^3) &= t(t-\alpha)^3 = (t-\alpha)^4 + \alpha(t-\alpha)^3 = \alpha(t-\alpha)^3 \\ T((t-\alpha)^2) &= t(t-\alpha)^2 = (t-\alpha)^3 + \alpha(t-\alpha)^2 \\ T(t-\alpha) &= t(t-\alpha) = (t-\alpha)^2 + \alpha(t-\alpha) \\ T(1) &= t(1) = (t-\alpha) + \alpha(1) \end{aligned}$$
using the fact that (t−α)4=0 in the quotient. Converting this into terms of basis vectors, this is
$$T\begin{bmatrix}1\\0\\0\\0\end{bmatrix}=\begin{bmatrix}\alpha\\0\\0\\0\end{bmatrix},\quad T\begin{bmatrix}0\\1\\0\\0\end{bmatrix}=\begin{bmatrix}1\\\alpha\\0\\0\end{bmatrix},\quad T\begin{bmatrix}0\\0\\1\\0\end{bmatrix}=\begin{bmatrix}0\\1\\\alpha\\0\end{bmatrix},\quad T\begin{bmatrix}0\\0\\0\\1\end{bmatrix}=\begin{bmatrix}0\\0\\1\\\alpha\end{bmatrix}$$
which corresponds to the presentation matrix
$$\begin{bmatrix}\alpha&1&&\\&\alpha&1&\\&&\alpha&1\\&&&\alpha\end{bmatrix}$$
Such a matrix is known as a Jordan matrix (or Jordan block) with eigenvalue α of multiplicity 4. It has α along the diagonal and ones on the superdiagonal, and this is precisely because we chose the basis in such a way that the coefficients in each constraint are 1 and α. Next, here's a direct sum example. Consider the F[t]-module M≅F[t]/(t−α)4⊕F[t]/(t−α)3⊕F[t]/(t−β)3. We'll end up with the following presentation matrix, which is composed of blocks like the one in the previous example:
$$\begin{bmatrix}\alpha&1&&\\&\alpha&1&\\&&\alpha&1\\&&&\alpha\end{bmatrix}\oplus\begin{bmatrix}\alpha&1&\\&\alpha&1\\&&\alpha\end{bmatrix}\oplus\begin{bmatrix}\beta&1&\\&\beta&1\\&&\beta\end{bmatrix}$$
This is a direct sum of Jordan matrices! We call this the Jordan normal form of the presentation matrix. It exists whenever the invariant factors split into linear factors (for instance, over an algebraically closed field), and it is the standard form for a presentation matrix of such modules.
In general, given an R-module over a PID R, you may:
- Express the R-module as a direct sum of cyclic modules R/(fi) where fi is the minimal polynomial of that cyclic component.
- Split each R/(fi) into a direct sum of quotients by powers of linear factors (like R/(t−α)2⊕R/(t−β)3) by factoring fi (assuming each fi splits into linear factors), which gives you the eigenvalues and their multiplicities.
- Construct a direct sum of presentation matrices, which is in Jordan normal form.
All this to find the F[t]-matrix T that defines scalar multiplication by t in an F[t]-module.
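Computer algebra systems can carry out these steps directly. Here is a sketch assuming sympy, whose jordan_form method returns a change-of-basis matrix P and the Jordan matrix J with A = PJP⁻¹ (using the block matrix from the direct sum example above).

```python
from sympy import Matrix

# The block matrix for F[t]/(t-2) ⊕ F[t]/(t-2)^2 from above.
A = Matrix([[2, 0,  0],
            [0, 0, -4],
            [0, 1,  4]])

P, J = A.jordan_form()
print(J)                      # a 1x1 Jordan block [2] and a 2x2 Jordan block [[2, 1], [0, 2]]
print(P * J * P.inv() == A)   # True: J is A written in a Jordan basis
```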
-
January 6, 2024.
Exploration 3: Representation theory
Questions:
- TODO
Last exploration, we studied the properties of individual R-matrices. This time we’re studying R-matrices collectively.
First, we define HomR(M,N) as the set of all R-module homomorphisms M→N. When M,N are free R-modules, the elements of HomR(M,N) are R-matrices. When M=N, composition makes HomR(M,M) into a ring, and generally a noncommutative one, since matrix multiplication is not commutative in general.
Let’s start by representing groups using subsets of Hom.
For instance, we can express the group C× (nonzero complex numbers under multiplication) as 2×2 real matrices. We define a map ρ:C×→HomR(R2,R2):
$$\rho(1)=\begin{bmatrix}1&0\\0&1\end{bmatrix},\qquad \rho(i)=\begin{bmatrix}0&-1\\1&0\end{bmatrix},\qquad \rho(a+bi)=\begin{bmatrix}a&-b\\b&a\end{bmatrix}$$
so that
$$\rho((a+bi)(c+di))=\rho(a+bi)\,\rho(c+di)=\begin{bmatrix}a&-b\\b&a\end{bmatrix}\begin{bmatrix}c&-d\\d&c\end{bmatrix}=\begin{bmatrix}ac-bd&-(ad+bc)\\ad+bc&ac-bd\end{bmatrix}=\rho((ac-bd)+(ad+bc)i)$$
and
$$\rho((a+bi)^{-1})=(\rho(a+bi))^{-1}=\begin{bmatrix}a&-b\\b&a\end{bmatrix}^{-1}=\frac{\operatorname{adj}\begin{bmatrix}a&-b\\b&a\end{bmatrix}}{\det\begin{bmatrix}a&-b\\b&a\end{bmatrix}}=\frac{1}{a^2+b^2}\begin{bmatrix}a&b\\-b&a\end{bmatrix}=\rho\!\left(\frac{a-bi}{a^2+b^2}\right)$$
Since ρ preserves products and inverses, these matrices (R-module homomorphisms) form a group isomorphic to C×. This is just one of many encodings of complex numbers into matrices. The key is that ρ(i)2=−I, and that ρ(1),ρ(i) don't otherwise interact with each other.
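For a quick symbolic check of these identities (a sketch assuming sympy), one can verify the multiplicativity of ρ and that ρ(i)2=−I:

```python
from sympy import Matrix, symbols

a, b, c, d = symbols('a b c d', real=True)

def rho(x, y):
    """Matrix representing the complex number x + y*i."""
    return Matrix([[x, -y],
                   [y,  x]])

# Multiplicativity: rho(a+bi) * rho(c+di) equals rho((ac-bd) + (ad+bc)i).
lhs = rho(a, b) * rho(c, d)
rhs = rho(a*c - b*d, a*d + b*c)
print((lhs - rhs).expand())   # zero matrix

# rho(i)^2 = -I, mirroring i^2 = -1.
print(rho(0, 1)**2)           # Matrix([[-1, 0], [0, -1]])
```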
Here we found a suitable subgroup inside HomR(R2,R2). In general, to represent a group, we must select invertible R-module homomorphisms, so that every group element gets an invertible image. In practice, M and N are free modules Rm and Rn, and invertibility implies m=n. Thus, when representing a group, we can choose among the invertible matrices in HomR(Rn,Rn), also written as Aut(Rn), the group of automorphisms on Rn, whose elements we express as invertible n×n R-matrices.
So we assign to each group element an invertible n×n R-matrix. A matrix R-representation of a group is a group homomorphism G→Aut(Rn).
Theorem: There is a one-to-one correspondence between actions of a finite group G on Rn by R-module automorphisms and matrix R-representations of G.- Every representation G→Aut(Rn) assigns each g∈G a specific invertible matrix in Aut(Rn).
- But this implicitly defines an action of G on Rn. The action is given by matrix multiplication in Rn, where each g acts by multiplying by its corresponding matrix.
- Conversely, by Cayley's theorem every group is isomorphic to a permutation group, and therefore can act on R∣G∣ by permuting the ∣G∣ basis R-vectors. Permuting basis vectors is an automorphism of R∣G∣, so this assigns each element g∈G an automorphism of R∣G∣, i.e. it constructs a matrix R-representation G→Aut(R∣G∣).
The representation G→Aut(R∣G∣) (mentioned in the above proof) is known as the regular representation of a finite group G. The main idea of this representation is to have G permute the indices of R∣G∣ according to the permutation group isomorphic to G.
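As a small illustration (a sketch assuming sympy, using C3 for concreteness), the regular representation sends each group element to the permutation matrix that shifts the basis vectors indexed by group elements:

```python
from sympy import zeros, eye

# Regular representation of C3 = {e, g, g^2}: each g^k permutes the basis
# vectors of R^3, which are indexed by the group elements themselves.
def rho(k):
    M = zeros(3, 3)
    for j in range(3):
        M[(j + k) % 3, j] = 1   # g^k sends the basis vector for g^j to the one for g^(j+k)
    return M

print(rho(1))                       # the 3-cycle permutation matrix
print(rho(1) * rho(2) == eye(3))    # True: g * g^2 = e
print(all(rho(j) * rho(k) == rho((j + k) % 3) for j in range(3) for k in range(3)))  # True: a homomorphism
```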
In this section, we show the module analog of group actions.
In the same manner that adjoining an indeterminate symbol x to R gives you a polynomial ring R[x], we can adjoin a whole group to R to get a group ring R[G]. Elements look like R-linear combinations of elements of G: ∑_{g∈G} r_g g
Theorem: Every group ring R[G] is a free module.Since every element in R[G] is by definition uniquely represented as a linear combination of elements g∈G, we know that G spans R[G]. Like in polynomial rings, the only linear combination equal to zero is the one where every coefficient is zero, so the elements of G are linearly independent and therefore form a basis of R[G]. Having a basis means the elements of R[G] can be represented by R-vectors of coefficients, and therefore the group ring R[G] is a free module by construction.
Thus a group ring is simultaneously a group and a free R-module.
Theorem: The group ring R[G] is exactly the regular representation of G over R. Since R[G] is a free R-module, it has a basis whose elements are precisely the elements of the group G. Any element g∈G permutes the elements of G by left-multiplication. Permuting the basis elements is an automorphism of the R-module R∣G∣, thus R[G] is exactly the regular representation G→Aut(R∣G∣).
Therefore: Every linear group action of G on an R-module M can be encoded by an appropriate group ring R[G].A representation G→Aut(M) assigns an automorphism of the R-module M to every element g∈G.
Since the automorphisms of R-modules are exactly the invertible linear transformations, every g∈G is assigned a linear transformation on M. In other words, the action of g on M must be linear in M.
But R[G] consists of all R-linear combinations of elements of G, and therefore contains every possible linear action of G on M.
This means that every action defined for G on some module M can be linearly extended to an action of R[G] on M. Specifically, if the group action g⋅m is defined for all g∈G,m∈M, then we define (∑_{g∈G} r_g g)⋅m as ∑_{g∈G} r_g(g⋅m). Such a module M is known as an R[G]-module, a module over a group ring.
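Here is a small sketch (assuming sympy, reusing the C3 permutation matrices from above; group_ring_action is a hypothetical helper) of linearly extending the group action to a group ring element like 2e+3g:

```python
from sympy import Matrix, zeros

def rho(k):
    """Permutation matrix for g^k in the regular representation of C3."""
    M = zeros(3, 3)
    for j in range(3):
        M[(j + k) % 3, j] = 1
    return M

def group_ring_action(coeffs, v):
    """(sum_k coeffs[k] * g^k) . v  =  sum_k coeffs[k] * (g^k . v)"""
    return sum((c * rho(k) * v for k, c in enumerate(coeffs)), zeros(3, 1))

v = Matrix([1, 0, 0])
# (2e + 3g + 0g^2) . v = 2*(e.v) + 3*(g.v)
print(group_ring_action([2, 3, 0], v))   # Matrix([[2], [3], [0]])
```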
Theorem: Every group ring R[G] is an R[G]-module.R[G] is a free R-module, and G has a natural group action on R[G] by left-multiplication in the ring. Thus it is a R[G]-module.
To preserve the R[G]-module structure, R[G]-submodules of R[G]-modules must be G-invariant: just as how R[G]-modules are closed under the action of R[G], R[G]-submodules W must be closed under the action of R[G] in the sense that for all g∈R[G],w∈W, gw∈W.
Similarly, homomorphisms σ:M→N between R[G]-modules M,N must be G-equivariant: they must commute with the action of R[G] in the sense that for all g∈R[G],m∈M, gσ(m)=σ(gm).
Theorem: Given finite G, we can always construct a G-equivariant version σ~:M→M of any endomorphism σ:M→M of the R[G]-module M. The trick to get G-equivariance is to take a sum over all actions of G: σ~(m) = ∑_{g∈G} g^{-1}σ(gm), which we can do because G is finite. Then we can show that for h∈G:
$$h\tilde{\sigma}(m) = h\sum_{g\in G} g^{-1}\sigma(gm) = \sum_{g\in G} hg^{-1}\sigma(gm) = \sum_{g\in G} h(gh)^{-1}\sigma((gh)m) = \sum_{g\in G} g^{-1}\sigma(g(hm)) = \tilde{\sigma}(hm)$$
where the middle step reindexes by g↦gh (which simply reorders the sum over G) and uses h(gh)^{-1}=g^{-1}. Thus σ~ is G-equivariant.
Theorem: If G is finite and ∣G∣ is a unit in R and M is a free R[G]-module, then every R[G]-submodule W of M implicitly defines a G-equivariant projection M→W.- Since M is free, it has a basis, and so does the submodule W. We can obtain a projection π:M→W by sending the basis vectors of W to themselves and the remaining basis vectors to elements of W (for instance, to 0).
- Recall that endomorphisms like projections can be made G-equivariant via an averaging trick. To ensure that the resulting endomorphism π~:M→M is still a projection, we need to ensure that it is idempotent and that its image is W. It is enough to construct π~ as a map that fixes elements w∈W (thus ensuring idempotence and that the image is at least W) and maps other elements in M to an element in W (thus ensuring that the image is at most W).
- Given that ∣G∣ is a unit in R, the map obtained by the averaging trick can be modified to ensure the above properties: π~(m) = (1/∣G∣) ∑_{g∈G} g^{-1}π(gm)
- This π~ fixes w∈W:
$$\tilde{\pi}(w) = \frac{1}{|G|}\sum_{g\in G} g^{-1}\pi(gw) = \frac{1}{|G|}\sum_{g\in G} g^{-1}\pi(w') = \frac{1}{|G|}\sum_{g\in G} g^{-1}w' = \frac{1}{|G|}\sum_{g\in G} w = \frac{1}{|G|}\,|G|\,w = w$$
where w′=gw lies in W since W is G-invariant, π(w′)=w′ since π is a projection onto W, and g^{-1}w′=g^{-1}gw=w. And π~ maps all m∈M to an element of W, since π(gm)∈W by definition of π, hence g^{-1}π(gm)∈W since W is G-invariant, hence (1/∣G∣)∑_{g∈G} g^{-1}π(gm)∈W by linearity, i.e. π~(m)∈W.
- Thus π~ as defined is a G-equivariant projection onto W.
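Here is a tiny worked instance of the averaging trick (a sketch assuming sympy): take G=C2 acting on Q2 by swapping coordinates, W the G-invariant line spanned by (1,1), and a projection onto W that is not G-equivariant. Averaging over G produces an idempotent, G-equivariant projection onto W.

```python
from sympy import Matrix, Rational, eye, zeros

# G = C2 acting on Q^2 by swapping coordinates; W = span{(1,1)} is G-invariant.
S = Matrix([[0, 1],
            [1, 0]])              # the swap
group = [eye(2), S]               # matrices for e and s

# A projection onto W that is NOT G-equivariant: (x, y) |-> x*(1,1).
P = Matrix([[1, 0],
            [1, 0]])

# Averaging trick: P_avg = (1/|G|) * sum over g of g^{-1} P g.
P_avg = Rational(1, len(group)) * sum((g.inv() * P * g for g in group), zeros(2, 2))
print(P_avg)                                        # Matrix([[1/2, 1/2], [1/2, 1/2]])
print(P_avg * P_avg == P_avg)                       # True: still idempotent (a projection)
print(all(g * P_avg == P_avg * g for g in group))   # True: now G-equivariant
```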
In this section, we consider representations as algebraic structures in their own right.
Note that for our representation of C× in the beginning, each element of G is assigned a distinct element of Aut(Rn), i.e. the representation is injective. When a representation is injective (i.e. the domain is isomorphic to its image), we call it a faithful representation: each element of G is represented by a distinct matrix in Aut(Rn).
There are also non-faithful representations. An example of a very not faithful representation is the trivial representation, which represents every element as the zero element in the zero module. Another is the sign representation: Sn→Aut(Z), which simply takes the sign (1 or −1) of every permutation in Sn. Another example is det∘ρ, where ρ is some matrix representation. In general, a representation is a group homomorphism G→Aut(M) (where M is some R-module).
Theorem: An R[G]-module M fully describes a representation G→Aut(M). This result comes naturally from the fact that a group ring R[G] encodes all possible linear group actions, and therefore its action on a module M models every possible automorphism of the R-module M, since automorphisms of R-modules are linear by definition.
Theorem: A faithful representation ρ:G→Aut(M) is one where no g∈G acts as the identity on M except for the identity element e∈G. A faithful representation is injective, i.e. its domain G is isomorphic to its image in Aut(M). This means that only one element g∈G maps to the identity automorphism, so only one element g∈G acts as identity on M.
Theorem: A faithful representation ρ, when represented as an R[G]-module M, is one where each x∈R[G] has a unique action on M. If two elements x,y∈R[G] have the same action on M, then x−y acts as zero (sending every element of M to zero); if only 0∈R[G] acts as zero, then x−y=0, i.e. x=y, so distinct elements of R[G] act differently.
The regular representation ρ:G→Aut(R∣G∣) (or equivalently, ρ=R[G]) represents elements of G as automorphisms on the free R-module R∣G∣ (whose elements are R-vectors). It is faithful, since it is composed of distinct permutations of G, so there is always a faithful matrix representation of any group G. Can we do better? Ideally, we want to represent elements of a group G using R-vectors that are perhaps smaller than ∣G∣, without affecting the faithfulness of ρ.
By using the above fact that any representation ρ is equivalent to some R[G]-module M, we can take subrepresentations of ρ as R[G]-submodules of M. To begin, given a representation ρ:G→Aut(M), we can think about quotienting the corresponding R[G]-module M by one of its R[G]-submodules. However, quotienting might create a representation that isn’t faithful. When does quotienting the underlying module M of a faithful representation ρ:G→Aut(M) preserve faithfulness?
Therefore: Faithful representations are exactly those where the group action is injective.Recall that preserving the group action means that every element of the group is identified with a distinct action. In particular, that means there is only one element that behaves like the identity action: the identity element e∈G. But if the representation ρ:G→Aut(Rn) maps only e to the identity in Aut(Rn), then it is injective, i.e. faithful.
Therefore, quotienting M preserves faithfulness of ρ exactly when the quotient M/W preserves the group action on M. In other words, the submodule W must be G-invariant: it must remain unchanged under the group action of G. But since R[G]-submodules are G-invariant by definition, quotienting by an R[G]-submodule always preserves faithfulness.
As it turns out, by quotienting repeatedly by different R[G]-submodules, we can “factor” M into a direct sum of submodules known as irreducible representations, or irreps.
Theorem: For a finite group G, if W is a G-invariant submodule of the R[G]-module M, then M/W is (isomorphic to) a complementary G-invariant submodule, with M≅M/W⊕W, assuming char R∤∣G∣.- Recall that if an idempotent endomorphism (a projection) σ exists on M, then M is isomorphic to ker σ⊕im σ. So it is enough to define a projection σ:M→M where ker σ≅M/W and im σ=W. Let's see what those conditions imply.
- To ensure that ker σ≅M/W, we need to ensure that every element in ker σ differs by some element in W.
- The requirement im σ=W implies that σ maps M to the G-invariant submodule W of M. Thus we need σ to be G-equivariant (gσ(m)=σ(gm)) so that it preserves G-invariance.
- Finally, to be a projection, σ must be idempotent.
- We'll start with the requirement that σ is G-equivariant. In order to get G-equivariance, one trick is to take the sum of all products with g: σ~(m) = ∑_{g∈G} gm. Then we can show that for h∈G:
$$h\tilde{\sigma}(m) = h\sum_{g\in G} gm = \sum_{g\in G} (hg)m = \sum_{g\in G} gm = \sum_{g\in G} g(hm) = \tilde{\sigma}(hm)$$
by linearity in R[G]-modules, and because g↦hg (and likewise g↦gh) is a bijection on G, so multiplying every term by h just permutes the order of the sum. Therefore, σ~ commutes with the action of G on M, and is therefore G-equivariant.
- Next up is ensuring that the elements of ker σ differ by elements of W. One way is to send every gm to W via some projection π:M→W. We redefine σ~: σ~(m) = ∑_{g∈G} π(gm). Then σ~(m) is a linear combination of elements of W. This means that im σ~ is a subset of W, and we can show that two elements of ker σ~ differ by a sum of terms π(gm)−π(gn)∈W: σ~(m)−σ~(n) = ∑_{g∈G} π(gm) − ∑_{g∈G} π(gn) = ∑_{g∈G} (π(gm)−π(gn))
- Finally, σ must be a projection. Therefore, it must be idempotent, and its image must be all of W. Without losing the previous properties, we can arrange σ(w)=w for every w∈W by taking the inverse action of G and dividing by ∣G∣ (which works since char R∤∣G∣): σ(m) = (1/∣G∣) ∑_{g∈G} g^{-1}π(gm). Then
$$\sigma(w) = \frac{1}{|G|}\sum_{g\in G} g^{-1}\pi(gw) = \frac{1}{|G|}\sum_{g\in G} g^{-1}\pi(w') = \frac{1}{|G|}\sum_{g\in G} g^{-1}w' = \frac{1}{|G|}\sum_{g\in G} w = \frac{1}{|G|}\,|G|\,w = w$$
where w′=gw lies in W since W is G-invariant, and π(w′)=w′ since π is a projection onto W. Therefore σ(w)=w for all w∈W, meaning W⊆im σ. By definition, σ(m) remains a linear combination of elements of W, and therefore im σ⊆W. Therefore im σ=W.
- Finally, since σ is an idempotent endomorphism, we know that M decomposes into ker σ⊕im σ. Since ker σ≅M/W and im σ=W by construction, we have M≅M/W⊕W.
Maschke's Theorem: Every representation ρ of a finite group G (over a field with characteristic not dividing ∣G∣) is a direct sum of irreducible representations.- If the R[G]-module M corresponding to the representation ρ is not already irreducible, it has a proper nonzero G-invariant submodule W.
- By the above theorem, which we can use since char R∤∣G∣, there is some complementary G-invariant submodule W′ such that M≅W⊕W′.
- By recursively decomposing proper G-invariant submodules W of each of the factors, you get smaller and smaller submodules. Since the submodules strictly shrink (for a finite-dimensional M, their dimensions strictly decrease), this process eventually stops when you are left with a direct sum of irreps that is isomorphic to the given representation ρ.
An R[G]-module that can be written as a direct sum of simple modules is semisimple. Hence:
Corollary: A R[G]-module is semisimple if G is finite and char R∤∣G∣.
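To see Maschke's theorem in a tiny case (a sketch assuming sympy): over C, the regular representation of C3 decomposes into three 1-dimensional irreps, spanned by the eigenvectors of the generator's permutation matrix.

```python
from sympy import zeros, simplify

# Permutation matrix of the generator of C3 in its regular representation.
g = zeros(3, 3)
for j in range(3):
    g[(j + 1) % 3, j] = 1

# Over C, g is diagonalizable; each eigenvector spans a one-dimensional
# C[C3]-submodule, and their direct sum is all of C^3.
P, D = g.diagonalize()
print(D)                               # diagonal entries 1, zeta, zeta^2 (zeta a primitive cube root of 1)
print(simplify(P.inv() * g * P - D))   # zero matrix
```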
In this section, we try to find irreps of any matrix representation.
Now that we know by Maschke's theorem that we can factor any finite group representation into irreps (assuming char R∤∣G∣), let's do so for matrix representations (which are always finite-dimensional).
A matrix representation G→Aut(Rn) is special compared to more general representations G→Aut(M) because it implies that the module M=Rn is a free module. This simplifies finding irreps considerably.
In this section, we find an easier way to discover the eigenvalues in an algebraically closed field.
Find the eigenvalues of a given R-matrix, where R is a PID.
I know that the characteristic equation is used to find eigenvalues in vector spaces, which are defined over fields. What happens if we move to the world of modules defined over integral domains?
-
February 27, 2024.
Exploration 4: Character theory
Questions:
- TODO
There are two problems when it comes to working with matrix representations:
- You have to work with (possibly huge) matrices
- They implicitly require a basis, which seems arbitrary
Let’s start by defining what it means for two (general) representations ρ:G→Aut(M) and ρ′:G→Aut(M′) to be the same representation. Say that these two representations are equivalent if there is an isomorphism ϕ between M and M′ that preserves the group action, i.e. ϕ∘ρ(g)=ρ′(g)∘ϕ for all g∈G, i.e. the following diagram commutes:
M ──ρ(g)──→ M
│ϕ          │ϕ
↓           ↓
M′ ─ρ′(g)─→ M′
where ϕ:M→M′ is an isomorphism
Let’s explore what happens in the case of matrix representations. When M,M′ are free modules Rn, then ρ,ρ′ become matrix representations, and thus Aut(Rn) is composed of n×n matrices. Then the problem of checking whether M,M′ are isomorphic comes down to comparing properties of the representative matrices.
Theorem: Conjugate elements g,h in G are represented by similar matrices A,B in any matrix representation ρ:G→Aut(Rn).- g,h being conjugate elements means kgk−1=h for some k∈G. Applying ρ gives ρ(k)ρ(g)ρ(k)−1=ρ(h), i.e. CAC−1=B where A=ρ(g), B=ρ(h), C=ρ(k), which is the same as saying A and B are similar.
Corollary: Conjugacy classes in G are represented by similarity classes in Aut(Rn).
Corollary: Equivalent matrix representations differ only in which similarity class they assign to each conjugacy class.
In this section, we explore some facts about class functions.
Since representations of elements of G in the same conjugacy class are similar matrices, we’d like to classify the similarity classes of matrices by defining a function on matrices that is unchanged under conjugation. So we’re interested in class functions on G, functions G→R that are invariant under conjugation in G.
Theorem: Given R a commutative ring, the class functions G→R form an R-module.- Most of this comes from R itself. Given two arbitrary class functions f1,f2:G→R on G, we can show that the class functions form an additive abelian group:
- Closure: (f1+f2)(g)=f1(g)+f2(g)
- Identity: 0(g)=0
- Inverse: (−f1)(g)=−f1(g)
- Commutativity: inherited from R
- Associativity: inherited from R
- To show that they form an R-module, we have closure under scalar multiplication: (rf)(g)=r⋅f(g)
Corollary: Given G a finite group and R a commutative ring, the class functions G→R form a free R-module.- This is the same as the previous theorem, but now we need to prove that class functions on finite groups form a free R-module. This requires showing a basis.
- The standard basis (in the form of indicator functions) will do. Since G is finite, index the conjugacy classes from 1 to n, and let ui be the class function whose value is 1 on elements from the ith class and 0 otherwise.
- Then the ui form a basis, meaning that the R-module of class functions is free.
Ideally we can decompose every class function G→R into a matrix representation G→Aut(Rn) followed by a linear class function Aut(Rn)→R. This works out nicely because of the following theorem:
Theorem: For matrices M over an integral domain R, the trace tr(M) is the unique nontrivial linear class function, up to scalar multiplication.- Invariance under conjugation means t(A)=t(BAB−1). Due to linearity of t, this is equivalent to t(AB)=t(BA).
- We decompose the input matrix M into its matrix units Eij, where Eij has a one on the ith row and jth column and zero elsewhere. Then t(AB)=t(BA) implies t(EijEkl)=t(EklEij).
- Assume that both sides are nonzero, since otherwise the integral domain lets us factor out t(I)=0 which implies t(anything)=0 by linearity, and we’re not interested in trivial t.
- Since t(0)=0 by linearity, t(M) is only nonzero when M is nonzero. Over an integral domain, this means EijEkl and EklEij must be nonzero as well.
- But EijEkl is only nonzero when j=k, and EklEij is only nonzero when i=l. Then EijEkl=Eil=Eii and EklEij=Ekj=Ejj must be nonzero. Further, we have t(Eii)=t(Ejj) for all i,j implying that all t(Eii) is a constant λ.
- Therefore t(Eij)=λ if i=j and is zero otherwise. Then we can decompose the original matrix and apply linearity of t to get:
$$t(M) = t\Big(\sum_{i,j} m_{ij}E_{ij}\Big) = \sum_{i,j} m_{ij}\,t(E_{ij}) = \sum_{i} m_{ii}\,t(E_{ii}) = \lambda\sum_{i} m_{ii}$$
by linearity of t, since t(Eij)=0 for i≠j, and since t(Eii)=λ.
- The trace of a matrix tr(M) is exactly ∑imii, therefore t(M)=λtr(M), and so t must be some scalar multiple of the trace operator.
That is, the only linear class function is the trace, and therefore we can decompose every class function G→R into a matrix representation ρ:G→Aut(Rn) followed by the trace (multiplied by some scalar).
In this section, we discover how to classify equivalent matrix representations.
Given a matrix representation ρ:G→Aut(Rn), define the character of ρ as χρ=tr∘ρ, essentially taking the trace of each representative matrix for g∈G. By construction, characters are functions G→R that are invariant under conjugation in G, i.e. class functions. Because equivalent matrix representations only differ by conjugation, characters are perfect for classifying matrix representations. In particular, χρ is constant on each conjugacy class of G.
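For instance (a sketch assuming sympy, reusing the C3 permutation matrices from earlier), the character of the regular representation of C3 takes the value 3=∣C3∣ on the identity and 0 everywhere else:

```python
from sympy import zeros

# Character of the regular representation of C3: chi(g^k) = tr(rho(g^k)).
def rho(k):
    M = zeros(3, 3)
    for j in range(3):
        M[(j + k) % 3, j] = 1
    return M

print([rho(k).trace() for k in range(3)])   # [3, 0, 0]
```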
Assume we’re in an integral domain R whose characteristic is 0, so that char R∤∣G∣ is true for all finite G. Then Maschke’s theorem lets us describe any group representation in terms of its irreps, and so we shift our focus of study towards irreps. Characters of irreps are known as irreducible characters.
Characters have a number of other properties in an integral domain, and much more in a field. Let’s go over the integral domain properties first.
- χ(e) is the degree of χ, which turns out to be equal to the dimension of the underlying representation.
Theorem: χ(e) gives the dimension n of its underlying representation in Aut(Rn).
Since the identity element in G is always represented by the identity matrix in Aut(Rn), taking the trace simply adds the n ones along the diagonal.
- TODO
In this section, we view a strategy for classifying group representations over fields.
Recall Schur’s lemma:
Theorem: Any homomorphism σ:M→N between simple modules M and N is either trivial or an isomorphism.
A corollary (often also called Schur’s lemma) appears when R is actually an algebraically closed field F:
Schur's Lemma: If the field F is algebraically closed, then any G-equivariant endomorphism ϕ of a finite-dimensional irrep V over F must be multiplication by a scalar.- Every endomorphism ϕ of a finite-dimensional vector space V is a matrix, and the characteristic polynomial of every matrix over an algebraically closed field F has at least one root λ.
- Since λ is an eigenvalue, ϕ−λI has a nonzero kernel (the corresponding eigenspace). Since ρ is an irrep, V is simple, and by the previous form of Schur's lemma the G-equivariant map ϕ−λI is either trivial or an isomorphism.
- It cannot be an isomorphism, since its kernel is nonzero. So ϕ−λI is the zero map, i.e. ϕ=λI, which is multiplication by a scalar.
Character theory is most often used over the field of complex numbers C. This is mostly because C is an algebraically closed field, which gives us the above.
TODO either expand on character theory theorems like orthogonality in C OR go back and highlight the importance of
TODO: three proofs that link together to get schur’s lemma
- In an algebraically closed field, every matrix has eigenvalue
- if matrix has eigenvalue, its image is an eigenspace (by defn)
- if image of linear map is an eigenspace, the map is a multiplication by scalar
Let ϕ : ρ_1 → ρ_2 be a nonzero homomorphism.
prove it in 2 parts:
If ρ_2 is irreducible then ϕ is surjective. If ρ_1 is irreducible then ϕ is injective.
The inner product ⟨,⟩ is a function of two arguments Rn×Rn→R that satisfies the following:
- Conjugate symmetry
- Linearity in the first argument
- Positive-definiteness
Irreducible characters are orthogonal. Orthogonal here means ⟨φ,ψ⟩=1 if φ,ψ come from the same representation up to isomorphism, and =0 otherwise. (Proof is in lecture 22.)
We call this property character orthogonality, as in “by character orthogonality…”
Computing character tables over ℂ
A character table has irreducible characters χ as rows and conjugacy classes of G as columns. Each entry is χ(g) where g is in that conjugacy class of G. For example, let ζ = a primitive third root of unity and use the degree 1 irreducible characters χ⁽ᵏ⁾ = g ↦ ζᵏ. Then the cyclic group C₃ = ⟨g⟩ has the character table:
        1    1    1      ← size of each conjugacy class of C₃
       (1)  (g)  (g²)    ← the conjugacy classes of C₃
      -------------------
χ⁽⁰⁾ |  1    1    1      ← trivial representation g ↦ [1]
χ⁽¹⁾ |  1    ζ    ζ²     ← dim 1 representation g ↦ [ζ]
χ⁽²⁾ |  1    ζ²   ζ      ← dim 1 representation g ↦ [ζ²]
If we define χ⁽ᵏ⁾ = g ↦ iᵏ (a dim 1 representation), then the cyclic group C₄ = ⟨g⟩ has the character table:
        1    1    1    1     ← size of each conjugacy class of C₄
       (1)  (g)  (g²) (g³)   ← the conjugacy classes of C₄
      ------------------------
χ⁽⁰⁾ |  1    1    1    1     ← trivial representation g ↦ [1]
χ⁽¹⁾ |  1    i   -1   -i     ← dim 1 representation g ↦ [i]
χ⁽²⁾ |  1   -1    1   -1     ← dim 1 representation g ↦ [i²]
χ⁽³⁾ |  1   -i   -1    i     ← dim 1 representation g ↦ [i³]
The symmetric group S₃ = ⟨(1 2),(1 2 3)⟩ has the character table:
        1    3     2      ← size of each conjugacy class of S₃
       (1)  (12) (123)    ← the conjugacy classes of S₃
      ---------------------
  1  |  1    1     1      ← trivial representation π ↦ [1]
 χˢ  |  1   -1     1      ← dim 1 sign representation π ↦ sign(π)
 χᵁ  |  2    0    -1      ← dim 2 representation, χᵁ(π) = (number of fixed points of π) - 1
The dihedral group D₄ = ⟨r,s | r⁴=s²=e, srs = r⁻¹⟩ has the character table:
       1    1      2        2         2       ← size of each conjugacy class of D₄
      {e}  {r²}  {r,r³}  {s,sr²}  {sr,sr³}    ← the conjugacy classes of D₄
     ------------------------------------------
χ₁ |   1    1      1        1         1       ← trivial representation g ↦ [1]
χ₂ |   1    1     -1        1        -1       ← dim 1 representation g ↦ ?
χ₃ |   1    1      1       -1        -1       ← dim 1 representation g ↦ -1 if reflection
χ₄ |   1    1     -1       -1         1       ← dim 1 representation g ↦ ?
χ₅ |   2   -2      0        0         0       ← dim 2 representation g ↦ ?
Things to note:
- The first row is all 1s (since [1] has trace 1)
- You can compute the last row using the rest of the rows
- The first column is always the dimension of the representation. This is because the identity element is represented by an identity matrix, whose trace is a sum of (dim of the representation) ones.
- A linear (degree 1) character describes a homomorphism G → Cˣ. We can always find such homomorphisms by taking the abelianization Gᵃᵇ = G/[G,G], where [G,G] is the commutator subgroup generated by all xyx⁻¹y⁻¹.
- Row self-product: Self-inner product of a row with itself is always |G|, but you need to count differently. (If a conjugacy class has 3 elements, count that entry three times.)
- Column self-product: Self-inner product of a column with itself is always |G|/|C|, the order of the centralizer, where |C| is the size of the conjugacy class.
- Row orthogonality: Inner product of two different rows is always zero, but you need to count differently. (If a conjugacy class has 3 elements, count that entry three times.)
- Column orthogonality: Inner product of two different columns is always zero. (Use this to compute the last row.)
- Magic formula: sum of squares of degrees of the irreducible characters equals |G|. (Use this to complete the first column.)
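These bookkeeping rules are easy to verify numerically. Below is a sketch (plain Python, using the S₃ table above with class sizes 1, 3, 2; all entries happen to be real, so no complex conjugation is needed) checking row orthogonality, the column products, and the magic formula.

```python
from fractions import Fraction

# Character table of S3: rows are irreducible characters, columns are the
# conjugacy classes (1), (12), (123) with sizes 1, 3, 2.
sizes = [1, 3, 2]
table = [
    [1,  1,  1],   # trivial
    [1, -1,  1],   # sign
    [2,  0, -1],   # dim 2
]
order = sum(sizes)   # |S3| = 6

def row_inner(r1, r2):
    """<chi, chi'> = (1/|G|) * sum_g chi(g)*chi'(g), counting each class |class| times."""
    return Fraction(sum(s * a * b for s, a, b in zip(sizes, r1, r2)), order)

# Row orthogonality: the inner product matrix of the rows is the identity.
print([[row_inner(r1, r2) for r2 in table] for r1 in table])

# Magic formula: the squares of the degrees (first column) sum to |G|.
print(sum(row[0] ** 2 for row in table) == order)   # True

# Column products: distinct columns are orthogonal, and a column dotted with
# itself gives |G| / |class| (the order of the centralizer).
for i in range(3):
    print([sum(row[i] * row[j] for row in table) for j in range(3)])
```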
This website gives all the character tables for all groups with at most 10 irreps: https://people.maths.bris.ac.uk/~matyd/GroupNames/characters.html
TODO
Recall that a field F is algebraically closed iff every nonconstant polynomial in F[x] has a root in F.
Theorem: Any linear operator over an algebraically closed field F has at least one eigenvalue.- The characteristic polynomial det(xI−A) of the operator is a nonconstant polynomial in F[x], so over an algebraically closed field it has a root λ∈F.
- A root of the characteristic polynomial is exactly an eigenvalue, so the operator has at least one eigenvalue λ.
Characters completely determine representations up to isomorphism
The original motivation for characters is to identify representations without regard to a basis or its matrix at all. There is also an amazing theorem that:
Theorem: ⟨χ,χ′⟩ = (1/|G|)∑_{g∈G} \overline{χ(g)} χ′(g), which is just a normalized dot product in the complex vector space.
Corollary: In particular, ⟨χ,χ⟩=1 iff χ is irreducible. (Writing χ = ∑nᵢχᵢ as a combination of irreducible characters, orthogonality gives ⟨χ,χ⟩ = ∑nᵢ², which equals 1 exactly when one nᵢ is 1 and the rest are 0, i.e. when χ is itself irreducible.)
Theorem: TODO- TODO
Cayley-Hamilton Theorem: Every (square) R-matrix (for a commutative ring R) satisfies its own characteristic equation. TODO
Characteristic polynomials
Every n×n matrix A over F has a characteristic polynomial χA equal to det(xI−A).
Motivation: To find the eigenvectors. χA(λ)=0 iff λ is an eigenvalue of A. So just factor χA to get the eigenvalues.
Minimal polynomials
Recall you can evaluate a polynomial m at a matrix A: m(A).
Every n×n matrix A over F has a unique minimal polynomial mA that is monic, lowest degree, and mA(A)=0.
Motivation: the roots of mA are the eigenvalues. Proof: WTS mA(λ)=0 iff λ is an eigenvalue of A. (→) mA∣χA, so χA(λ)=0, which means λ is an eigenvalue of A. (←) If λ is an eigenvalue with eigenvector v, then 0=mA(A)v=mA(λ)v since Av=λv; because v≠0, we get mA(λ)=0. Since mA has lowest degree, it has no factors other than those contributed by eigenvalues.
Another motivation: to detect diagonalizability of A. If mA factors into distinct factors it’s diagonalizable. Proof involves finding the Jordan block matrix conjugate to A…
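A small sketch (assuming sympy) of both motivations: read off eigenvalues from the minimal polynomial, and detect diagonalizability from whether its factors are distinct.

```python
from sympy import Matrix, symbols, eye, factor

t = symbols('t')

A = Matrix([[2, 0,  0],
            [0, 0, -4],
            [0, 1,  4]])
print(factor(A.charpoly(t).as_expr()))   # (t - 2)**3
print((A - 2 * eye(3))**2)               # zero matrix, so m_A = (t - 2)^2 (a repeated factor)
print(A.is_diagonalizable())             # False

B = Matrix([[2, 1],
            [0, 3]])
print((B - 2 * eye(2)) * (B - 3 * eye(2)))   # zero matrix, so m_B = (t - 2)(t - 3)
print(B.is_diagonalizable())                 # True: m_B has distinct linear factors
```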
Given f is the minimal polynomial of α over F, we can make an isomorphism φ:F[x]/(f)≅F(α) given by φ: g+(f) ↦ g(α).
Conjugation doesn’t change these polynomials
Recall a conjugate matrix of A is a matrix PAP−1 where P is any invertible matrix of the same size over the same field as A. Note:
- f(PAP−1)=P f(A) P−1 for any polynomial f, so f(PAP−1)=0 iff f(A)=0; in particular A and PAP−1 have the same minimal polynomial
- det(xI−PAP−1)=det(xPIP−1−PAP−1)=det(P(xI−A)P−1)=det(P)det(xI−A)det(P−1)=det(xI−A)
So conjugation doesn’t change either of these polynomials.
Cayley-Hamilton theorem: the characteristic polynomial χ_T of T is det(tI-T)=f₁f₂…fᵣ ∈ F[t]. Then χ_T(T) = O.
Character theory: - We care about the conjugacy classes of groups, since they help us find the irreducible characters of groups