
Exploration 2: Module actions


Questions:


Recall that R-matrices are homomorphisms between free modules R^n. How do we visualize homomorphisms between non-free modules M?

Let’s first focus on endomorphisms. Every endomorphism T:M\to M must satisfy the homomorphism laws for R-modules M: T(m+n)=T(m)+T(n) and T(r\cdot m)=r\cdot T(m) for all m,n\in M and r\in R.

In other words, T is R-linear.

R-linearity allows us to define module actions. Here, since T is R-linear, T respects addition and scalar multiplication and therefore can be described as acting on M. We can express this by extending the R-module M to an R[t]-module, which is an R-module together with an action represented by the indeterminate t. When the action is scalar multiplication by t, an R[t]-module is just a module over the polynomial ring R[t].

Instead of scalar multiplication by t, define t to act as an application of T, and define the action of constant polynomials r\in R to be scalar multiplication by r. Thus each endomorphism T:M\to M is associated with an R[t]-module M' whose action is to apply an R-linear combination of powers of T, represented by a polynomial in R[t].
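
Here is a minimal sketch of this action in code (the matrix, vector, and polynomial are my own examples, not from the text): t acts on M = Q^2 as a chosen endomorphism T, so a polynomial p(t) acts as p(T).

```python
import sympy as sp

# t acts as the endomorphism T : Q^2 -> Q^2 (an arbitrary example)
T = sp.Matrix([[2, 1],
               [0, 3]])
m = sp.Matrix([1, 1])         # an element of M = Q^2

# the action of p(t) = t**2 + 1 on m: substitute T for t,
# and constant polynomials act as scalar multiples of the identity
p_of_T = T**2 + sp.eye(2)
print(p_of_T * m)             # Matrix([[10], [10]])
```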

By this construction, every endomorphism is associated with a unique R[t]-module, and therefore studying these R[t]-modules is enough to classify the endomorphisms.

In this section, we explore properties of R[t]-modules when R[t] is a PID.

R[t] being a PID is equivalent to R being a field F. So we’ll refer to R[t] as F[t] in this section.

Importantly, it turns out that such F[t]-modules are torsion whenever M is finite-dimensional as an F-vector space, i.e. M has a nontrivial annihilator \Ann_{F[t]}(M).

Theorem: Given an F[t]-module M, finite-dimensional over F, where t acts as an endomorphism M\to M, there is some nonzero polynomial f\in F[t] whose corresponding endomorphism \in\End(M) is the zero map. In other words, \Ann_{F[t]}(M) is nontrivial.
  • Since t acts as T\in\End(M), we can construct a ring homomorphism \varphi:F[t]\to\End(M) that maps polynomials in t (\sum_i a_it^i) to their corresponding endomorphism (\sum_i a_iT^i).
  • The kernel of \varphi consists of all polynomials \in F[t] that map to the zero map in \End(M). Thus \Ann_{F[t]}(M)=\ker\varphi, and it is enough to show that \varphi has a non-trivial kernel.
  • Since M is finite-dimensional over F, so is \End(M), while F[t] is infinite-dimensional over F (spanned by \{1,t,t^2,\ldots\}). So \varphi is an F-linear map from an infinite-dimensional space to a finite-dimensional one, and thus cannot be injective.
  • Since injectivity corresponds to having a trivial kernel, \varphi has a non-trivial kernel. (The sketch below checks this on a small example.)
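
A quick numeric check (the example matrix is mine; the general fact that the characteristic polynomial always annihilates is the Cayley–Hamilton theorem): the characteristic polynomial of T is a nonzero element of \ker\varphi.

```python
import sympy as sp

t = sp.symbols('t')
T = sp.Matrix([[2, 1],
               [0, 3]])

chi = T.charpoly(t).as_expr()                  # here t**2 - 5*t + 6
coeffs = sp.Poly(chi, t).all_coeffs()[::-1]    # [a0, a1, ..., an]
chi_of_T = sum((c * T**i for i, c in enumerate(coeffs)), sp.zeros(2, 2))

print(chi)        # a nonzero polynomial in F[t]
print(chi_of_T)   # the zero matrix, so chi lies in Ann_{F[t]}(M) = ker(phi)
```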

Corollary: Every F[t]-module M that is finite-dimensional over F (where t acts as an endomorphism M\to M) is torsion.

By definition, a torsion module is one with a non-trivial annihilator, which is exactly what we proved above.

Note that the Chinese remainder theorem gives a related decomposition: it says F[t]/(f\cdot g)\iso F[t]/(f)\oplus F[t]/(g) when f,g are coprime. Observe that powers of two distinct linear factors t-\alpha, t-\beta are always coprime in F[t], so whenever f\in F[t] splits completely into linear factors, F[t]/(f) decomposes into a direct sum of modules F[t]/((t-\alpha)^k), one for each distinct root \alpha of f.

Theorem: Torsion R-modules M have no nontrivial free submodules.

If M is torsion, then there is a nonzero a\in R such that am=0 for all m\in M, and the same a zeroes every submodule of M. But a nontrivial free submodule contains a basis element, whose annihilator is trivial, so no nonzero a could zero it, a contradiction.

Because we’re working over a PID, we can apply the structure theorem to decompose such an F[t]-module M into a direct sum of free submodules \iso F[t] and cyclic torsion submodules \iso F[t]/(f^k) (for an irreducible polynomial f).

M\iso\underbrace{\left(F[t]\right)^n}_{\text{free}}\oplus\underbrace{F[t]/(f_1^{k_1})\oplus F[t]/(f_2^{k_2})\oplus\ldots\oplus F[t]/(f_n^{k_n})}_{\text{torsion}}

But M, being torsion, has no nontrivial free submodules, so there is no free part:

M\iso F[t]/(f_1^{k_1})\oplus F[t]/(f_2^{k_2})\oplus\ldots\oplus F[t]/(f_n^{k_n})

In summary, we’ve reduced the problem of classifying endomorphisms to studying the torsion cyclic F[t]-submodules F[t]/(f^k) of their corresponding F[t]-module.

In this section, we derive the notion of eigenvalues in module theory.

Given a finite-dimensional F-vector space V, consider an endomorphism T:V\to V. As before, define an F[t]-module V_T as V where the action of t is to apply T. Since V_T is a torsion module over a PID, apply the structure theorem to break it into a direct sum of torsion cyclic modules \bigoplus_i F[t]/(f_i) where f_1\mid f_2\mid\ldots\mid f_n. Note that each f_i generates the annihilator of its corresponding cyclic module F[t]/(f_i). In this context where the module M is defined over a polynomial ring F[t], we say that f is the minimal polynomial for M if it is the lowest-degree monic polynomial that annihilates M.

Theorem: f is the minimal polynomial for a cyclic module F[t]/(f).

Every element in the annihilator of F[t]/(f) annihilates F[t]/(f) by definition. Since f is a generator of the annihilator of F[t]/(f), it must divide every element of this annihilator, and is therefore a least-degree element of the annihilator (unique up to multiplication by a unit). Thus f is the lowest-degree polynomial that annihilates F[t]/(f).

Theorem: Every F[t]-module M that is finite-dimensional over F (where t acts as an endomorphism M\to M) has a minimal polynomial f\in F[t] that generates the annihilator of M.

We know that the annihilator of M is the kernel of the homomorphism \varphi:F[t]\to\End(M). Since the kernel is an ideal of F[t], a PID, the kernel must be generated by a single element f\in F[t], which is nonzero because the kernel is nontrivial. Since F is a field, we can choose f to be monic: if it’s not monic, multiply f by the inverse of its leading coefficient. As a generator, f divides every element of the annihilator, which makes f the lowest-degree monic polynomial in the annihilator of M.
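
Here is a sketch of computing a minimal polynomial directly (the helper min_poly and the example matrix are my own, not a standard sympy call): scan I, T, T^2, \ldots for the first F-linear dependence; its coefficients give the lowest-degree monic polynomial annihilating M=F^n.

```python
import sympy as sp

def min_poly(T, t):
    n = T.shape[0]
    powers = [sp.eye(n)]
    for k in range(1, n + 1):
        powers.append(powers[-1] * T)
        # columns are the flattened matrices I, T, ..., T^k
        A = sp.Matrix.hstack(*[P.reshape(n * n, 1) for P in powers])
        null = A.nullspace()
        if null:
            v = null[0] / null[0][k]   # normalize so the leading coefficient is 1
            return sp.expand(sum(v[i] * t**i for i in range(k + 1)))

t = sp.symbols('t')
T = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 2]])
print(sp.factor(min_poly(T, t)))   # (t - 2)**2, while the characteristic polynomial is (t - 2)**3
```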

For an R-module M, an eigenvalue \lambda\in R of an R-module endomorphism T:M\to M is one where there exists a nonzero element m\in M such that Tm=\lambda m.

An endomorphism T:M\to M is nilpotent if some power of T is equal to the zero map.

Lemma: An R-module endomorphism T:M\to M has an eigenvalue \lambda if T-\lambda I is nilpotent.
  • If T-\lambda I is nilpotent, then (T-\lambda I)^k=0 for some k\ge 1; take k to be the smallest such power.
  • If k=1, then we have T-\lambda I=0 implying T=\lambda I, which directly shows that \lambda is an eigenvalue.
  • Otherwise if k\ge 2, minimality of k means (T-\lambda I)^{k-1}\ne 0. So there is some element m\in M such that n=(T-\lambda I)^{k-1}m\ne 0. But then we have (T-\lambda I)n=(T-\lambda I)\left[(T-\lambda I)^{k-1}m\right]=(T-\lambda I)^km=0. Since there exists a nonzero element n\in M where (T-\lambda I)n=0, this proves Tn=\lambda n, thus \lambda is an eigenvalue of T.

The converse fails: for example, T=\left[\begin{matrix}1&0\\0&2\end{matrix}\right] has eigenvalue \lambda=1, but T-I=\left[\begin{matrix}0&0\\0&1\end{matrix}\right] is not nilpotent.
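
A tiny numeric check of the lemma (the matrix and vector are my own example): T-2I is nilpotent, and (T-2I)^{k-1}m is an eigenvector for \lambda=2.

```python
import sympy as sp

T = sp.Matrix([[2, 1],
               [0, 2]])
N = T - 2 * sp.eye(2)
print(N**2)              # zero matrix, so N = T - 2I is nilpotent with k = 2

m = sp.Matrix([0, 1])    # any m with N*m != 0
n = N * m                # n = (T - 2I)^{k-1} m = Matrix([[1], [0]])
print(T * n, 2 * n)      # equal, so n is an eigenvector with eigenvalue 2
```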

Lemma: The eigenvalues of an endomorphism of a direct sum T:M\oplus N\to M\oplus N that respects the decomposition (i.e. T=T_M\oplus T_N, where T_M denotes T restricted to M) are exactly the eigenvalues of T_M and T_N combined.

For one direction, it is enough to show that the eigenvalues of T_M are eigenvalues of T, since the N case has an identical proof. If T_M has an eigenvalue \lambda, then T_Mm=\lambda m for some nonzero m\in M. Thus in M\oplus N, we have T(m,0)=(T_Mm,T_N0)=(\lambda m,0)=\lambda(m,0), proving that the eigenvalue \lambda of T_M is an eigenvalue of T. Conversely, if T(m,n)=\lambda(m,n) for some nonzero (m,n), then T_Mm=\lambda m and T_Nn=\lambda n, and since at least one of m,n is nonzero, \lambda is an eigenvalue of T_M or of T_N.

Theorem: For a torsion component F[t]/(f) of an F[t]-module M, the roots \lambda of its minimal polynomial f correspond to eigenvalues of T (the action of t).
  • If \lambda is a root of f, then by the factor theorem, (t-\lambda) is an irreducible factor of f. Note that in a UFD like F[t], irreducibles and primes are the same thing.
  • By the Chinese remainder theorem we can split this component further into pieces whose annihilators are generated by prime powers; since one of those primes is (t-\lambda), one piece has annihilator exactly ((t-\lambda)^k) for some k\ge 1. So we’re working with a component F[t]/(t-\lambda)^k.
  • Recall that polynomials in F[t] act as endomorphisms on M. The endomorphism corresponding to the polynomial (t-\lambda)^k is (T-\lambda I)^k, since we defined the action of F[t] to treat constant polynomials like \lambda as scalar multiplication.
  • Since (t-\lambda)^k annihilates this component, we have (T-\lambda I)^k=0 there — that is, T-\lambda I is nilpotent on the component F[t]/(t-\lambda)^k, implying (by the nilpotence lemma) that \lambda is an eigenvalue of T when restricted to that component.
  • Since M is a direct sum that includes F[t]/(t-\lambda)^k as a component, the direct sum lemma shows that \lambda is also an eigenvalue of T on all of M.

Theorem: The minimal polynomial for an F[t]-module M is exactly the largest invariant factor f_n of M.

If the decomposition is M\iso\bigoplus_iF[t]/(f_i), then a polynomial g annihilates M exactly when it annihilates every component, i.e. when every f_i divides g. The lowest-degree monic such g is the LCM of all f_i. Since the decomposition has f_1\mid f_2\mid\ldots\mid f_n, that LCM is just f_n, making f_n the minimal polynomial of M.

Since the roots of f for each F[t]/(f) correspond to the eigenvalues of T:M\to M (the action of t), it is useful to encapsulate all of the eigenvalues (including repeats) as the characteristic polynomial \chi_T\in F[t]: a polynomial with a root \lambda every time \lambda appears as an eigenvalue of T. Therefore:

Theorem: The characteristic polynomial for an F[t]-module M is exactly the product of all invariant factors f_i of M.

This follows directly from knowing that the roots of each f_i are eigenvalues of M, with each component contributing its own repeats. Then the product of the f_i captures all eigenvalues of M, counted with multiplicity.

Corollary: The characteristic polynomial of an endomorphism T:M\to M for a finitely generated F[t]-module M is the product of the minimal polynomials f_i of its cyclic components.

There is another way to compute this characteristic polynomial:

Theorem: Over a finitely generated F[t]-module M, the characteristic polynomial of an endomorphism T:M\to M is exactly \det(T-tI).
  • Since M is finitely generated, we can construct its presentation matrix A so that M\iso F[t]^n/\im A. The fact that M is a quotient of a free module F[t]^n means all endomorphisms of M (e.g. T) can be expressed as an F[t]-matrix modulo the quotient \im A.
  • Now consider the equation Tm=tm, where we treat t\in F[t] as a scalar. If this is true for some nonzero m\in M, then t is an eigenvalue of T by definition.
  • Rearranging the above to (T-tI)m=0 reduces the problem of finding eigenvalues of T to finding values of t such that T-tI zeroes some nonzero m\in M.
  • Recall that we can define a Smith normal form PDQ^{-1} for every matrix over a PID like F[t], such as T-tI. This results in a diagonal matrix D=\left[\begin{matrix}f_1&&&\\&f_2&&\\&&\ddots&\\&&&f_n\end{matrix}\right] where the f_i are the invariant factors of T-tI (which may differ from the invariant factors of M).
  • Thus our equation becomes (PDQ^{-1})m=0, which (since P is invertible) we can simplify to D(Q^{-1}m)=0. Since Q is invertible, Q^{-1}m is also an arbitrary nonzero element, so WLOG we may write Dm=0. Because D is diagonal, we may rewrite this as a system of equations: \begin{aligned} f_1m_1&=0\\ f_2m_2&=0\\ \vdots\\ f_nm_n&=0 \end{aligned} where the m_i are the components of m. Note that a nonzero m only requires at least one of the m_i to be nonzero. That means we can assume m_i=0 for all but one of the equations.
  • When m_i is nonzero, then for the equation f_im_i=0 to be true, f_i must be equal to zero. This is because F[t], being an integral domain, has no zero divisors. In summary, if we can find values \lambda for t that make any f_i zero (i.e. the roots of f_i), then Dm=0 is true for some nonzero m\in M, so (T-\lambda I)m=0 is true. So the roots of the f_i are eigenvalues of T.
  • This means one can obtain every eigenvalue of T by finding the roots of the product \prod_i f_i. So the characteristic polynomial \chi_T is \prod_i f_i, which is exactly the determinant of T-tI (up to a unit factor, since \det(T-tI)=\det P\cdot\det D\cdot\det Q^{-1} and \det P,\det Q^{-1} are nonzero constants). The sketch below compares the two computations on a small example.
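
Here is a sketch comparing the two computations (the example matrix is mine, not from the text). Note that sympy's charpoly uses \det(tI-T), which differs from \det(T-tI) only by a sign of (-1)^n.

```python
import sympy as sp

t = sp.symbols('t')
T = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])

chi = sp.factor((T - t * sp.eye(3)).det())
print(chi)                                   # -(t - 2)**2*(t - 3), up to ordering
print(sp.factor(T.charpoly(t).as_expr()))    # (t - 2)**2*(t - 3), up to ordering
print(sp.roots(chi, t))                      # {2: 2, 3: 1} -- the eigenvalues of T with multiplicity
```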

The companion matrix of a monic polynomial f\in F[t] is a matrix constructed so that its characteristic polynomial is exactly f. Given a polynomial t^n+a_{n-1}t^{n-1}+\ldots+a_1t+a_0, its companion matrix is t^n+a_{n-1}t^{n-1}+\ldots+a_1t+a_0\iff\left[\begin{matrix} 0&0&\cdots&0&-a_0\\ 1&0&\cdots&0&-a_1\\ 0&1&\cdots&0&-a_2\\ \vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&\cdots&1&-a_{n-1} \end{matrix}\right]
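
A sketch of this construction (the helper companion and the example polynomial are my own): build the companion matrix and check that its characteristic polynomial is the polynomial we started with.

```python
import sympy as sp

def companion(coeffs):
    # coeffs = [a0, a1, ..., a_{n-1}] for t^n + a_{n-1} t^{n-1} + ... + a1 t + a0
    n = len(coeffs)
    C = sp.zeros(n, n)
    for i in range(1, n):
        C[i, i - 1] = 1              # ones on the subdiagonal
    for i in range(n):
        C[i, n - 1] = -coeffs[i]     # last column holds -a_0, ..., -a_{n-1}
    return C

t = sp.symbols('t')
C = companion([4, -2, 0])            # companion matrix of t**3 - 2*t + 4
print(C.charpoly(t).as_expr())       # t**3 - 2*t + 4
```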

Corollary: Every square F-matrix (of size \ge 1) has at least one eigenvalue in F iff F is an algebraically closed field.
  • For the forward direction: the companion matrix of every monic polynomial \in F[t] (of degree \ge 1) has an eigenvalue, and that eigenvalue is a root of its characteristic polynomial, which is the polynomial itself. So every polynomial \in F[t] (of degree \ge 1) has a root in F, therefore F is algebraically closed.
  • The backward direction is because in an algebraically closed field, all degree \ge 1 polynomials in F[t] have a root in F, including \det(M-tI), whose roots correspond to eigenvalues of the matrix M.

Theorem: Given an F[t]-module M where t acts as an endomorphism M\to M, M is cyclic iff the minimal and characteristic polynomials coincide.
  • Let M\iso\bigoplus_iF[t]/(f_i) by the structure theorem. Since the minimal polynomial is exactly f_n and the characteristic polynomial is exactly \prod_i f_i, they can only coincide when there is only one invariant factor (each additional invariant factor has degree \ge 1, which would make the degree of the product strictly larger than the degree of f_n). But a single invariant factor means the decomposition includes only one cyclic submodule F[t]/(f_n), meaning the original module M must be cyclic as well.
  • The proof in the other direction is trivial: if M is cyclic, M\iso F[t]/(f), then both f_n and \prod_i f_i are equal to f. (The sketch below contrasts the two cases.)
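
An illustration of the two cases (both example matrices are mine): a companion matrix gives a cyclic module, so its minimal and characteristic polynomials coincide, while a direct sum with a repeated invariant factor does not.

```python
import sympy as sp

t = sp.symbols('t')

cyclic = sp.Matrix([[0, -2],
                    [1, 3]])   # companion matrix of (t - 1)(t - 2): one invariant factor
not_cyclic = sp.diag(2, 2)     # F[t]/(t-2) + F[t]/(t-2): invariant factors (t - 2), (t - 2)

print(sp.factor(cyclic.charpoly(t).as_expr()))       # (t - 1)*(t - 2), also its minimal polynomial
print(sp.factor(not_cyclic.charpoly(t).as_expr()))   # (t - 2)**2
print(not_cyclic - 2 * sp.eye(2))                    # zero matrix: the minimal polynomial is just t - 2
```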

If a matrix T has eigenvalue \lambda, then the kernel of T-\lambda I is its corresponding eigenspace. If the image of T is contained in the eigenspace of \lambda, then T acts on its image as multiplication by the scalar \lambda.

In this section, we work out the module action of polynomial rings explicitly.

We can express a linear transformation as the action of the polynomial ring F[t] on an F-module.


Let’s explore F[t]-modules, where F[t] is a polynomial ring defined over a field F.

For instance, take the F[t]-module F[t]/(t-2)^3. Since it’s quotiented by a degree 3 polynomial (t-2)^3=t^3-6t^2+12t-8, the obvious basis is \{1,t,t^2\}. We use the fact that 0=t^3-6t^2+12t-8 in the quotient to get the constraint T(t^2)=t^3=6t^2-12t+8. This means the F[t]-matrix T (with respect to the basis \{1,t,t^2\}) is one where T(1)=t, T(t)=t^2, and T(t^2)=6t^2-12t+8, i.e. we have T\left[\begin{matrix}1\\0\\0\end{matrix}\right] =\left[\begin{matrix}0\\1\\0\end{matrix}\right],\quad T\left[\begin{matrix}0\\1\\0\end{matrix}\right] =\left[\begin{matrix}0\\0\\1\end{matrix}\right],\quad T\left[\begin{matrix}0\\0\\1\end{matrix}\right] =\left[\begin{matrix}8\\-12\\6\end{matrix}\right] which corresponds to the presentation matrix \left[\begin{matrix}0&0&8\\1&0&-12\\0&1&6\end{matrix}\right]

So we have found T for the F[t]-module F[t]/(t-2)^3.
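
A quick check of this example (a sketch using sympy): the matrix we found has characteristic polynomial (t-2)^3, the polynomial we quotiented by, and (T-2I)^3 is the zero map on the module.

```python
import sympy as sp

t = sp.symbols('t')
T = sp.Matrix([[0, 0, 8],
               [1, 0, -12],
               [0, 1, 6]])

print(sp.factor(T.charpoly(t).as_expr()))   # (t - 2)**3
print((T - 2 * sp.eye(3))**3)               # the zero matrix
```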

From the structure theorem, we know that direct sums of cyclic F[t]-modules describe all finitely generated F[t]-modules. Let’s do this again where M is a direct sum.

Take the F[t]-module M\iso F[t]/(t-2)\oplus F[t]/(t-2)^2. The obvious basis is \{(1,0),(0,1),(0,t)\}. We do the same thing as before, realizing 0=t-2 on the left and 0=(t-2)^2=t^2-4t+4 on the right, and obtaining the constraints T(1)=t=2 on the left and T(t)=t^2=4t-4 on the right. We have: T\left[\begin{matrix}1\\~\\~\end{matrix}\right] =\left[\begin{matrix}2\\~\\~\end{matrix}\right],\quad T\left[\begin{matrix}~\\1\\0\end{matrix}\right] =\left[\begin{matrix}~\\0\\1\end{matrix}\right],\quad T\left[\begin{matrix}~\\0\\1\end{matrix}\right] =\left[\begin{matrix}~\\-4\\4\end{matrix}\right] corresponding to the presentation matrix \left[\begin{matrix}2&&\\&0&-4\\&1&4\end{matrix}\right] =\left[\begin{matrix}2\end{matrix}\right] \oplus\left[\begin{matrix}0&-4\\1&4\end{matrix}\right] where blank entries are zeroes.
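
Checking the direct-sum example (a sketch): the block matrix we found has characteristic polynomial (t-2)^3 but is already annihilated by (t-2)^2, matching the invariant factors (t-2) and (t-2)^2 and the earlier theorem that the minimal polynomial is the largest invariant factor.

```python
import sympy as sp

t = sp.symbols('t')
T = sp.diag(sp.Matrix([[2]]),
            sp.Matrix([[0, -4],
                       [1, 4]]))

print(sp.factor(T.charpoly(t).as_expr()))   # (t - 2)**3
print(T - 2 * sp.eye(3))                    # nonzero, so (t - 2) alone does not annihilate M
print((T - 2 * sp.eye(3))**2)               # zero matrix: (t - 2)**2 does
```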

So we’ve described how to find a presentation matrix for any finitely generated F[t]-module. We could reduce this matrix using the operations we can do on presentation matrices, but there is a kind of standard form in the case of F[t]-presentation matrices.

The Chinese remainder theorem says F[t]/(f\cdot g)\iso F[t]/(f)\oplus F[t]/(g) when f,g are coprime. Consider F[t]/(f). Since powers of distinct linear factors are always coprime, whenever f splits into linear factors we can split F[t]/(f) into a direct sum over powers of linear factors, one for each distinct root. For instance, for f=(t-\alpha)^2(t-\beta)^3 with \alpha\ne\beta, we get F[t]/(t-\alpha)^2\oplus F[t]/(t-\beta)^3.

For an easy example, consider the F[t]-module M\iso F[t]/(t-\alpha)^4. To find T, first find some basis for M (as a vector space). A clever basis is \{(t-\alpha)^3,(t-\alpha)^2,(t-\alpha),1\}. Using (t-\alpha)^4=0 in the quotient, this conveniently gives us the constraints \begin{aligned} T(t-\alpha)^3&=t(t-\alpha)^3&&=(t-\alpha)^4+\alpha(t-\alpha)^3=\alpha(t-\alpha)^3\\ T(t-\alpha)^2&=t(t-\alpha)^2&&=(t-\alpha)^3+\alpha(t-\alpha)^2\\ T(t-\alpha)&=t(t-\alpha)&&=(t-\alpha)^2+\alpha(t-\alpha)\\ T(1)&=t(1)&&=(t-\alpha)+\alpha(1)\\ \end{aligned} Converting this into terms of basis vectors, this is T\left[\begin{matrix}1\\0\\0\\0\end{matrix}\right] =\left[\begin{matrix}\alpha\\0\\0\\0\end{matrix}\right],\quad T\left[\begin{matrix}0\\1\\0\\0\end{matrix}\right] =\left[\begin{matrix}1\\\alpha\\0\\0\end{matrix}\right],\quad T\left[\begin{matrix}0\\0\\1\\0\end{matrix}\right] =\left[\begin{matrix}0\\1\\\alpha\\0\end{matrix}\right],\quad T\left[\begin{matrix}0\\0\\0\\1\end{matrix}\right] =\left[\begin{matrix}0\\0\\1\\\alpha\end{matrix}\right] which corresponds to the presentation matrix \left[\begin{matrix}\alpha&1&&\\&\alpha&1&\\&&\alpha&1\\&&&\alpha\end{matrix}\right]

Such a matrix is known as a Jordan matrix with eigenvalue \alpha and multiplicity 4. It has \alpha along the diagonal and ones on the superdiagonal, and this is precisely because we chose the basis in such a way that the coefficients in each constraint are 1 and \alpha. Next, here’s a direct sum example. Consider the F[t]-module M\iso F[t]/(t-\alpha)^4\oplus F[t]/(t-\alpha)^3\oplus F[t]/(t-\beta)^3. We’ll end up with the following presentation matrix, which is composed of blocks much like the previous example: \left[\begin{matrix}\alpha&1&&\\&\alpha&1&\\&&\alpha&1\\&&&\alpha\end{matrix}\right] \oplus\left[\begin{matrix}\alpha&1&\\&\alpha&1\\&&\alpha\end{matrix}\right] \oplus\left[\begin{matrix}\beta&1&\\&\beta&1\\&&\beta\end{matrix}\right]

This is a direct sum of Jordan matrices! We call this the Jordan normal form, and it exists whenever the relevant polynomials split into linear factors over F (for instance, whenever F is algebraically closed). When it exists, this is the standard form for a presentation matrix for such F[t]-modules.
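
A sketch using sympy's built-in Jordan form: the matrix we found earlier for F[t]/(t-2)^3 is similar to a single 3\times 3 Jordan block with eigenvalue 2.

```python
import sympy as sp

T = sp.Matrix([[0, 0, 8],
               [1, 0, -12],
               [0, 1, 6]])

P, J = T.jordan_form()      # T = P * J * P**-1
print(J)                    # Matrix([[2, 1, 0], [0, 2, 1], [0, 0, 2]])
```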

In general, given an R-module over a PID R, you may:
  • apply the structure theorem to decompose it into a direct sum of cyclic modules,
  • choose a convenient basis for each cyclic component, and
  • read off the matrix of the action with respect to that basis, one block per component.

All this to find the F[t]-matrix T that defines scalar multiplication by t in an F[t]-module.
