groups
In my short experience: learning group theory is not about learning what the abstract notion of group is. It’s more about studying specific groups (symmetric group, dihedral group, etc) which will show up in all the other theories (Galois theory, representation theory, etc).
Anyways, here are my notes on group theory!
-
June 26, 2020.
An unconventional intro to group theory
I mean, we could start with the definition and axioms of a group (Wikipedia), but this is tiny theorems: the goal is to achieve intuition through tiny theorems rather than through big definitions.
I’m just going to repeat my short preface on group theory here:
In my short experience: when learning group theory for the first time, it is not about learning what the abstract notion of group is. It’s more about studying specific groups which will show up in all the other theories (field theory, Galois theory, representation theory, etc).
To understand this, we first need to talk about the parallel history of group theory.
In this section, we detail a parallel history of group theory.
Historically, the modern notion of group came about in several seemingly-unrelated lines of thought:
- (1800s) Gauss’ study of modular arithmetic
- …which led to his notion of groups (modern-day finite abelian groups)
- (1820s) Galois’ study of algebraic solutions to polynomial equations
- …which led to his notion of groups (modern-day symmetric groups).
- (1840s) Cauchy’s study of permutation theory
- …which led to his notion of groups (modern-day permutation groups).
- (1870s) Klein’s study of non-Euclidean geometric transformations
- …which led to his notion of groups (modern-day symmetry groups)
It just happens that all these notions of groups have commonalities. For example, you can talk about
- Gauss’ primitive roots modulo n, or
- Galois’ proper decompositions, or
- permutation theory’s even permutations, or
- geometric translations of the plane.
Modern group theory characterizes all of these as normal subgroups of their respective groups. This means if we want to talk about an operation that works on all four of these groups, then we don’t need to have a special version of the operation for each group – we could just talk about that operation on, for example, normal subgroups. Similarly, the modern-day group abstracts away the similarities of these various groups, a process that started with Cayley. To this tiny timeline, we might add:
- (1850s-70s) Cayley’s study of abstract groups, via permutation groups and matrix groups
- …which led to our modern notion of groups today.
Sources: U of St Andrews, C.K. Fong via Carleton University (PDF)
In this section, we approach group theory without looking at big definitions.
I find a lot of group theory texts start from the abstract definition without introducing this history. For me, this historical view of “group theory saving you from having to define four different operations” does really well when it comes to thinking about groups.
One of my bigger mistakes was being way too caught up in abstract laws, like the group axioms I never mentioned at the top of this page. That was something I had to unlearn. It’s not abstract groups and laws that are important at first – it’s the relationships between specific groups and specific things.
Let’s start by looking at permutations, and use those to arrive at the definition of a group.
In this section, we define and explore permutations.
A permutation of a set is a reordering of elements and is often written like this: (1 4 2 8 5 7). This permutation maps 1↦4, 4↦2, 2↦8, 8↦5, 5↦7, and finally 7↦1. The set we’re permuting here is some set of integers that includes {1,2,4,5,7,8}.
This notation is called disjoint cycle notation; (1 4 2 8 5 7) is a cycle. You can think of them as functions taking an integer to another integer. More complex permutations are composed of cycles via right-to-left function composition: the permutation (1 4)(2 3) swaps 2 and 3, and then swaps 1 and 4. The identity permutation (which does nothing) can be written as a length 1 cycle consisting of any element, like (1).
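To make the right-to-left convention concrete, here is a small Python sketch. The dict-based representation and the helper names (`cycle`, `apply`, `compose`) are my own illustrative choices, not standard notation:

```python
# A permutation as a dict mapping each moved element to its image;
# elements not in the dict are fixed.
def cycle(*elems):
    """Build the permutation for a single cycle like (1 4 2 8 5 7)."""
    return {a: elems[(i + 1) % len(elems)] for i, a in enumerate(elems)}

def apply(perm, x):
    return perm.get(x, x)

def compose(p, q):
    """Right-to-left composition: (p.q)(x) = p(q(x)), i.e. q acts first."""
    return {x: apply(p, apply(q, x)) for x in set(p) | set(q)}

sigma = cycle(1, 4, 2, 8, 5, 7)
print(apply(sigma, 1))  # 4
print(apply(sigma, 7))  # 1

tau = compose(cycle(1, 4), cycle(2, 3))  # swaps 2 and 3 first, then 1 and 4
print(apply(tau, 2))  # 3
```

The same helpers carry the rest of the examples on this page, so it may be worth running them once to internalize the convention.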
In this section, we compose and invert permutations.
Consider the permutation (2 3)(3 1). This first swaps 1 and 3, and then swaps 2 and 3. Since the net effect is the same as the cycle (1 2 3), we consider them to be the same permutation, in the sense that the set {(2 3)(3 1),(1 2 3)} is a set consisting of one element.
In group theory, the important thing to focus on is not the set of integers that these permutations are operating on, but rather manipulating the permutations themselves. In this case, we can see that the composition of two permutations (2 3) and (3 1) is equal to another permutation (1 2 3). If you think of permutations as objects instead of functions, we have combined two objects into one (via function composition). This idea is very important.
The second important idea is that every permutation can be “undone” by an inverse permutation. For instance, the inverse permutation of (1 2 3 4 5) is (1 5 4 3 2). You can tell that it is the inverse because their composition is always the identity permutation (1).
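Inversion is mechanical to verify in code: reverse every arrow of the permutation and check that composing the two gives the identity. A self-contained Python sketch (the dict representation is my own choice for illustration):

```python
def cycle(*elems):
    """A cycle as a dict: each listed element maps to the next one."""
    return {a: elems[(i + 1) % len(elems)] for i, a in enumerate(elems)}

def compose(p, q):
    """Right-to-left composition: apply q first, then p."""
    return {x: p.get(q.get(x, x), q.get(x, x)) for x in set(p) | set(q)}

def inverse(p):
    """Reverse every arrow: if p maps a to b, the inverse maps b to a."""
    return {b: a for a, b in p.items()}

def is_identity(p):
    return all(p[x] == x for x in p)

sigma = cycle(1, 2, 3, 4, 5)
assert inverse(sigma) == cycle(1, 5, 4, 3, 2)
assert is_identity(compose(sigma, inverse(sigma)))
assert is_identity(compose(inverse(sigma), sigma))
```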
So far we’ve shown three facts:
- Composing two permutations gives you another permutation.
- There is an identity permutation that does nothing when composed.
- For every permutation, there is an inverse.
When these three facts are true for a set of permutations on n elements, we call it a permutation group. In particular, the set of all n! possible permutations on a set of n elements is known as the symmetric group Sn. I consider the permutation group foundational, but before we see why, let’s see more examples of groups.
In this section, we decompose permutations.
Every permutation can be described using cycle notation, a composition of cycles in which every element of the set we’re permuting appears exactly once. For example, the permutation (1 2 3) on the set {1,2,3,4,5} can be represented as (1 2 3)(4)(5), where (4) and (5) are both identity permutations that show that they are unaffected (“fixed”) by the permutation. This form is also known as a product of disjoint cycles.
You can always decompose any cycle of length 3 or more, like (a b c d e), into a product of 2-cycles, also known as transpositions. In general, this “decomposition into transpositions” is not unique. For instance:
(1 2 3 4 5)=(1 2)(2 3)(3 4)(4 5)
(1 2 3 4 5)=(1 5)(1 4)(1 3)(1 2)
(1 2 3 4 5)=(5 4)(5 2)(2 1)(2 5)(2 3)(1 3)
As you can see, each of these decompositions of the 5-cycle consists of an even number of transpositions. The first two forms are standard ways to decompose a cycle, while the third is just some random product of transpositions that happens to compose into (1 2 3 4 5).
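All three decompositions can be checked mechanically by composing the transpositions and comparing the results. A Python sketch (my own dict-based permutation encoding, for illustration only):

```python
from functools import reduce

def cycle(*elems):
    return {a: elems[(i + 1) % len(elems)] for i, a in enumerate(elems)}

def compose(*perms):
    """Right-to-left composition of any number of permutations."""
    def product(p, q):  # apply q first, then p
        return {x: p.get(q.get(x, x), q.get(x, x)) for x in set(p) | set(q)}
    return reduce(product, perms)

def normalize(p):
    """Drop fixed points so equal permutations compare equal as dicts."""
    return {a: b for a, b in p.items() if a != b}

target = cycle(1, 2, 3, 4, 5)
d1 = compose(cycle(1, 2), cycle(2, 3), cycle(3, 4), cycle(4, 5))
d2 = compose(cycle(1, 5), cycle(1, 4), cycle(1, 3), cycle(1, 2))
d3 = compose(cycle(5, 4), cycle(5, 2), cycle(2, 1), cycle(2, 5),
             cycle(2, 3), cycle(1, 3))
assert normalize(d1) == normalize(d2) == normalize(d3) == normalize(target)
```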
It is a theorem that odd-length cycles decompose into an even number of transpositions, and even-length cycles decompose into an odd number of transpositions. We’ll prove this for all permutations:
Lemma: The identity permutation (1) cannot be decomposed as a product of an odd number of transpositions.
Let’s express the identity as a product of transpositions (1)=τ1τ2…τn, and let a be an arbitrary element that is moved by one of the transpositions. This means that a appears in τi for some i. Let τi be the transposition containing the rightmost occurrence of a. Note that τi cannot ever be the leftmost τ1, since that would mean there is only one occurrence of a in the product, meaning the product doesn’t fix a and is therefore not the identity (1).
So let σ be the transposition to the left of τi. If σ doesn’t include a, then we can freely flip the ordering στi with τiσ and try again with σ being the new transposition to the left of τi. Eventually we’ll find some σ that contains a, since otherwise there would again be only one occurrence of a in the decomposition meaning that a is not fixed by the identity (1).
Then, using the fact that στi=στi(σ−1σ)=(στiσ−1)σ, we can continue moving τi (in the form στiσ−1) to the left of σ. There are two cases:
- στiσ−1 contains a. In this case, use that as the new τi and repeat with σ being the transposition to the left of the new τi.
- στiσ−1 doesn’t contain a. This can only happen if τi=σ, so our original στi is the identity, leaving us with a product that is length n−2. Then by induction on n we eventually obtain either n=0 or n=1. n=1 is not possible since the identity is not a transposition, therefore we end at n=0 by repeatedly subtracting 2 from n, showing that n can only be even.
Theorem: A permutation that decomposes into an even number of transpositions cannot decompose into an odd number of transpositions, and vice versa.
- Towards contradiction, assume that the permutation σ can be decomposed into both an even and an odd number of transpositions. Let σ=e1e2…en be the even decomposition and σ=o1o2…om be the odd decomposition.
- Then since (1)=σσ−1, we can write (1)=e1e2…enom−1…o2−1o1−1. Since this is an even number of transpositions composed with an odd number of transpositions, the total is an odd number of transpositions.
- But by the lemma above, (1) cannot be composed of an odd number of transpositions, contradiction.
Thus we can always refer to a permutation as either even or odd. If σ can be decomposed into an even number of transpositions it is an even permutation, otherwise it is an odd permutation.
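Since a k-cycle decomposes into k−1 transpositions, parity can be computed directly from cycle lengths. A Python sketch (dict encoding and the `parity` helper are illustrative choices, not standard library functions):

```python
def cycle(*elems):
    return {a: elems[(i + 1) % len(elems)] for i, a in enumerate(elems)}

def parity(perm):
    """0 for even, 1 for odd: each cycle of length k contributes k-1
    transpositions, so sum (length - 1) over all cycles."""
    seen, swaps = set(), 0
    for start in perm:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm.get(x, x)
            length += 1
        swaps += length - 1
    return swaps % 2

assert parity(cycle(1, 2, 3, 4, 5)) == 0  # odd-length cycle: even permutation
assert parity(cycle(1, 2)) == 1           # a single transposition is odd
```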
Recall that the symmetric group Sn is the set of all permutations on n elements where three facts hold:
- Composing two permutations gives you another permutation.
- There is an identity permutation that does nothing when composed.
- For every permutation, there is an inverse.
Note that if you take the set of all even permutations on n elements, these three facts still hold:
- Composing two even permutations gives you another even permutation.
- There is an (even) identity permutation that does nothing when composed.
- For every even permutation, there is an inverse even permutation.
This set is known as the alternating group An.
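The three facts can be verified for A4 by brute force. A Python sketch, encoding a permutation of {1,2,3,4} as a tuple p with p[i−1] the image of i (an illustrative encoding of my own, not standard notation):

```python
from itertools import permutations

def compose(p, q):  # right-to-left: apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v - 1] = i + 1
    return tuple(inv)

def parity(p):
    """Even permutations have an even number of inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return inv % 2

S4 = list(permutations(range(1, 5)))
A4 = [p for p in S4 if parity(p) == 0]
assert len(S4) == 24 and len(A4) == 12

# closure: the product of two even permutations is even
assert all(compose(p, q) in A4 for p in A4 for q in A4)
# the identity is even, and every even permutation has an even inverse
assert (1, 2, 3, 4) in A4
assert all(inverse(p) in A4 for p in A4)
```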
In this section, we look at characterizations of the cyclic group Cn.
Let’s look at permutations again.
The following set of permutations, {(1),(1 2 3 4),(1 3)(2 4),(1 4 3 2)}, is “cyclic” in the sense that “multiplying” any element by (1 2 3 4) will give you the next item in the set. For instance, (1 2 3 4)(1 3)(2 4)=(1 4 3 2), and (1 2 3 4)(1 4 3 2)=(1). You can also multiply by (1 4 3 2) to get the previous element instead.
Now let’s look at some different structures, which are similar in many ways:
The set of 4th roots of unity, {1,i,−1,−i}, is “cyclic” in the sense that multiplying any element by i will give you the next item in the set. For instance, i(−1)=−i, and i(−i)=1. You can also multiply by −i to get the previous element instead.
This set of integers mod 5, {1,2,4,3}, is “cyclic” in the sense that multiplying any element by 2 will give you the next item in the set. For instance, 2⋅4≡3 mod 5, and 2⋅3≡1 mod 5. You can also multiply by 3 to get the previous element instead.
This set of rotation matrices, {I, R, R², R³}, where R = [[0,−1],[1,0]] is the 90° rotation matrix (so R² = [[−1,0],[0,−1]] and R³ = [[0,1],[−1,0]]), is “cyclic” in the sense that multiplying any element by R will give you the next item in the set. For instance, R·R² = R³, and R·R³ = I. You can also multiply by R³ to get the previous element instead.
In general, these all correspond to the same set {1,g,g2,g3} where multiplication by g gives you the next item in the set, and g4=1 so multiplying by g3 gives you the previous item. Later we’ll recognize this as the cyclic group of order 4. A cyclic group is a group defined by a single generator g which, by taking all products and inverses, generates all the elements of the group. Order 4 means that there are 4 elements of the group.
Why 4 elements? Normally, taking all products and inverses of an arbitrary symbol g gives you the infinite set {…,g−3,g−2,g−1,e,g,g2,g3,…}. This cyclic group is known as the infinite cyclic group.
The cyclic group of order 4, however, is a finite cyclic group. All finite cyclic groups are essentially the infinite cyclic group together with a relation gn=e, in this case g4=e. Here, n is the order of the group.
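The three concrete realizations above can all be checked to satisfy g⁴ = e in code. A small Python sketch (the hand-rolled `matmul` helper keeps it dependency-free):

```python
# 4th roots of unity: powers of i cycle with period 4
i = 1j
roots = [i ** k for k in range(4)]
assert roots == [1, 1j, -1, -1j] and i ** 4 == 1

# integers mod 5: powers of 2 cycle with period 4
powers_mod5 = [pow(2, k, 5) for k in range(4)]
assert powers_mod5 == [1, 2, 4, 3] and pow(2, 4, 5) == 1

# 90-degree rotation matrices: R^4 is the identity
def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

I = [[1, 0], [0, 1]]
R = [[0, -1], [1, 0]]  # rotation by 90 degrees
P = I
for _ in range(4):
    P = matmul(P, R)
assert P == I  # R^4 = I, so {I, R, R^2, R^3} cycles with period 4
```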
In this section, we finally define groups as a generalization of the above.
In general, groups are sets G together with some product ⋅ defined that follows three rules:
- (identity) There is an identity element e∈G whose product with any element g∈G gives the same element g.
- (closure under product) Taking the product of two elements g,h∈G gives you another element in the set, denoted gh∈G.
- (closure under inverse) Every element g∈G has an inverse in the set, denoted g−1∈G, such that the product of any element with their inverse is the identity.
When these three are true, (G,⋅) is known as a group. For example, the integers under addition (Z,+) form a group. The following articles will dive into all aspects of groups, which I hope you’ll find interesting.
In this section, we outline some properties of groups.
Every group has a product ⋅ but we usually write g⋅h as just gh.
There are a couple properties we know just from the axioms of a group:
- The product of any g with the identity e is equal to g.
- The product of any g with its inverse g−1 is equal to e.
- The inverse of a product (gh)−1 is equal to h−1g−1.
Since (h−1g−1)(gh)=h−1g−1gh=h−1h=e, it follows that h−1g−1 and gh are inverses.
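These properties are easy to sanity-check exhaustively in a small group. A Python sketch verifying the identity, inverse, and (gh)−1 = h−1g−1 laws over all of S3 (tuple-encoded permutations are an illustrative choice):

```python
from itertools import permutations

def compose(p, q):  # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v - 1] = i + 1
    return tuple(inv)

e = (1, 2, 3)
S3 = list(permutations((1, 2, 3)))
for g in S3:
    assert compose(g, e) == g == compose(e, g)  # identity law
    assert compose(g, inverse(g)) == e          # inverse law
    for h in S3:
        # the inverse of a product reverses the order
        assert inverse(compose(g, h)) == compose(inverse(h), inverse(g))
```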
We will talk about groups in the abstract in the following explorations, where we will use these properties extensively.
-
November 7, 2023.
Exploration 1: Commutativity
Questions:
- What characterizes commutative elements of a group?
- What characterizes commutative subsets of a group?
- How do we make a group abelian?
- How do we make a group not abelian?
- What are quotient groups?
When taking products of group elements, like gh, the ordering matters. In other words, gh=hg is not true in general (though it always holds when g or h is the identity element e). We saw this with the symmetric group, where the ordering of composing permutations matters.
In this exploration, we’ll look at all the ways gh=hg can actually be true. That is, we’ll explore how we can actually make g commute with h.
In this section, we look at elements that commute with everything.
Every group has an identity element e, and the identity element always commutes with any element: eg=g=ge. In the worst case, the identity element is the only element that commutes with every element.
On the flipside, abelian groups are exactly those groups where every element commutes with every element. So at maximum, every element commutes with every element. The cyclic group is a good example of this.
When some given element g commutes with every element in the group, we say that g is central. The central elements of a group G are collectively referred to as the center of G, written Z(G). As discussed, the center is always in between the minimum “just the identity element” (a trivial center) and the maximum “every element” (in the case of abelian groups.)
Example: The group of integers under addition is abelian.
Integer addition is commutative, so g+h=h+g for all g,h∈Z.
Example: The symmetric group Sn on n≥3 elements has a trivial center.
- To be in the center, a permutation σ must commute with every permutation.
- In particular, σ must commute with (a b), so we have σ(a b)=(a b)σ, which implies σ either fixes a,b or swaps a,b.
- Using the same logic for (b c), σ either fixes b,c or swaps b,c. But σ can’t swap b,c since that would mean σ neither fixes nor swaps a,b (using the assumption that n≥3 so that a,b,c are distinct.) So the only option is that σ fixes a,b,c.
- Applying this argument inductively to all elements a,b,c,d,… we show that σ must fix every element in order to commute with all these 2-cycles. Therefore, σ can only be the identity permutation.
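The trivial-center claim can be brute-forced for a small case. A Python sketch computing Z(S3) directly (tuple-encoded permutations, an illustrative choice):

```python
from itertools import permutations

def compose(p, q):  # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

def center(group):
    """All elements that commute with every element of the group."""
    return [z for z in group
            if all(compose(z, g) == compose(g, z) for g in group)]

S3 = list(permutations((1, 2, 3)))
assert center(S3) == [(1, 2, 3)]  # only the identity permutation is central
```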
Example: The center of the group of invertible n×n matrices under matrix multiplication consists exactly of the scalar multiples of the identity matrix.
- Let Z be a central matrix, so that it commutes with every invertible matrix in the group.
- In particular, it commutes with the matrix E=I+Eij (where i≠j), which is the identity matrix except with a 1 at an off-diagonal entry at row i column j. For example:
E = [[1,0,0],[0,1,1],[0,0,1]] (for n=3, i=2, j=3)
This matrix E is a member of our group because it is invertible (it is triangular and therefore det(E) is the product of the diagonal, which is 1, thus invertible.)
- Note that ZE=Z(I+Eij)=Z+ZEij, where ZEij is zero everywhere except column j, which is a copy of column i of Z. So (ZE)ik = Zik for k≠j, while (ZE)ij = Zij+Zii. For our n=3, i=2, j=3 example:
ZE = Z + [[0,0,Z1,2],[0,0,Z2,2],[0,0,Z3,2]]
Similarly, EZ=(I+Eij)Z=Z+EijZ, where EijZ is zero everywhere except row i, which is a copy of row j of Z. So (EZ)kj = Zkj for k≠i, while (EZ)ij = Zij+Zjj. For the same example:
EZ = Z + [[0,0,0],[Z3,1,Z3,2,Z3,3],[0,0,0]]
- Since ZE=EZ (Z is in the center), equate the two at index i,j to obtain (ZE)ij = Zij+Zii = Zij+Zjj = (EZ)ij, hence Zii=Zjj for all i≠j. This implies all the diagonal entries of Z are equal.
- As for the off-diagonal entries: for k≠i, (ZE)kj=(EZ)kj implies Zkj+Zki=Zkj+0, so Zki=0; and for k≠j, (ZE)ik=(EZ)ik implies Zik+0=Zik+Zjk, so Zjk=0. Ranging over all choices of i≠j, every off-diagonal entry of Z must be zero, i.e. Z is diagonal.
- Since Z is diagonal with every diagonal entry equal, Z must be a scalar multiple of the identity matrix λI. So every central matrix must be in the form λI.
- Conversely, all matrices in the form λI are central because (λI)M=λMI=M(λI) for an arbitrary matrix M.
- Therefore the center of the group of invertible matrices consists of exactly the scalar multiples of the identity matrix.
In this section, we learn about subsets that commute with everything.
Being a central element is a rather strong property. A central element must commute with every element.
Instead of considering single elements that commute with all elements in a group, consider subsets that commute with all elements in a group. Let S⊆G be a subset of G, and let Sg (resp. gS) represent the result of right-multiplying (resp. left-multiplying) all of S by some g∈G. Then if Sg=gS for all g∈G, we can characterize S as something like being a “central” subset of G. How do its properties differ from those of central elements?
Notice that since Sg=gS implies S=gSg−1 for all g∈G, S has the property that for any of its elements s∈S, then the element gsg−1 is also in S, for every g∈G.
This map s↦gsg−1 is called conjugation by g. When t=gsg−1, we say that t is a conjugate of s by g.
So our earlier property that Sg=gS for all g∈G can be rewritten as S=gSg−1 for all g∈G. Since the idea is that S is left unchanged after conjugation by arbitrary g, we call this property invariance under conjugation. Specifically, a set S or element x is invariant under conjugation when conjugation by every g∈G leaves S or x unchanged, i.e. gSg−1=S or gxg−1=x.
Theorem: The central elements are exactly the elements invariant under conjugation.
- z is central iff zg=gz for all g∈G.
- z is invariant under conjugation iff z=gzg−1 for all g∈G.
- But again, zg=gz and z=gzg−1 imply each other, so these conditions are equivalent.
Theorem: The subsets S that commute with every g∈G are exactly those that are invariant under conjugation.
- To commute with every g∈G, S must satisfy Sg=gS for all g∈G.
- To be invariant under conjugation, S must satisfy S=gSg−1 for all g∈G.
- But Sg=gS and S=gSg−1 imply each other, so these conditions are equivalent.
Corollary: The union of two subsets S,T invariant under conjugation is also invariant under conjugation.
Since group product distributes over set union, we have (S∪T)g=Sg∪Tg for all g∈G. But since S,T are invariant under conjugation, we have from the previous theorem that both S,T commute with every element g∈G, and therefore (S∪T)g=Sg∪Tg=gS∪gT=g(S∪T) implying that S∪T commutes with everything and is therefore invariant under conjugation.
So if we want to study subsets that are “central”, we can think about constructing subsets that are invariant under conjugation. The obvious way to create such subsets is to take every conjugate of some element g∈G to arrive at the subset [g]. Since every conjugate of g is in [g], [g] must be invariant under conjugation. We can prove that more rigorously:
Theorem: The set [g] of all conjugates of g∈G is invariant under conjugation.
- Since every element in [g] is a conjugate of g by some element h∈G, we can represent every element of [g] as hgh−1.
- To show that conjugating this element (by arbitrary k∈G) gives an element in [g], note that k(hgh−1)k−1=(kh)g(kh)−1∈[g], i.e. conjugating an arbitrary element hgh−1∈[g] by an arbitrary element k∈G results in an element (kh)g(kh)−1 in [g], thus [g] is invariant under conjugation.
Thus every element g∈G gives rise to a subset [g]⊆G that contains exactly all the conjugates of g, which automatically makes it invariant under conjugation.
In this section, we discuss equivalence relations.
Imagine doing this process for every element in G, so that for every element g∈G you can identify a subset [g] that g belongs to.
In fact, every element g belongs to exactly one subset [g]. To see this, note that if g∈[g] was also in a second subset [h], then g must be a conjugate of h. But since [g],[h] must both be invariant under conjugation (as we just proved), every conjugate of g must also be in [h] (so [g]⊆[h]), and every conjugate of h must also be in [g] (so [h]⊆[g].) This implies [g]=[h].
Then the subsets [g] collectively partition the set G. A partition of a set is defined as any grouping of all elements into subsets such that each element belongs to exactly one subset.
The reason that conjugation gives rise to a partition is due to three essential properties:
- Conjugation is transitive: if a,b are conjugate by h, and b,c are conjugate by k, then a,c are conjugate by kh. (We actually proved this earlier.) This ensures that if g∈[h], then g is conjugate to h and everything h is conjugate to, thus [h]⊆[g].
- Conjugation is symmetric: if g is conjugate to h, then h is conjugate to g. This ensures that in the previous scenario, h is also conjugate to everything g is conjugate to, thus [g]⊆[h]. Together with the previous result, this implies [g]=[h]. Thus g∈[h] implies [g]=[h], meaning g cannot be in more than one distinct subset.
- Conjugation is reflexive: every element is conjugate to itself by e. This ensures that every element belongs to some subset, i.e. every element g belongs to at least one subset. Together with the previous result, this means every element belongs to exactly one subset.
So any relation that is transitive, symmetric, and reflexive gives rise to a partition of the underlying set into these subsets [g]. Such a relation is called an equivalence relation, often denoted ∼. The partitions [g] are called equivalence classes, each containing all elements equivalent to its representative g by the given equivalence relation ∼.
Like all equivalence relations, conjugation partitions the group into equivalence classes, called conjugacy classes. Each conjugacy class [g] contains all elements hgh−1 conjugate to g.
In this section, we learn how to determine the size of conjugacy classes.
The size of a conjugacy class ∣[g]∣ is the number of distinct elements in the form hgh−1 for some h∈G. How do we determine the number of distinct conjugates of g?
To think about this problem, imagine taking the conjugate of g by every element in G, so you have potentially ∣G∣ conjugates of g. Whenever two conjugates of g are equal, aga−1=bgb−1, then they are two ways to write the same conjugate. Note that this condition aga−1=bgb−1 is trivially an equivalence relation a∼b, since it’s based on equality:
- Reflexivity: aga−1=aga−1.
- Symmetry: aga−1=bgb−1 implies bgb−1=aga−1.
- Transitivity: if aga−1=bgb−1 and bgb−1=cgc−1, then aga−1=cgc−1.
NB: This ∼ is a different equivalence relation than the one we gave for conjugacy.
Being an equivalence relation, ∼ divides G into equivalence classes, where two elements a,b are in the same equivalence class iff conjugating g by a and by b results in the same element. Since each equivalence class represents a distinct conjugate of g, the number of distinct conjugates of g is equal to the number of equivalence classes.
So, how do we find the number of equivalence classes?
Here is the key insight: the equality aga−1=bgb−1 can be rewritten as (b−1a)g=g(b−1a) meaning two conjugates of g are equal every time the element b−1a commutes with g. Let CG(g) denote the set of such elements in G that commute with g, called the centralizer of g. Then the condition aga−1=bgb−1 simplifies to the condition b−1a∈CG(g), which further simplifies to a∈bCG(g). This directly characterizes the equivalence classes: every equivalence class [a] under ∼ is in the form bCG(g) for some b∈G.
There is a bijection between any two equivalence classes bCG(g) and aCG(g). Simply left-multiply bCG(g) by ab−1 to obtain aCG(g). This is a bijection because there exists an inverse: left-multiplying by ba−1. Why is this important? Because the existence of a bijection between every equivalence class implies that every equivalence class has the same size: ∣CG(g)∣.
If the size of every equivalence class is ∣CG(g)∣, then the number of equivalence classes is ∣G∣/∣CG(g)∣. Since each equivalence class represents a distinct conjugate of g, we have obtained the number of distinct conjugates of g: ∣[g]∣=∣G∣/∣CG(g)∣
Thus the size of the conjugacy class [g] is the order of the group divided by the size of g’s centralizer. This reduces the problem to finding the size of a centralizer. For example:
Example: The centralizer of every element in an abelian group is the group itself.
Since everything commutes with everything in an abelian group, the centralizer of every element consists of the whole group.
Corollary: Abelian groups are exactly the groups where every conjugacy class is of size ∣G∣/∣CG(g)∣=1.
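The formula ∣[g]∣ = ∣G∣/∣CG(g)∣ can also be verified exhaustively for a small non-abelian group. A Python sketch over S4 (tuple-encoded permutations, an illustrative choice):

```python
from itertools import permutations

def compose(p, q):  # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v - 1] = i + 1
    return tuple(inv)

S4 = list(permutations((1, 2, 3, 4)))
for g in S4:
    conj_class = {compose(compose(h, g), inverse(h)) for h in S4}  # all hgh^-1
    centralizer = [h for h in S4 if compose(h, g) == compose(g, h)]
    # size of the conjugacy class = |G| / |centralizer|
    assert len(conj_class) == len(S4) // len(centralizer)
```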
We can also do this for permutations and matrices, but please skip these if the domain is not familiar to you!
Example: The centralizer of a permutation σ in Sn consists of all permutations that preserve the cycle structure of σ, of which there are ∏j j^Nj · Nj! (where Nj denotes the number of disjoint cycles of length j in σ.)
Express σ in disjoint cycle notation so that we can take an arbitrary cycle (a1 a2 … an).
If τ is in the centralizer of σ, then it commutes with σ. In particular, for each ai, we have τ(σ(ai))=σ(τ(ai)), which simplifies to τ(ai+1)=σ(τ(ai)). Because this equation implies that σ takes τ(ai) to τ(ai+1), we know that τ(ai) and τ(ai+1) must be in the same cycle. Using the same argument with τ−1 (which also commutes with σ), we know τ−1(ai) and τ−1(ai+1) must be in the same cycle.
If τ takes elements in the same cycle to the same cycle, and the preimage τ−1 also takes elements of the same cycle to the same cycle, then any τ in the centralizer of σ simply permutes the elements within each cycle of σ. This means each cycle in σ is mapped to a cycle of the same length, thus preserving the cycle structure of σ.
The number of such τ is the product, over the cycle lengths j, of the number of ways to map the cycles of length j among themselves: there are Nj! ways to permute the Nj cycles of length j, and j independent rotations for each such cycle. So if there are Nj cycles of length j in σ, then the number of τ (the size of the centralizer of σ) is equal to ∣CSn(σ)∣ = ∏j j^Nj · Nj!
Corollary: The size of the conjugacy class of a permutation σ in Sn is ∣G∣/∣CG(σ)∣ = n!/∏j j^Nj · Nj! (where Nj denotes the number of disjoint cycles of length j in σ.)
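The centralizer-size formula can be checked against brute force for every permutation of a small symmetric group. A Python sketch over S5 (all helper names are my own illustrative choices):

```python
from itertools import permutations
from math import factorial, prod
from collections import Counter

def compose(p, q):  # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

def cycle_lengths(p):
    """Lengths of the disjoint cycles of p (fixed points are 1-cycles)."""
    seen, lengths = set(), []
    for start in range(1, len(p) + 1):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = p[x - 1]
            length += 1
        lengths.append(length)
    return lengths

S5 = list(permutations((1, 2, 3, 4, 5)))
for sigma in S5:
    brute = sum(1 for tau in S5 if compose(tau, sigma) == compose(sigma, tau))
    N = Counter(cycle_lengths(sigma))  # N[j] = number of j-cycles
    formula = prod(j ** Nj * factorial(Nj) for j, Nj in N.items())
    assert brute == formula
```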
Example: The centralizer of a diagonalizable matrix M in the group of invertible n×n matrices under matrix multiplication consists of all invertible matrices block-diagonalizable in the same eigenbasis as M, of which there are ∏j ∣GL_nj(F)∣ (where nj denotes the multiplicity of the jth distinct eigenvalue, and GL_nj(F) denotes the group of invertible nj×nj matrices over the field F that the matrices are defined over.)
- A diagonalizable matrix M is one that can be represented as M=PDP−1 where D is a diagonal matrix and the columns of P form an eigenbasis of M. To find the centralizer, we’re trying to find all invertible matrices A that commute with PDP−1, i.e. APDP−1=PDP−1A
- Let A=PBP−1 for some B. Then we have PBP−1PDP−1=PDP−1PBP−1 which simplifies to BD=DB
- Thus B must commute with the diagonal matrix D. By definition of matrix multiplication, (BD)ij must be equal to ∑kBikDkj=BijDjj, since D is diagonal meaning Dkj is zero everywhere except when k=j. Likewise, (DB)ij=∑kDikBkj=DiiBij. Since BD=DB we get (BD)ij=(DB)ij⟹BijDjj=DiiBij⟹BijDjj=BijDii.
- BijDjj=BijDii is trivially true if i=j. For i≠j, it means that Bij is nonzero only if Dii=Djj. This means B has nonzero entries only within blocks where the eigenvalues Dii are equal. In other words, B is block-diagonal, with each block corresponding to the eigenspace of a distinct eigenvalue.
- B is block-diagonal, so A=PBP−1 is block-diagonalizable in the same eigenbasis P of M. Therefore, the matrices A that commute with M are exactly the ones block-diagonalizable in the same eigenbasis of M.
- Thus the size of the centralizer of M can be computed as the product of the number of possible matrices for each block in the block diagonal form of B.
- Each block must itself be invertible: A=PBP−1 is invertible iff B is invertible, and a block-diagonal matrix is invertible iff each of its blocks is. So each nj×nj block can be any element of GL_nj(F), of which there are ∏_{i=0}^{nj−1} (∣F∣^nj − ∣F∣^i). Taking the product over blocks, the size of the centralizer is ∏j ∣GL_nj(F)∣, where nj denotes the multiplicity of the jth distinct eigenvalue and ∣F∣ denotes the size of the field that the matrices are defined over.
Corollary: The size of the conjugacy class of a diagonalizable matrix M in the group of invertible n×n matrices under matrix multiplication is ∣G∣/∣CG(M)∣ = (∏_{i=0}^{n−1} (∣F∣^n − ∣F∣^i)) / (∏j ∣GL_nj(F)∣) (where nj denotes the multiplicity of the jth distinct eigenvalue, and ∣F∣ denotes the size of the field that the matrices are defined over.)
In this section, we learn about the relationship between conjugacy classes and central elements.
Here’s an easy-to-prove fact:
Theorem: Central elements are exactly the elements that are invariant under conjugation.
Being central means gz=zg for every g∈G, and being invariant under conjugation means z=gzg−1 for every g∈G. But gz=zg⟺z=gzg−1.
Corollary: The conjugacy class [z] of a central element z is always a singleton set, and the representative z of a singleton conjugacy class [z] is a central element.
The conjugacy class of z contains all elements conjugate to z. But since z is invariant under conjugation, as we just proved, the only element conjugate to z is z itself. Therefore z is in a singleton conjugacy class.
Conversely, if z is in a singleton conjugacy class, it means only z is conjugate to z, therefore z is invariant under conjugation, therefore central.
In particular, the identity element e (which is always central) is always in a singleton conjugacy class.
An important result that arises from this is that we can split a group’s conjugacy classes into two types: the central conjugacy classes (which are the singleton conjugacy classes) and the non-central conjugacy classes. This relationship is described below:
∣G∣ = ∣Z(G)∣ + ∑i ∣[gi]∣
where ∣[gi]∣ denotes the size of the ith non-central conjugacy class. Using what we learned earlier, we can rewrite this as
∣G∣ = ∣Z(G)∣ + ∑i ∣G∣/∣CG(gi)∣
The result above is known as the class equation of a group, and is used in many number-theoretic proofs about the center. For example:
Theorem: Every group with prime power order (a p-group) has a non-trivial center, whose order is divisible by p.
- p-groups are of prime power order, so ∣G∣=p^n where n≥1.
- Using the fact that ∣[gi]∣=∣G∣/∣CG(gi)∣, we know that the size of the conjugacy class of gi must be a factor of ∣G∣, i.e. it must be a prime power p^ki where ki≤n.
- Then we can write the class equation as p^n = ∣Z(G)∣ + ∑i p^ki. Since the LHS is divisible by p, so must be the RHS. The sum ∑i p^ki is divisible by p, because each of its terms is divisible by p (each ki≥1, since non-central conjugacy classes have size greater than 1). Then in order for the RHS as a whole to be divisible by p, ∣Z(G)∣ must also be divisible by p.
- But if ∣Z(G)∣ is divisible by p then it is not 1, therefore the center is non-trivial.
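The argument above can be checked computationally. Below is a minimal Python sketch (my own construction, not from the notes) that builds the dihedral group D4 — a 2-group of order 8 — out of permutation tuples, then verifies both the class equation and the non-trivial center; the helper names compose, inverse, and generate are assumptions of this sketch.

```python
def compose(p, q):
    """(p ∘ q)(i) = p(q(i)); a permutation is a tuple with p[i] = image of i."""
    return tuple(p[j] for j in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def generate(gens):
    """Close a list of generating permutations under composition."""
    e = tuple(range(len(gens[0])))
    group, frontier = {e}, [e]
    while frontier:
        new = []
        for g in frontier:
            for x in gens:
                h = compose(g, x)
                if h not in group:
                    group.add(h)
                    new.append(h)
        frontier = new
    return group

# D4: symmetries of a square, acting on vertices 0..3
r = (1, 2, 3, 0)   # rotation by 90 degrees
s = (0, 3, 2, 1)   # a reflection
G = generate([r, s])

center = {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}

# partition G into conjugacy classes, checking |[g]| = |G| / |C_G(g)|
classes, seen = [], set()
for g in sorted(G):
    if g not in seen:
        cls = {compose(compose(a, g), inverse(a)) for a in G}
        seen |= cls
        classes.append(cls)
        centralizer = {h for h in G if compose(h, g) == compose(g, h)}
        assert len(cls) == len(G) // len(centralizer)

noncentral_sizes = [len(c) for c in classes if len(c) > 1]
# the class equation: |G| = |Z(G)| + sum of non-central class sizes
assert len(G) == len(center) + sum(noncentral_sizes)
# D4 is a 2-group, so its center is non-trivial with order divisible by 2
assert len(center) > 1 and len(center) % 2 == 0
```

For D4 the class equation reads 8 = 2 + (2 + 2 + 2): a center of order 2 plus three non-central classes of size 2.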
In this section, we learn about subgroups.
A subgroup of a group G is a subset of G that is also a group. In other words, it is a subset of G that includes identity and is closed under product and inverse.
Every element g generates a subgroup ⟨g⟩ by taking powers of itself: ⟨g⟩={…,g−3,g−2,g−1,e,g,g2,g3,…}. We already know this as the cyclic group generated by g. You can also generate subgroups from multiple elements by taking all products and inverses: ⟨g,h⟩={e,g,h,g2,gh,hg,h2,g3,…}
In fact, some of the subsets we’ve been working with are actually subgroups, as proven below:
Theorem: The center Z(G) is a subgroup of G.
- The identity e is always central, so e∈Z(G).
- The product of two central elements g1,g2∈Z(G) is central.
- Proof: (g1g2)h=g1hg2=h(g1g2) for all h∈G, thus g1g2∈Z(G).
- The inverse of a central element g∈Z(G) is central.
- Proof: gh=hg implies g−1ghg−1=g−1hgg−1, i.e. hg−1=g−1h for all h∈G, thus g−1∈Z(G).
- Thus Z(G) forms a group, and is a subgroup of G.
Theorem: The centralizer CG(S) is a subgroup of G.
- Recall that to be in the centralizer, every g∈CG(S) must commute with S.
- CG(S) has identity: Clearly e∈CG(S) because eS=S=Se.
- CG(S) has inverses: If g∈CG(S) then for all s∈S we have gs=sg, which is the same as saying sg−1=g−1s, therefore g−1∈CG(S).
- CG(S) has products: If g,h∈CG(S), then for all s∈S we have gs=sg and hs=sh. Then (gh)s=g(hs)=g(sh)=(gs)h=(sg)h=s(gh), thus gh∈CG(S).
- Since it has identity, inverses, and products, the centralizer CG(S) is always a subgroup of G.
Theorem: The intersection of two subgroups is a subgroup of both.
- Both subgroups contain e, so their intersection contains e.
- Both subgroups are closed under product, so their intersection is closed under product. To make this more clear: the product of two elements contained in both subgroups must remain contained in both subgroups, by definition of subgroup.
- Both subgroups are closed under inverses, so their intersection is closed under inverses.
- Since the intersection is a subset of both given subgroups, contains identity, and is closed under product and inverses, it is a subgroup of both.
Since the center is the intersection of all centralizers of a group, the above theorem provides an alternate proof that the center is a subgroup (using the fact that all centralizers are subgroups.)
In this section, we make a group the least abelian it can be.
To make a group the least abelian it can be, we want it to have the smallest possible center. A group whose center is trivial is called centerless.
One way to make the center of a group smaller is to map every central element z∈Z(G) to the identity element e. Let π be this map.
The problem is, if az=b, then when we map z to e we get a=b under π. So when defining π, not only do we need to map central elements to e, we need to collapse elements differing by some z into the same element. How do we achieve this definition? Equivalence relations!
Say a∼b when a,b differ by a central element, so that a,b are in the same equivalence class exactly when they differ by a central element (and therefore a=b under π.) Then as usual, ∼ partitions G into equivalence classes [g], and as each class represents a distinct element under π, we can define π as sending every element g∈G to its corresponding equivalence class π(g). This accomplishes the goal of mapping central elements to an identity element (the equivalence class of e), while collapsing elements differing by some central elements to the same equivalence class.
Well, that’s the plan, anyways. We don’t yet have any idea what the structure of π(g) is, much less a guarantee that it’s a group.
To ensure it’s a group, note that for a,b to differ by a central element, we can say az=b for some z∈Z(G). Let’s shorten that condition to b∈aZ(G). To explain: aZ(G) contains every element that differs from a by some central element, and b is one of them iff a,b differ by some central element.
But the equivalence relation a∼b⟺b∈aZ(G) just checks for set membership. In other words, we have essentially b∈[a]⟺b∈aZ(G), meaning that we can identify aZ(G) itself as the equivalence class [a]! Thus the following manipulations are valid: g[a]=g(aZ(G))=(ga)Z(G)=[ga] [a][b]=aZ(G)bZ(G)=(ab)Z(G)=[ab] [g]−1=Z(G)−1g−1=g−1Z(G)=[g−1] Note that this is only possible since Z(G) commutes with every element by definition of being the center, and that Z(G) is a subgroup of G: Z(G)Z(G)=Z(G) because subgroups are closed under product and Z(G)−1=Z(G) because subgroups are closed under inverse.
Armed with these manipulations and the knowledge that π(a)=[a]=aZ(G), let’s ensure that the image of π is indeed a group.
- Identity: π(e)=[e] satisfies the identity laws [g][e]=[ge]=[g]=[eg]=[e][g].
- Closed under product: π(g)π(h)=[g][h]=[gh]=π(gh).
- Closed under inverses: π(g)−1=[g]−1=[g−1]=π(g−1).
So im π is indeed a group!
Above, we used a number of mechanisms to arrive at a new group by sending the center to e. Here’s a summary:
- First, we defined an equivalence relation predicated on differing by a central element.
- We used this equivalence relation to partition the group into equivalence classes [g].
- By expressing the equivalence relation as set membership, we found that each of the equivalence classes [g] is exactly the set gZ(G).
- Using properties of the center, we proved that the set of all gZ(G) for all g∈G form a group.
This overall operation, of sending a subgroup to e to obtain a new group, is called quotienting the group G by the subgroup Z(G). The resulting group of equivalence classes is known as a quotient group, and in this case, it is denoted G/Z(G). (reads as “G mod the center of G.”)
This quotient group gives rise to one of the most efficient ways to tell if a group is abelian.
Theorem: G is abelian iff G/Z(G) is cyclic. If G is abelian, then G=Z(G) by definition, so G/Z(G) sends every element of G=Z(G) to the identity e. Thus G/Z(G) is trivial, therefore cyclic. To show the other direction, assume G/Z(G) is cyclic, generated by some generating equivalence class [g]=gZ(G).
Every element of the cyclic group G/Z(G) is expressible as a power of the generator (gZ(G))k=gkZ(G). Each of these equivalence classes gkZ(G) consists of products of gk with each element in Z(G). Since the equivalence classes cover the entire group G, every element of G is expressible as some element gkz where z∈Z(G).
But then two arbitrary elements gk1z1,gk2z2 must commute via (gk1z1)(gk2z2)=z2gk1gk2z1=z2gk1+k2z1=z2gk2gk1z1=(gk2z2)(gk1z1) thus G is abelian.
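As a concrete companion to this theorem, here is a Python sketch (helpers and names are my own, not from the notes) that computes D4/Z(D4): the quotient has order 4 and every coset squares to the identity coset, so it is the Klein four-group rather than cyclic — consistent with D4 being non-abelian.

```python
def compose(p, q):
    """(p ∘ q)(i) = p(q(i)); a permutation is a tuple with p[i] = image of i."""
    return tuple(p[j] for j in q)

def generate(gens):
    """Close a list of generating permutations under composition."""
    e = tuple(range(len(gens[0])))
    group, frontier = {e}, [e]
    while frontier:
        new = []
        for g in frontier:
            for x in gens:
                h = compose(g, x)
                if h not in group:
                    group.add(h)
                    new.append(h)
        frontier = new
    return group

# D4, a non-abelian group whose center is {e, r^2}
G = generate([(1, 2, 3, 0), (0, 3, 2, 1)])
Z = {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}

# the cosets gZ are the elements of the quotient G/Z(G)
cosets = {frozenset(compose(g, z) for z in Z) for g in G}
assert len(cosets) == len(G) // len(Z) == 4

# every coset squares to the identity coset Z itself, so no single coset
# can generate all four cosets: G/Z(G) is the Klein four-group, not cyclic
identity_coset = frozenset(Z)
all_squares_trivial = all(
    frozenset(compose(compose(g, g), z) for z in Z) == identity_coset
    for g in G)
is_abelian = all(compose(g, h) == compose(h, g) for g in G for h in G)
assert all_squares_trivial and not is_abelian  # matches the theorem
```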
In this section, we formalize quotient groups.
In general, given a subgroup H≤G, we can attempt to find the quotient group G/H by doing the following:
Since we want to send all elements of H to e, we want to consider two elements equivalent if they differ by any element h∈H. The idea is that after mapping H to e, every pair of elements that differ by an element in H now differ by e, and are therefore the same element. Define ∼ so that a∼b whenever a,b differ by an element in H. Like before, we can express this relation more succinctly: a∼b ⟺ a,b differ by some h∈H ⟺ ah=b for some h∈H ⟺ h=a−1b for some h∈H ⟺ a−1b∈H ⟺ b∈aH, the idea being that aH contains all elements that differ from a by a factor of some h∈H, and if b is one of them, then a,b differ by some h as required.
Is ∼ a equivalence relation? Let’s see:
- Reflexivity: a∈aH since H contains e, and therefore aH contains ae=a.
- Symmetry: Given b∈aH, we know a∈bH since b∈aH ⟺ a−1b∈H ⟺ b−1a∈H (H is closed under inverse) ⟺ a∈bH.
- Transitivity: Given b∈aH and c∈bH, we will show c∈aH: b∈aH and c∈bH ⟺ a−1b∈H and b−1c∈H ⟹ (a−1b)(b−1c)∈H (H is closed under product) ⟺ a−1c∈H ⟺ c∈aH.
Since ∼ is an equivalence relation, we can use it to partition G into equivalence classes [g]. Like before, we note that b∈[a] iff a∼b iff b∈aH, so each equivalence class [a] is exactly aH.
Now, do these equivalence classes form a group? Let’s see:
- Contains identity: [e] satisfies [e][g]=[eg]=[g]=[ge]=[g][e], and is therefore the identity.
- Contains product: We need to show that the product [a][b]=(aH)(bH)=aHbH yields another equivalence class.
Here we encounter a problem: unlike the center Z(G), the subgroup H doesn’t commute with all elements, and therefore aHbH is not equal to abH in general.
Recall that a subset S of G commutes with all elements of G exactly when S is invariant under conjugation. So we need the subgroup H to be invariant under conjugation, and then we can proceed with showing that the equivalence classes aH form a group by letting H commute with every element.
Therefore: If H≤G is invariant under conjugation, then the equivalence classes of G under the relation a∼b iff b∈aH form a group.
- Contains identity: [e] satisfies [e][g]=[eg]=[g]=[ge]=[g][e], and is therefore the identity.
- Contains product: [a][b]=(aH)(bH)=a(Hb)H=a(bH)H=abHH=abH=[ab], using the fact that Hb=bH since H is invariant under conjugation, and HH=H by closure under product in H. So the product of two equivalence classes [a],[b] is the equivalence class [ab].
- Contains inverse: [g]−1=(gH)−1=H−1g−1=Hg−1=g−1H=[g−1], using H−1=H by closure under inverse in H, and Hg−1=g−1H by invariance under conjugation. So the inverse of [g] is the equivalence class [g−1].
Theorem: If a subgroup H has unique order, then it is normal. First, note that the conjugate of a subgroup gHg−1 is a subgroup of the same order:
- Contains identity: e=geg−1∈gHg−1
- Absorbs product: (gHg−1)(gHg−1)=gHHg−1=gHg−1
- Absorbs inverse: (gHg−1)−1=gH−1g−1=gHg−1
- Same order: the map h↦ghg−1 is a map H→gHg−1 and is a bijection since it has an inverse h↦g−1hg. Thus ∣H∣=∣gHg−1∣.
Since the conjugate of a subgroup must have the same order but H has unique order, H must be invariant under conjugation, thus normal.
When a subgroup H≤G is invariant under conjugation, we call it a normal subgroup, and denote it by H⊲G. And as we’ve shown, sending the elements of H to e forms a quotient group G/H if and only if H is a normal subgroup. The elements of every quotient group, the equivalence classes aH, are called cosets of H. So the above result can be written as:
Theorem: H⊲G iff the cosets of H form a quotient group G/H.
This enshrines the importance of normal subgroups – they are exactly the subgroups H that you can send to e to construct a quotient group. We already know of such a subgroup: the center of a group Z(G) is always a normal subgroup of G. This makes sense since Z(G) is made up of elements that commute with every element in G, so it must be invariant under conjugation. In fact any element of the center must generate a normal subgroup:
Theorem: Central elements generate normal subgroups.Given that central elements z∈Z(G) are invariant under conjugation, it follows that any product of a central element with itself is also invariant under conjugation. This means the subgroup ⟨z⟩ generated by z is necessarily invariant under conjugation in G, and therefore a normal subgroup of G.
Given H a subgroup, aH is a coset even if H is not a normal subgroup — being a normal subgroup just means the cosets form a group. Nevertheless, the fact that you can make cosets out of any subgroup leads directly to a very important theorem:
Lagrange’s Theorem: The order of every subgroup divides the order of the group. For any subgroup H≤G, consider its cosets aH. There is a bijection between any two cosets aH and bH: left-multiply aH by ba−1 to get bH. It’s a bijection because the inverse is left-multiplying by ab−1. The existence of a bijection between every pair of cosets implies that every coset has the same size, and in particular they are all the same size as eH=H, which is ∣H∣.
Since cosets are equivalence classes, they form a partition of the group G. But since this partitions G into equally-sized equivalence classes of size ∣H∣, ∣H∣ must divide ∣G∣.
Corollary: The order of G/H is ∣G∣/∣H∣. We showed earlier that the cosets partition G. Since the cosets are all the same size, ∣H∣, the number of such cosets must be ∣G∣/∣H∣. But the cosets are exactly the elements of the quotient G/H, so the order of G/H is ∣G∣/∣H∣.
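Lagrange’s theorem is easy to verify by brute force on a small group. The sketch below (my own construction, not from the notes) enumerates every subgroup of S3 and checks that its cosets partition the group into equal-size blocks:

```python
from itertools import combinations, permutations

def compose(p, q):
    """(p ∘ q)(i) = p(q(i)); permutations stored as tuples."""
    return tuple(p[j] for j in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = set(permutations(range(3)))   # S3, order 6
e = (0, 1, 2)

def is_subgroup(H):
    # contains identity and is closed under a·b⁻¹ (hence product and inverse)
    return e in H and all(compose(a, inverse(b)) in H for a in H for b in H)

subgroups = [set(H) for k in range(1, len(G) + 1)
             for H in combinations(sorted(G), k) if is_subgroup(set(H))]

for H in subgroups:
    assert len(G) % len(H) == 0                       # Lagrange's Theorem
    cosets = {frozenset(compose(a, h) for h in H) for a in G}
    assert len(cosets) == len(G) // len(H)            # |G|/|H| cosets
    assert all(len(c) == len(H) for c in cosets)      # equal-size blocks

subgroup_orders = sorted(len(H) for H in subgroups)   # [1, 2, 2, 2, 3, 6]
```

Every subgroup order (1, 2, 3, 6) divides ∣S3∣=6, as the theorem demands.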
We proved earlier that conjugacy classes are subsets of G that are invariant under conjugation. Conjugacy classes have a close relationship with normal subgroups:
Theorem: Normal subgroups are exactly subgroups that are a union of conjugacy classes. It is easy to see that if a subgroup is a union of conjugacy classes, then it is normal, since union preserves invariance under conjugation.
To show the converse, let H⊲G be a normal subgroup of G and consider an arbitrary element h∈H. Since H must be invariant under conjugation, conjugating h must give another element in H. That is, H contains all conjugates of h, which means H contains the conjugacy class [h]. Since this is true for arbitrary h∈H, H fully contains the conjugacy class of each of its elements, meaning H must be a union of conjugacy classes.
This theorem gives us a method to determine whether a subgroup is normal beyond checking for invariance under conjugation. All we need to do is find the conjugacy classes — then you can obtain every normal subgroup of the group by taking unions of conjugacy classes, and checking which unions form a subgroup.
For instance:
Example: {e}, Z(G), and G are always normal subgroups of G. We already know these three are subgroups of G. To prove that they are normal, note that they are all unions of conjugacy classes:
- e is central and therefore is only conjugate to itself, so it is in a singleton conjugacy class.
- Z(G) consists of central elements and by the same argument, is a union of singleton conjugacy classes.
- G trivially contains all the conjugacy classes in G.
Since all three subgroups are unions of conjugacy classes, they must be normal.
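The method just described can be automated. This illustrative Python sketch (all helper names are my own) finds every normal subgroup of S3 by taking unions of its conjugacy classes and keeping the ones that form subgroups:

```python
from itertools import combinations, permutations

def compose(p, q):
    """(p ∘ q)(i) = p(q(i)); permutations stored as tuples."""
    return tuple(p[j] for j in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = set(permutations(range(3)))   # S3
e = (0, 1, 2)

# conjugacy classes of S3: {e}, the three transpositions, the two 3-cycles
classes, seen = [], set()
for g in sorted(G):
    if g not in seen:
        cls = frozenset(compose(compose(a, g), inverse(a)) for a in G)
        seen |= cls
        classes.append(cls)

def is_subgroup(H):
    return e in H and all(compose(a, inverse(b)) in H for a in H for b in H)

# a union of conjugacy classes that is a subgroup is a normal subgroup
normal_subgroups = []
for k in range(1, len(classes) + 1):
    for combo in combinations(classes, k):
        H = set().union(*combo)
        if is_subgroup(H):
            # sanity check: H really is invariant under conjugation
            assert all(compose(compose(g, h), inverse(g)) in H
                       for g in G for h in H)
            normal_subgroups.append(H)

normal_orders = sorted(len(H) for H in normal_subgroups)   # [1, 3, 6]
```

Out of six subgroups of S3, only {e}, A3, and S3 itself survive: the three transpositions form a single conjugacy class, so no order-2 subgroup can be a union of classes.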
In this section, we make a group abelian.
Is there a way to quotient a group G/H such that the resulting group is abelian?
To do this, we first look at the equation gh=hg. If we move everything to one side, we get g−1h−1gh=e. This tells us that if g−1h−1gh is the identity element e, then g and h commute. The element g−1h−1gh is called the commutator of g and h, and is written [g,h].
(Note that ghg−1h−1 is also a commutator of g and h. We want to stick to one definition, so let’s use the first one, g−1h−1gh.)
What if the only commutator in the group is the identity e? That means no matter what g and h you pick, their commutator [g,h] is e and therefore they commute. So if the only commutator in the group is e, the group is abelian.
The idea here is that if we can send all the commutators to e, the resulting group will be abelian. But to do this, we need to make sure that the commutators form a normal subgroup H⊲G, so that the quotient G/H sends H to e. Do the commutators form a normal subgroup?
The commutators almost form a subgroup. The identity element is a commutator. The inverse of a commutator [g,h]−1 is [h,g], a commutator. But the group product of two commutators need not be a commutator.
We can solve this problem by having the commutators generate a subgroup, i.e. include all their products as part of the subgroup. Let G′ be the subgroup generated by all the commutators. We call G′ the commutator subgroup, and much later we will see why it is also called the derived subgroup. For now, let’s prove that it is normal:
Theorem: The commutator subgroup G′ is a normal subgroup of G. WTS: the conjugate of an arbitrary commutator [g,h] is also a commutator. (This suffices: a general element of G′ is a product of commutators, and conjugation distributes over products, k(c1c2)k−1=(kc1k−1)(kc2k−1).) We can prove this directly:
k[g,h]k−1 = k(g−1h−1gh)k−1 ([g,h] is the commutator g−1h−1gh)
= (kg−1k−1)(kh−1k−1)(kgk−1)(khk−1) (insert k−1k between factors)
= (kgk−1)−1(khk−1)−1(kgk−1)(khk−1)
= [kgk−1,khk−1]
The conjugate of an arbitrary commutator is again a commutator, so G′ is invariant under conjugation, and therefore a normal subgroup.
Now if we take the quotient G/G′, we send all the commutators to e and are left with a group whose only commutator is e, which is an abelian group. This quotient Gab≡G/G′ is called the abelianization of G.
This easily extends to quotienting by any normal subgroup containing G′ as well.
Theorem: A quotient G/N is abelian iff N includes G′.
- (→) If G/N is abelian, its only commutator is e, which means every commutator of G was sent to e after quotienting by N, which means N includes the commutators of G.
- (←) If N includes all the commutators of G, it means G/N sends all the commutators to e, which means G/N is abelian.
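Here is a small Python sketch (my own, assuming permutations-as-tuples and the convention [g,h]=g−1h−1gh from earlier) computing the commutator subgroup of S3 and confirming that the abelianization S3/G′ is abelian:

```python
from itertools import permutations

def compose(*ps):
    """Chained composition: compose(a, b)(i) = a(b(i))."""
    out = ps[0]
    for p in ps[1:]:
        out = tuple(out[j] for j in p)
    return out

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = set(permutations(range(3)))   # S3

# all commutators [g,h] = g⁻¹ h⁻¹ g h
commutators = {compose(inverse(g), inverse(h), g, h) for g in G for h in G}

# G' is the subgroup *generated* by the commutators: close under product
Gprime = set(commutators)
changed = True
while changed:
    changed = False
    for a in list(Gprime):
        for b in list(Gprime):
            if compose(a, b) not in Gprime:
                Gprime.add(compose(a, b))
                changed = True

# for S3 the commutator subgroup is A3 (order 3), and the abelianization
# S3/G' has order 2 -- its cosets commute even though S3 itself does not
cosets = {frozenset(compose(g, h) for h in Gprime) for g in G}
quotient_is_abelian = all(
    frozenset(compose(a, b, h) for h in Gprime) ==
    frozenset(compose(b, a, h) for h in Gprime)
    for a in G for b in G)
assert quotient_is_abelian and len(cosets) == 2
```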
In this section, we review abelian and centerless groups.
Abelian groups and centerless groups are both subclasses of groups. Moreover, you can study the abelian part and the centerless part separately; simply quotient by the commutator subgroup G′ to get the abelian part, and quotient by the center Z(G) to get the centerless part. Doing both gives you the trivial group {e}, since it is the only group that is both abelian and centerless.
In this section, we learn some more ways to find normal subgroups.
So far, we’ve encountered a few normal subgroups that always exist for a group G:
- The trivial subgroup {e} is always a normal subgroup.
- The center Z(G) is a normal subgroup.
- The group G is itself a normal subgroup of G.
- The commutator subgroup G′ is a normal subgroup.
We also found that a subgroup is normal iff it is a union of conjugacy classes, but that requires finding the conjugacy classes of a group. Are there any other shortcuts?
Here are a few other shortcuts:
Theorem: Every subgroup of an abelian group is normal. Since gh=hg implies g=hgh−1 for every g,h∈G, every element of an abelian group is invariant under conjugation, and therefore so is every subgroup. So every subgroup of an abelian group is normal.
Theorem: For finite groups, a subgroup with unique order is normal.
- Lemma: In a finite group, the conjugate of a subgroup gHg−1 is a subgroup with the same order.
- We prove closure under group product, an identity element, and inverses exist for gHg−1.
- Closure: The product of gh1g−1 and gh2g−1 is g(h1h2)g−1, so gHg−1 is closed under group product.
- Identity: geg−1=e shows gHg−1 contains e.
- Inverse: (ghg−1)−1=gh−1g−1 shows gHg−1 has inverses.
- Therefore gHg−1 is indeed a subgroup. To show that it has the same order, notice that product and inverse above are the same as in the original group, except with the g,g−1 around them. This means all elements in the subgroup were merely renamed, which is a bijection H↔gHg−1.
- Since the group is finite and a bijection exists between the subgroup and its conjugate, they must have the same order.
- Since H has unique order, we necessarily have gHg−1=H, which by definition means H is invariant under conjugation, i.e. normal.
Theorem: The product HK of two subgroups H,K≤G is a subgroup if at least one of H,K is normal. Let h,h′∈H and k,k′∈K, and suppose H is normal (the argument when K is normal is symmetric). Then we can show that HK is a subgroup:
- Product is preserved: (hk)(h′k′)=h(kh′k−1)kk′=(hh′′)(kk′)∈HK, using the fact that kh′k−1=h′′ for some h′′∈H since H is normal.
- Inverse is preserved: (hk)−1=k−1h−1=(k−1h−1k)k−1=h′k−1∈HK, where h′=k−1h−1k∈H since H is normal.
- Contains identity: Being subgroups, both H,K contain e∈G, thus HK also contains ee=e.
Theorem: The product HK of two normal subgroups H,K⊲G is normal.Since H,K are normal, HK is a subgroup. To show HK is normal, consider g∈G. The conjugate of hk by g is ghkg−1, which is equal to (ghg−1)(gkg−1). Since H and K are themselves normal, this is equal to some element h′k′∈HK, so HK is invariant under conjugation.
Theorem: The intersection H∩K of two normal subgroups H,K⊲G is a normal subgroup of both. We know that the intersection of two subgroups is a subgroup of both. To show that it is normal, note that if all conjugates of any element of H lie in H, and all conjugates of any element of K lie in K, then all conjugates of any element of H∩K lie in both H and K, i.e. in H∩K.
In this section, we measure the normality of a subgroup.
Recall that in the very beginning, we noted that the set of all elements that commute with every element is somewhere between “just the identity element” and “every element.”
Similarly, the set of all elements that commute with some g (the centralizer of g) is somewhere between ⟨g⟩ and “every element.”
We can apply a similar approach to subgroups regarding normality. Recall that a subgroup H≤G is normal iff every element g∈G commutes with it: gH=Hg. That is to say, iff the set of elements g∈G satisfying gH=Hg consists of the entire group. Call this set the normalizer of H, denoted NG(H).
Theorem: Every subgroup H is a normal subgroup of its normalizer NG(H). The normalizer NG(H) is the set of elements g∈G satisfying gH=Hg. But that is exactly the condition gHg−1=H, i.e. H is invariant under conjugation by elements in the normalizer, i.e. H is normal in NG(H).
Then a subgroup H≤G is normal in G when its normalizer NG(H) is “every element,” i.e. equal to G. That’s the maximum; what’s the minimum? The set of elements g such that gH=Hg must always include the elements h∈H, since hH=H=Hh. So at minimum, NG(H)=H meaning H is self-normalizing, and at maximum, NG(H)=G i.e. H⊲G.
Note these parallels between the centralizer and the normalizer of a subgroup:
- An abelian subgroup H≤G is central in its centralizer: H⊆Z(CG(H)). (H must be abelian for H⊆CG(H) to hold in the first place.) Adding more elements to the centralizer would make H not central, so the centralizer is the ‘maximum’ superset of H that makes H central in it. If the centralizer of H is G, then H is central in G.
- A subgroup H≤G is normal in its normalizer: H⊲NG(H). Adding more elements to the normalizer would make H not normal, so the normalizer is the ‘maximum’ superset of H that makes H normal in it. If the normalizer of H is G, then H is normal in G.
These statements hold true for arbitrary subsets S⊆G as well (but replace “normal” with “invariant under conjugation.”) Note that again, the centralizer and normalizer are always subgroups of G, even when S is not a subgroup of G.
Like the centralizer, the normalizer is always a subgroup:
Theorem: The normalizer NG(S) is a subgroup of G.
- Identity: Clearly e∈NG(S) because eS=S=Se.
- Inverses: If g∈NG(S) we have gS=Sg, which is the same as saying Sg−1=g−1S, therefore g−1∈NG(S).
- Product: If g,h∈NG(S), we have gS=Sg and hS=Sh. Another way to write that is gSg−1=S and hSh−1=S. Substituting, we get ghSh−1g−1=S, which is equivalent to saying (gh)S=S(gh), thus gh∈NG(S).
- Since it has identity, inverses, and product, NG(S) is always a subgroup of G.
Recall that to arrive at the class equation: ∣G∣=∣Z(G)∣+i∑∣G∣/∣CG(gi)∣ we had to use the fact that every conjugacy class is in the form bCG(g) for some b∈G. This was because the equality aga−1=bgb−1 can be rewritten as (b−1a)g=g(b−1a) meaning two conjugates of g are equal every time the element b−1a commutes with g, and the set of these elements b−1a are the centralizer CG(g).
How many conjugates of a subgroup H are there? In other words, what is the size of the conjugacy class of H? In this case, we are looking at the equality aHa−1=bHb−1 which can be rewritten as (b−1a)H=H(b−1a) meaning two conjugates of H are equal every time the element b−1a commutes with H. The set of these elements b−1a is exactly the normalizer NG(H). So aHa−1=bHb−1 exactly when a,b lie in the same coset of NG(H), meaning the distinct conjugates of H are in one-to-one correspondence with the cosets aNG(H). Thus by the same logic as before (all cosets are the same size), the number of conjugates of H is equal to ∣G∣/∣NG(H)∣. TODO make these both theorems
TODO this directly shows that if the normalizer is the whole group, then H is invariant under conjugation, since there’s only one conjugate
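The coset-counting argument can be verified directly. In this sketch (my own, not from the notes), the subgroup H=⟨(0 1)⟩ inside S3 turns out to be self-normalizing, so it has ∣G∣/∣NG(H)∣=3 distinct conjugates:

```python
from itertools import permutations

def compose(p, q):
    """(p ∘ q)(i) = p(q(i)); permutations stored as tuples."""
    return tuple(p[j] for j in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = set(permutations(range(3)))          # S3
H = {(0, 1, 2), (1, 0, 2)}               # subgroup generated by the swap (0 1)

def conjugate(g, H):
    """The conjugate subgroup g H g⁻¹."""
    return frozenset(compose(compose(g, h), inverse(g)) for h in H)

normalizer = {g for g in G if conjugate(g, H) == frozenset(H)}
conjugates = {conjugate(g, H) for g in G}

# the number of distinct conjugates of H is |G| / |N_G(H)|
assert len(conjugates) == len(G) // len(normalizer)
# here N_G(H) = H (H is self-normalizing), giving 6/2 = 3 conjugates;
# a normal subgroup would instead have exactly one conjugate, itself
assert normalizer == H and len(conjugates) == 3
```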
By studying the normalizer, we get an alternate way of identifying normal subgroups. Here’s how.
Given a subgroup H, consider the partition of G into cosets that look like gH. Also consider the partition of G into cosets that look like Hg. To distinguish the two, the cosets gH are called left cosets and the cosets Hg are called right cosets. Note that an element g is in the normalizer of H iff gH=Hg, i.e. the left coset gH coincides with the right coset Hg. If this is true of g, it must be true of every element in its coset gH. Because of this, the normalizer of H is a union of cosets. More precisely, NG(H) is exactly the union of cosets common to both partitions of G.
One consequence of the normalizer having elements g∈NG(H) that satisfy gH=Hg is that they are exactly the elements that make H invariant under conjugation: gHg−1=H. In other words, the subgroup H is normal in the normalizer: H⊲NG(H). This means that we can take the quotient NG(H)/H.
An interesting fact arises when H is a p-subgroup. Recall that p-groups are groups of prime power order pn. Similarly, a p-subgroup is a subgroup of prime power order pn.
Theorem: For every p-subgroup H≤G, ∣NG(H)/H∣≡∣G/H∣(modp).Recall that in every quotient, like NG(H)/H, TODO left off here
the larger group NG(H) is partitioned into equally sized partitions with
Summary
We’ve learned that:
- The center of a group Z(G) is basically a measure of how “commutative” a group is. If the center consists of the whole group, the group is abelian (the most commutative). If the center is trivial, the group is the least commutative it can be.
- Conjugacy classes are essentially the minimal subsets that commute with every element of the group. Subgroups formed by a union of conjugacy classes commute with the whole group and are called normal subgroups.
- By quotienting a normal subgroup (sending all its elements to e) we can either make a group centerless (by quotienting by Z(G), the center of the group) or we can make the group abelian (by quotienting by G′, the commutator subgroup).
-
November 9, 2023.
Exploration 2: Products
Let’s study subgroups. Let’s say H and K are subgroups of G. How much of G is captured by these two subgroups? In other words, how much of G can you reconstruct from H and K alone?
One obvious answer is to combine the two subgroups. We can take the group product HK, often just called the product, as all elements hk for all h∈H,k∈K: HK={hk∣h∈H,k∈K} Note that HK is not necessarily a group. Of course this only works because H and K are subgroups of the same group, which gives us the definition of a product between elements of H and K. If H and K were arbitrary groups, we would have to come up with a new definition of product between them. More on that later.
HK contains H, and it also contains K. To see this, remember that the identity e∈G exists in both H and K, so HK includes the elements he=h for all h∈H and ek=k for all k∈K. Thinking about group order, this means HK is at least as large as H and K. And HK is at most as large as G, since elements of HK come from G and therefore are contained in G. So: ∣H∣,∣K∣≤∣HK∣≤∣G∣
Verify that when K⊆H, HK=H, and when H⊆K, HK=K. The question is, when is HK=G? In other words, when do H and K capture all the information about G? For this to be true, every element of G must be expressible in the form hk for some h∈H,k∈K. Since there are ∣H∣ choices for h and ∣K∣ choices for k, there are ∣H∣∣K∣ choices for hk, which must be at least ∣G∣ in order to represent all of G. So, firstly, in order for HK=G to be true, we must have at the very least: ∣H∣,∣K∣≤∣HK∣≤∣G∣≤∣H∣∣K∣
Secondly, we would like to know how many of these hk represent unique elements of HK. To do this, we need to explore when two hk are equal, and we can take their equivalence classes as the unique elements of HK.
Theorem: For finite subgroups H and K, ∣HK∣=∣H∣∣K∣/∣H∩K∣.
- Consider all pairs (h,k). There are ∣H∣∣K∣ of them, each corresponding to a product h⋅k∈HK. But not all products are necessarily distinct: h⋅k=hx⋅x−1k holds for every element x∈H∩K.
- This means the equivalence class of pairs (h,k) whose product is equal to h⋅k consists of ∣H∩K∣ pairs. Dividing the number of pairs ∣H∣∣K∣ by the size of each equivalence class ∣H∩K∣ gives the number of distinct products ∣HK∣.
Corollary: If H and K have trivial intersection, ∣HK∣=∣H∣∣K∣.
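The formula ∣HK∣=∣H∣∣K∣/∣H∩K∣ can be checked over every pair of subgroups of a small group. Below is a sketch of that check on S3 (helper names are my own):

```python
from itertools import combinations, permutations

def compose(p, q):
    """(p ∘ q)(i) = p(q(i)); permutations stored as tuples."""
    return tuple(p[j] for j in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = set(permutations(range(3)))   # S3
e = (0, 1, 2)

def is_subgroup(H):
    return e in H and all(compose(a, inverse(b)) in H for a in H for b in H)

subgroups = [set(S) for k in range(1, len(G) + 1)
             for S in combinations(sorted(G), k) if is_subgroup(set(S))]

# |HK| = |H||K| / |H ∩ K| for every pair of subgroups
for A in subgroups:
    for B in subgroups:
        AB = {compose(a, b) for a in A for b in B}
        assert len(AB) == len(A) * len(B) // len(A & B)

# example: H = <(0 1)> and K = <(0 1 2)> intersect trivially,
# so |HK| = 2·3/1 = 6 and HK is all of S3
H = {(0, 1, 2), (1, 0, 2)}
K = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}
HK = {compose(h, k) for h in H for k in K}
assert HK == G
```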
So in order to have HK=G, we can consider ∣HK∣=∣H∣∣K∣=∣G∣ when H,K have trivial intersection.
Theorem: HK=G when G has subgroups H,K where H∩K={e} and ∣H∣∣K∣≥∣G∣. By the previous theorem ∣HK∣=∣H∣∣K∣/∣H∩K∣, when H∩K={e} we get ∣HK∣=∣H∣∣K∣. The assumption ∣H∣∣K∣≥∣G∣ then gives ∣HK∣≥∣G∣. But we always have HK⊆G and therefore ∣HK∣≤∣G∣, thus ∣HK∣=∣G∣. Since all the ∣HK∣=∣G∣ distinct elements in HK can only come from G, HK=G.
Conversely, if HK=G and H∩K={e}, then by the previous theorem ∣H∣∣K∣=∣HK∣=∣G∣, which gives ∣H∣∣K∣≥∣G∣ as well. (Note that HK=G alone does not force the intersection to be trivial — take H=K=G, for instance — so trivial intersection must be assumed on both sides.)
Proving that ∣H∣∣K∣≥∣G∣ requires knowing ∣H∣∣K∣ and ∣G∣, which isn’t often the case, so let’s go in a different direction.
Whenever H∩K is trivial, we can make an argument about commutativity between elements h∈H and k∈K. In particular, if we can prove that their commutator hkh−1k−1 is in both H and K, then we know hkh−1k−1=e and therefore all h∈H commute with all k∈K. When is this true?
Note that hkh−1k−1∈H implies kh−1k−1∈h−1H=H, which exactly means that H is invariant under conjugation by elements of K. By a similar argument, hkh−1k−1∈K implies hkh−1∈Kk=K which exactly states that K is invariant under conjugation by elements of H. We can make both statements true if we assert that H and K are invariant under conjugation by elements in G — in other words, H and K are normal subgroups of G.
Then the logic goes like this. If H and K are normal subgroups of G, then hkh−1k−1 is in both H and K, implying that hkh−1k−1∈H∩K. If H∩K is trivial, then hkh−1k−1=e meaning that every element of H commutes with every element of K. Then:
Theorem: When G has normal subgroups H,K with trivial intersection H∩K={e}, then H,K commute. We can prove two facts about the commutator of arbitrary h∈H and k∈K.
- Since H is normal, kh−1k−1∈H, so h(kh−1k−1)∈H.
- Since K is normal, hkh−1∈K, so (hkh−1)k−1∈K.
Therefore the commutator is in both H and K. Since their intersection is trivial, the commutator must be the identity every time, implying that all elements of H commute with all elements of K.
In this section, we try a different approach to constructing a group from given subgroups.
If H and K are both normal with trivial intersection, their group product HK is sometimes called the internal direct product.
In particular, if H∩K={e}, then every single hk is a unique element of HK, and together with ∣H∣∣K∣≥∣G∣ we get ∣HK∣=∣H∣∣K∣=∣G∣. That means every pair (h,k) for h∈H and k∈K can be identified with a unique element of G. In fact, every element of the group G can be written as a pair (h,k). In general, if two groups are renamings of each other, then they are isomorphic, and the renaming is an isomorphism. Here, we say that G and S={(h,k)∣h∈H,k∈K} are isomorphic (denoted G≅S), and that the bijective renaming g↦(h,k) is an isomorphism.
This method of taking all pairs (g,h) for g∈G and h∈H works for arbitrary groups G,H, and is known as the direct product of groups, denoted G×H. It is always a group because product and inverse can be defined componentwise, for example (g1,h1)⋅(g2,h2)=(g1g2,h1h2). Sometimes the direct product is also called the direct sum, denoted G⊕H, but that is specifically for abelian groups whose group operation is denoted as addition (+).
Theorem: H×K≅G exactly when G has subgroups H,K where H∩K={e}, ∣H∣∣K∣=∣G∣, and hk=kh for all h∈H,k∈K. Note that these three conditions are exactly the behavior of H×{e},{e}×K in H×K:
- (h,e)=(e,k) implies h=e,k=e, thus their intersection is trivial
- ⟨(h,e),(e,k)⟩=G implies HK=G and therefore ∣H∣∣K∣=∣G∣
- (h,e)(e,k)=(e,k)(h,e) is just hk=kh.
That proves the forward direction. To show the reverse direction, we can construct this isomorphism directly with the given conditions.
A common misconception is that the direct product is kind of like the inverse of a group quotient. However, H×G/H≇G in general.
Counterexample: G/H×H≇G in general.
- Take the example G=C4=⟨g⟩.
- Let H=⟨g2⟩≅C2. H⊲G since G=C4 is abelian, and every subgroup of an abelian group is normal.
- Then G/H sends g2 to e, and the result is isomorphic to C2.
- But C2×C2≇C4, since C4 has an order 4 element while C2×C2 doesn’t.
Here are some other interesting theorems about direct product.
Theorem: G×H≅H×G.
There’s a simple bijection between each (g,h)∈G×H and (h,g)∈H×G (swap the components); it respects the componentwise product, therefore the two groups are isomorphic.
Lemma: The order of a direct product ∣G1×G2×…×Gn∣ is equal to ∏i=1n∣Gi∣.
Proving this for ∣G1×G2∣ is enough to prove the theorem by induction. Since every element of G1×G2 chooses an element g1∈G1 and an element g2∈G2, there are ∣G1∣ possible choices for g1 and ∣G2∣ possible choices for g2, so there are ∣G1∣∣G2∣ elements in G1×G2, as required.
Chinese Remainder Theorem: Cm×Cn≅Cmn when gcd(m,n)=1 (i.e. m,n are coprime).
- Given Cm=⟨a⟩ and Cn=⟨b⟩, we want to show Cm×Cn=⟨(a,b)⟩, which is then cyclic of order mn.
- Cm×Cn is generated by (a,b) if every one of its elements (ai,bj)∈Cm×Cn can be written as (a,b)k for some k.
- This amounts to solving the system k≡i (mod m), k≡j (mod n) for k (mod mn).
- Chinese Remainder Theorem: In the above equations, k has only one solution mod mn.
- This is a number-theoretic proof.
- Bezout’s identity: if gcd(m,n)=1, then there exist integers c,d such that cm+dn=1.
- Then k=idn+jcm is a solution:
k = idn + jcm = i(1−cm) + jcm = i − icm + jcm = i + (j−i)cm ≡ i (mod m)
k = idn + jcm = idn + j(1−dn) = idn + j − jdn = (i−j)dn + j ≡ j (mod n)
- To prove the solution unique mod mn, consider another solution k′ where k′≡i mod m and k′≡j mod n.
- But then k′−k≡0 mod m and k′−k≡0 mod n, therefore k′−k must be a multiple of both m and n.
- Since gcd(m,n)=1, k′−k must be a multiple of mn, therefore k′−k≡0 mod mn and k′≡k mod mn.
- Then there is exactly one solution for k for every pair (i,j). So we can write a bijection between each (ai,bj)∈Cm×Cn and the corresponding (a,b)k∈Cmn, which means the two groups are isomorphic.
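As a quick sanity check (a minimal Python sketch, not part of the proof), we can verify that (a,b)=(1,1) generates Zm×Zn, written additively, exactly when m and n are coprime:

```python
from math import gcd

def generates(m: int, n: int) -> bool:
    """Does (1, 1) generate Z_m x Z_n (written additively)?
    Equivalently: is C_m x C_n cyclic?"""
    seen = set()
    x, y = 0, 0
    for _ in range(m * n):
        seen.add((x, y))          # record (k mod m, k mod n)
        x, y = (x + 1) % m, (y + 1) % n
    return len(seen) == m * n

# (1, 1) generates the whole product exactly when m, n are coprime.
for m in range(1, 9):
    for n in range(1, 9):
        assert generates(m, n) == (gcd(m, n) == 1)
```

The loop visits exactly lcm(m,n) distinct pairs, which equals mn precisely when gcd(m,n)=1, matching the theorem.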
The direct product discussed above is known as the external direct product G×H. If both groups are actually subgroups H,K of some encompassing group G, then we can define the internal direct product HK, whose elements are all products hk for h∈H,k∈K. Unlike the external direct product, the internal direct product is restricted to working with elements of the encompassing group. Let’s explore the case when G=HK:
Lemma: When G=HK for normal subgroups H,K⊲G, then H∩K={e} iff every g∈G factors uniquely into the product of some h∈H,k∈K.
- (→)
- G=HK implies that every g∈G can be written as hk for some h∈H,k∈K.
- To prove that this factoring is unique given H∩K={e}, let g=h1k1=h2k2. Then:
g = h1k1 = h2k2
h2−1h1 = k2k1−1
h2−1h1 = k2k1−1 ∈ H∩K (since h2−1h1∈H and k2k1−1∈K)
h2−1h1 = k2k1−1 = e
Then h1=h2 and k1=k2, so any two such factorings are the same.
- (←)
- Given that g=hk is a unique factoring for all g∈G, let x be an arbitrary element in H∩K.
- Then x has two factorings x=x⋅e=e⋅x, which must be the same due to the unique factoring in G. Then x=e, implying the only element in the intersection H∩K is e.
Internal Direct Product Theorem: If a group G has normal subgroups H,K⊲G with trivial intersection and G=HK, then G=HK≅H×K.
By the previous lemma, H∩K={e} and G=HK imply that any element g∈G can be factored uniquely into hk where h∈H,k∈K. This unique factorization maps each g=hk∈G to a distinct pair (h,k), giving a bijection G↔H×K. The bijection is moreover an isomorphism, because elements of H and K commute: the commutator hkh−1k−1 lies in H (by normality of H) and in K (by normality of K), hence equals e. So (h1k1)(h2k2)=(h1h2)(k1k2), which matches the componentwise product in H×K.
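To make the unique-factorization bijection concrete, here is a small Python check (an illustration; the choice of Z6 as ambient group is ours), using H=⟨3⟩≅C2 and K=⟨2⟩≅C3 inside Z6:

```python
from itertools import product

# Z_6 written additively, with normal subgroups H = <3> and K = <2>.
G = set(range(6))
H = {0, 3}          # isomorphic to C2
K = {0, 2, 4}       # isomorphic to C3

assert H & K == {0}                   # trivial intersection
assert len(H) * len(K) == len(G)      # |H||K| = |G|

# Every g in G factors uniquely as h + k, giving the bijection g <-> (h, k).
for g in G:
    factorings = [(h, k) for h, k in product(H, K) if (h + k) % 6 == g]
    assert len(factorings) == 1
```

This exhibits Z6 as the internal direct product of its C2 and C3 subgroups, in line with the Chinese Remainder Theorem above.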
In this section, we study the semidirect product.
For arbitrary subgroups H,K of G, recall that HK is not necessarily a subgroup.
Theorem: The product of two subgroups of G is not in general a subgroup of G.
The typical counterexample is ⟨h⟩⟨k⟩ in the free group on h,k, where there are no relations between h and k (each string of copies of h,k represents a unique element). It cannot be closed under inverses since (hk)−1=k−1h−1, and k−1h−1 is distinct from every hakb∈⟨h⟩⟨k⟩.
Theorem: The product HK of two subgroups H,K≤G is a subgroup of G if either H or K is normal.
- Both subgroups contain e, so their product HK contains e.
- HK is closed under products. To see this, take an arbitrary element (h1k1)(h2k2)∈HK. Since either H or K is normal, k1h2 is equal to some h2k1′ (if K is normal) or some h2′k1 (if H is normal). Either way, the product rearranges into (an element of H)(an element of K), which is in HK.
- HK is closed under inverses. To see this, take an arbitrary element hk∈HK, whose inverse is (hk)−1=k−1h−1. Since either H or K is normal, k−1h−1 is equal to some h−1k′ (if K is normal) or some h′k−1 (if H is normal). Either way, we obtain an element of HK.
- Since the product HK contains the identity and is closed under products and inverses, it is a subgroup of G.
Essentially, the reason hkh′k′ is in HK if say H is normal is because the normality of H gives us, for each k∈K, a bijection φk:H→H defined by h′′↦kh′k−1. This is the same as kh′↦h′′k, which lets us turn every hkh′k′ into hh′′kk′∈HK, making HK closed under product as required.
Note that the key part that makes HK a subgroup is not the normality of H, but the bijection φk:H→H it defines for every k∈K. This bijection lets us turn every instance of kh′ into φk(h′)k for some element φk(h′)∈H. Above, the bijection was simply conjugation by k, but HK is a subgroup as long as we can define h′↦φk(h′). To see this clearly, suppose you have hkh′k′ again, and we want to prove it is in HK. Since every instance of kh′ is equal to φk(h′)k for some φk(h′)∈H, we get hφk(h′)kk′ and we are done, since hφk(h′)∈H and kk′∈K.
Theorem: The product HK of two subgroups H,K≤G is a subgroup of G if each k∈K defines a bijection φk:H→H given by φk(h)=khk−1∈H.
The proof is similar to the previous one.
- Both subgroups contain e, so their product HK contains e.
- HK is closed under products: take an arbitrary element (h1k1)(h2k2)∈HK. Since k1h2=φk1(h2)k1, this element equals (h1φk1(h2))(k1k2), which is in HK.
- HK is closed under inverses: take an arbitrary element hk∈HK, whose inverse is (hk)−1=k−1h−1. Since k−1h−1=φk−1(h−1)k−1, this inverse is in HK.
- Since the product HK contains the identity and is closed under products and inverses, it is a subgroup of G.
Just like with the direct product, we can generalize this to H,K that are not subgroups of the same group. Define the semidirect product H⋊φK as having the same underlying set as the direct product H×K, except the product (h,k)⋅(h′,k′) is not (hh′,kk′) but instead (hφk(h′),kk′), where φ assigns to each k∈K an automorphism φk of H. The idea for the notation ⋊ is that it contains a small ⊲ indicating that H acts ‘normally’, in the sense that φk(h′) exists for each k∈K. Indeed:
Theorem: H (identified with H×{e}) is a normal subgroup of the semidirect product H⋊φK.
Conjugating an arbitrary (h′,e) by any (h,k) gives (h,k)(h′,e)(h,k)−1=(hφk(h′)h−1,e), whose second component is e. So H×{e} is closed under conjugation and therefore normal.
This gives a recognition theorem for semidirect products: if G=HK for subgroups H,K≤G where H⊲G and H∩K is trivial, then G is isomorphic to H⋊φK, where φk is conjugation by k.
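As an illustrative sketch (Python; the choice of φk as inversion on Cn is ours), the semidirect product Cn⋊φC2 reproduces the dihedral relations:

```python
def dihedral_mul(n):
    """Multiplication in C_n x| C_2, where phi_0 = id and phi_1 is
    inversion h -> -h (additive notation). Elements are pairs (h, k)."""
    def mul(a, b):
        (h1, k1), (h2, k2) = a, b
        h2 = h2 if k1 == 0 else -h2      # slide k1 past h2 via phi_{k1}
        return ((h1 + h2) % n, (k1 + k2) % 2)
    return mul

mul = dihedral_mul(5)
r, s = (1, 0), (0, 1)                    # rotation-like and flip-like elements
assert mul(s, s) == (0, 0)               # s^2 = e
assert mul(mul(s, r), s) == (4, 0)       # s r s = r^{-1}
assert mul(r, s) != mul(s, r)            # nonabelian, unlike C_5 x C_2
```

With the trivial φ (all φk = id) the same construction would collapse to the direct product C5×C2≅C10; the nontrivial φ is what produces D5.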
TODO wreath product?
Zappa–Szép product?
-
November 9, 2023.
Exploration 3: Products and quotients
Questions:
- How should we view quotient groups and simple groups?
- Can we derive properties of G/H knowing things about G and H, and vice versa? For example, what is Z(G/H)?
- How do we view a group in terms of its normal subgroups?
Recall that we can quotient any group by one of its normal subgroups. This means we can factor any group G into two groups: a normal subgroup N and the quotient G/N.
Groups whose only normal subgroups are the trivial subgroup {e} and the group itself are called simple groups. That means they can only be factored into the trivial group and themselves, much like how prime numbers can only be factored into 1 and themselves. Because of this, simple groups are like “prime factors” of groups.
Theorem: Every group of prime order is simple.
According to Lagrange’s Theorem, the order of a subgroup divides the order of its group. But if the order of the group is a prime p, then the only divisors of p are 1 and p, implying that the only subgroups of G are {e} and G. Therefore G is simple.
When a group is abelian, it has the nice property that every subgroup is necessarily normal.
Theorem: Every subgroup of an abelian group is normal.
For abelian groups, every element is in the center, so every element is in its own conjugacy class, which means every subgroup is composed of conjugacy classes, and is therefore normal.
This means that the subgroup structure of an abelian group also serves as its factorization. In fact, it is a theorem that every finite abelian group can be decomposed into a direct product of cyclic groups of prime power order. To prove this, we first prove a fact about p-groups, groups of prime power order pn.
Lemma: Given an abelian p-group G of order pn, for every element g∈G of maximal order there is some subgroup K≤G such that G≅⟨g⟩×K.
Recall the Internal Direct Product Theorem proved earlier, which requires us to find normal subgroups H,K⊲G with trivial intersection where G=HK and ∣G∣=∣H∣∣K∣. Then G≅H×K.
Since G is abelian, all subgroups are normal, so we just need to find subgroups that satisfy this property.
Given that we’re looking for G≅⟨g⟩×K, we must have H=⟨g⟩, where we choose g to be an element of maximal order in G, so that no other element generates g. This constrains our possible K in a few ways:
- Since we need H∩K={e}, no non-identity element in K can have g as a factor, otherwise it would be in H=⟨g⟩.
- Since we need G=HK, K needs to be a complement to H in the sense that every g∈G is equal to some product hk for h∈H,k∈K.
- Since we need ∣G∣=∣H∣∣K∣, that product must be unique.
Because G is a p-group, we may take G=⟨g⟩K where K≅G/⟨g⟩. Intuitively, K is the set of representatives in G of every coset in G/⟨g⟩. (TODO prove).
The intersection ⟨g⟩∩K is trivial because any element of ⟨g⟩ maps to the identity coset in G/⟨g⟩, and any member of K is a representative for some coset in G/⟨g⟩. Therefore, any element in both ⟨g⟩ and K must be a representative of the identity coset in G/⟨g⟩, which can only be e. Thus the intersection is trivial.
Now we prove that G=HK. Since
TODO left off here
Fundamental Theorem of Finite Abelian Groups: Every finite abelian group G can be decomposed into a direct product of cyclic groups of prime power order.
Use the lemma to decompose G as a direct product of p-groups. Then to each p-group apply the following:
From the previous proof, for any element g∈G with maximal order in G, we have G≅⟨g⟩×H, where ⟨g⟩ is cyclic by definition and H is isomorphic to a direct product of cyclic groups by induction.
TODO jordan holder theorem (factorization is unique up to reordering)
Now that we’ve fully decomposed finite abelian groups into cyclic groups, let’s deal with infinite abelian groups. The interesting ones are the finitely-generated abelian groups – those that are generated with some minimal finite set of generators.
Recall that abelian groups have the property that every subgroup is normal. (proof) In particular, the subgroup of every element of finite order T(G) is normal, so you can imagine factoring the group into T(G) and G/T(G). The part T(G) is known as the torsion subgroup or the torsion of G, and the part G/T(G) is the torsion-free quotient of the group.
We already know that the torsion of G, being a finite abelian group, can be factored into a direct product of finite cyclic groups. We’ll now show that a torsion-free abelian group can also be factored into a direct product of infinite cyclic groups.
First, we show that finitely generated torsion-free abelian groups are free, i.e. have a finite subset called a basis such that every element of the group is expressible as an integer linear combination of the basis.
Theorem: The torsion-free part of a finitely generated abelian group F=G/T(G) is free.
- First, view F as an additive group, where + is the group product and multiplication by an integer is the group exponent.
- One way to prove a group is free is to prove that there’s a basis. We can prove that a minimal set of generators X={x1,x2,…,xn} for the group is a basis if we can only get the identity element e via c1x1+c2x2+…+cnxn=0 by setting all the ci to 0. Since our group is finitely generated, a minimal set of generators X exists.
- Focus on c1x1+c2x2 and replace it with c1(x1+kx2)+(c2−kc1)x2.
- Note that the new basis element x1+kx2 is simply x1 shifted by some multiple of x2, and since X is a minimal basis, x1+kx2 should not be equal to any other basis element.
- However, looking at the coefficient of x2, this means we can subtract multiples of any coefficient from any other coefficient by shifting basis elements. If we keep doing this between c1 and c2, then by Euclid’s algorithm we can arrange c1=±gcd(c1,c2).
- If we do this for all nonzero coefficients we get c1=±gcd(c1,c2,…,cn).
- Since F is torsion-free, it has no (non-identity) elements of finite order, so there are only two cases. If every ci is 0, then gcd(c1,c2,…,cn)=0. Otherwise, we can assume gcd(c1,c2,…,cn)=1, since we note that c1x1+c2x2+…+cnxn=0 is still true if we divide all ci by their GCD.
- But (assuming c1 is one of the nonzero coefficients) if c1=±1 in the equation c1x1+c2x2+…+cnxn=0, then we can write c2x2+…+cnxn=−c1x1=±x1, meaning that x1 is actually a linear combination of other basis elements. This means in the case that not all ci are 0, X is actually not a minimal set.
- Therefore the only solution to c1x1+c2x2+…+cnxn=0 is the one where all ci are 0, thus X is a basis for the free abelian group F.
Theorem: The torsion-free part of a finitely generated abelian group F=G/T(G) factors into a direct sum of cyclic groups.
- Since F is a torsion-free abelian group, it is free, and has a basis X. Let n be the size of X.
- Since F is free, you can write every element as a linear combination c1x1+c2x2+…+cnxn with ci∈Z and xi∈X.
- This is equivalent to a tuple (c1,c2,…,cn)∈Zn. Since this represents all elements of Zn as well, you can define an isomorphism between the two.
- This means F is isomorphic to Zn, which is by definition a direct sum of the infinite cyclic group Z.
Structure Theorem for Finitely Generated Abelian Groups: All finitely generated abelian groups G factor into a direct product of cyclic groups which are either Z or of prime power order.
- This is a result of the previous theorems. If you split G into a torsion subgroup T(G) and a torsion-free part G/T(G), then the torsion subgroup factors into a direct product of cyclic groups of prime power order (proof) and the torsion-free part factors into a direct sum of copies of Z (proof).
- It remains to show that T(G)×G/T(G)≅G.
- Since G/T(G) is free, choose a basis for it and lift each basis element to a representative in G; let K≤G be the subgroup generated by these lifts. Any integer linear relation among the lifts maps to the same relation among the basis in G/T(G), which must be trivial, so K≅G/T(G) and every non-identity element of K has infinite order.
- Then K∩T(G)={e}: an element of the intersection has finite order but lies in the torsion-free K, so it must be e. Also G=T(G)K: for any g∈G, write its coset in G/T(G) as an integer combination of the basis; the matching combination k∈K of the lifts lies in the same coset, so t=gk−1∈T(G) and g=tk.
- This factoring g=tk is unique: if t1k1=t2k2 then t2−1t1=k2k1−1∈T(G)∩K={e}, so t1=t2 and k1=k2. The pairing g↔(t,k) is then a bijection, and since G is abelian it respects products, so G≅T(G)×K≅T(G)×G/T(G).
In this section, we learn how to decompose a handful of non-abelian groups.
TODO
klein, cyclic, quaternion, prime elems, 2p elements, prime^2 elements, dihedral group
In this section, we show one way to decompose non-abelian groups.
For non-abelian groups, we can obtain the abelianization of the group by quotienting by its commutator subgroup, as described in the last exploration. That splits the group into two factors: the abelian quotient and the non-abelian part of the group, the commutator subgroup G′, which is also known as the derived subgroup G(1).
We can keep taking the derived subgroup of G(1), obtaining G(2),G(3) etc, until we arrive at some limit G(n). If the limit is the trivial group {e}, we say that the original group G is solvable.
Solvable groups are nice because every derived subgroup is a normal subgroup of the original group, and so solvable groups always have the derived series {e}⊲…⊲G(2)⊲G(1)⊲G.
That means that one way to understand the composition of a group is to prove it is solvable. Let’s prove the solvability of certain classes of groups:
Theorem: Every abelian group is solvable.
Abelian groups G have a trivial derived subgroup, so we immediately have the derived series {e}⊲G.
Theorem: G1 is solvable iff it has a series {e}⊲…⊲G3⊲G2⊲G1 where every quotient (between two adjacent groups) is abelian.
- (→)
- If G1 is solvable, then by definition it has a derived series {e}⊲…⊲G(2)⊲G(1)⊲G.
- We know that quotienting by a derived subgroup gives an abelianization, therefore each quotient is abelian.
- (←)
- If every quotient Gn/Gn+1 in the series is abelian, then Gn+1 contains Gn’s derived subgroup Gn(1). (Proof)
- Then by induction, the derived subgroups of G=G1 descend along the series: G(1)≤G2, and if G(n)≤Gn+1 then G(n+1)=(G(n))(1)≤Gn+1(1)≤Gn+2, since taking derived subgroups preserves inclusions.
- The series ends at {e}, so G(n)={e} for some n, giving the derived series {e}⊲…⊲G(2)⊲G(1)⊲G. Therefore G is solvable.
Theorem: Every group with prime order is solvable.
- The order of an element must divide the order of the group (Lagrange’s Theorem).
- Since a group with prime order p is only divisible by 1 and p, the order of every non-identity element must be p.
- That means the group is cyclic, therefore abelian, therefore solvable.
TODO sylow theory
-
November 11, 2023.
Exploration 2: Order of elements
So, we can ask an arbitrary group G many questions. What are its subgroups? What relations exist on elements of G? Which elements commute with which?
But given only a single arbitrary element g∈G, we suddenly run out of interesting questions to ask. We already know the inverse is g−1. We already know we can take its product with itself to get elements like g2, g3, etc. But interesting facts like whether g or g2 is the identity element, and what subgroup g generates, all depend on knowing one thing: what is the order of g?
The order of an element g is the least positive integer n for which gn=e, and is denoted o(g). If no positive power of g equals e, we say that o(g) is infinite.
If we know the order of g, then we can deduce almost every interesting property of g. For instance, o(g)=1 means g is the identity. If o(g)=∣G∣, then g generates G (and thus G is cyclic). It turns out the order of g decides so much, that if you are given any information about g alone, you can probably translate it to a statement about the order of g (and vice versa).
Here’s an example:
Theorem: An element g of order n generates a subgroup ⟨g⟩ of order n.
Given that the order of g is n, we know that gn is the lowest positive power of g equal to the identity e. Then g generates the subgroup ⟨g⟩={e,g,g2,…,gn−1}, which has exactly n elements, thus ∣⟨g⟩∣=n=o(g).
In this section, we learn how to calculate the order of elements.
For a finite group, the main way to calculate the order of an element is by applying a corollary of Lagrange’s Theorem.
Corollary: For a finite group, the order of every element divides the order of the group.
Recall that an element g∈G generates a subgroup ⟨g⟩≤G. Since ⟨g⟩ is a subgroup of a finite group, apply Lagrange’s Theorem to show that ∣⟨g⟩∣ divides ∣G∣. But ∣⟨g⟩∣=o(g), so o(g) also divides ∣G∣.
This theorem restricts the possible orders of elements to divisors of the group’s order ∣G∣. For instance, if ∣G∣=30, then elements in G can only have orders {1,2,3,5,6,10,15,30}. In fact, the following theorem says that elements must exist for the prime orders, {2,3,5}:
Cauchy’s Theorem: Every finite group G whose order is divisible by a prime p contains an element of order p.
- Consider Gp, which is the direct product of G with itself p times. Its elements are p-tuples of elements from G. We will proceed by a counting argument, first by defining a set S and providing two ways to count ∣S∣.
- Let S be the subset of Gp where taking the product of all components in the p-tuple gives the identity eG. That is, S={(g1,g2,…,gp)∈Gp∣g1g2…gp=eG}
- To find ∣S∣, note that the first p−1 components of each p-tuple in Gp can be chosen arbitrarily from (finite) G, but the final component must be the inverse of the product of the first p−1 components. Therefore, ∣S∣=∣G∣p−1. Given that ∣G∣ is divisible by a prime p, we have ∣G∣=pm for some m. Then ∣S∣=(pm)p−1.
- Any permutation of a p-tuple in S is a p-tuple in S, since order doesn’t matter when we take product. Then consider two p-tuples equivalent when one is a cyclic permutation of the other. This divides S into equivalence classes of two sizes: elements of the form (g,g,…,g) belong in an equivalence class of size 1, while any other equivalence class must be of size p because p is prime.
- Towards contradiction, assume that {(eG,eG,…,eG)} is the only equivalence class of size 1. Then let k be the number of equivalence classes of size p, so that ∣S∣=1+pk.
- Therefore ∣S∣=(pm)p−1=1+pk. Taking mod p gives 0≡1 mod p, contradiction.
- Therefore there must be more equivalence classes of size 1, implying that S contains some nontrivial (g,g,…,g). Membership in S implies gp=eG, so g∈G is an element of order p.
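Cauchy’s Theorem is easy to spot-check by brute force; here is a small Python sketch over S3 (our choice of example group), with orders computed by repeated composition:

```python
from itertools import permutations

def order(p):
    """Order of a permutation p (a tuple mapping i -> p[i]) under composition."""
    identity = tuple(range(len(p)))
    q, k = p, 1
    while q != identity:
        q = tuple(p[i] for i in q)   # compose p with q
        k += 1
    return k

# |S_3| = 6 = 2 * 3, so Cauchy promises elements of order 2 and of order 3.
S3 = list(permutations(range(3)))
orders = {order(p) for p in S3}
assert 2 in orders and 3 in orders
```

The transpositions supply the order-2 elements and the 3-cycles the order-3 elements, exactly as the theorem predicts.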
If your group is a direct product, then you can deduce the order of its elements via the order of each component.
Theorem: The order of an element (g1,g2,…,gn) in a direct product G1×G2×…×Gn is equal to lcm(o(g1),o(g2),…,o(gn)).
Proving this for (g1,g2)∈G1×G2 is enough to prove the theorem by induction. Let n=o(g1) and m=o(g2), so that g1n=e1 and g2m=e2. Then (g1,g2)k=(g1k,g2k)=(e1,e2) implies k is a multiple of both n and m. Since the order of an element is the least such k, the order of (g1,g2) must be lcm(n,m).
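A quick numeric check of the lcm formula (a Python sketch in Z4×Z6, additive notation, our choice of example):

```python
from math import gcd, lcm

def additive_order(x, m):
    """Order of x in Z_m under addition: m / gcd(x, m)."""
    return m // gcd(x, m)

def order_in_product(g, mods):
    """Order of g in Z_{m1} x ... x Z_{mn}, by brute-force repeated addition."""
    cur, k = g, 1
    while any(c % m != 0 for c, m in zip(cur, mods)):
        cur = tuple(c + gi for c, gi in zip(cur, g))
        k += 1
    return k

# The brute-force order matches the lcm of the component orders.
mods = (4, 6)
for g1 in range(4):
    for g2 in range(6):
        expected = lcm(additive_order(g1, 4), additive_order(g2, 6))
        assert order_in_product((g1, g2), mods) == expected
```

For instance (1,1) has order lcm(4,6)=12, even though no single component has order 12.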
In this section, we prove facts about subgroup order.
Knowing the order of elements lets you identify subgroups. For instance, if g is order 4, then ⟨g⟩ is a cyclic subgroup of order 4. From there we can construct more interesting subgroups.
Recall that for subgroups H and K, we have ∣HK∣=∣H∣∣K∣/∣H∩K∣.
This tells us that having trivial intersection is desirable for subgroups. So when do subgroups have trivial intersection?
Lemma: Any two subgroups H,K with coprime orders have trivial intersection.
- The intersection of two subgroups is a subgroup of both: H∩K≤H and H∩K≤K.
- By Lagrange, ∣H∩K∣ is a divisor of both ∣H∣ and ∣K∣.
- Since ∣H∣ and ∣K∣ are coprime, their only shared divisor is 1, so ∣H∩K∣=1, and therefore H∩K={e}.
Theorem: Any two distinct subgroups H,K with prime order have trivial intersection.
There are two cases: either ∣H∣≠∣K∣ or ∣H∣=∣K∣=p for some prime p.
- If H,K have different prime orders, then they have coprime orders and by the previous theorem the intersection is trivial.
- Otherwise, ∣H∣=∣K∣=p for some prime p. By Lagrange, ∣H∩K∣ is a divisor of both ∣H∣ and ∣K∣. Since ∣H∣=∣K∣=p is prime, ∣H∩K∣ is either 1 or p. It can’t be p since H,K are distinct. So the intersection is trivial.
In this section, we determine the structure of groups based on their order.
One application of order is to determine whether a subset is a conjugacy class.
Theorem: Every element of the same conjugacy class has the same order.
- The key is knowing that, given gn=e, (hgh−1)n=(hgh−1)(hgh−1)…(hgh−1)=hgnh−1=heh−1=e
- This shows that the order of hgh−1 is at most the order of g.
- We can show that the order of hgh−1 cannot be lower than the order of g by contradiction.
- Suppose m=o(hgh−1)<o(g), implying (hgh−1)m=e. Then:
(hgh−1)m = e
hgmh−1 = e
gm = h−1eh
gm = e
which is not possible since m<o(g).
- Therefore the order of every hgh−1 is exactly the order of g.
Because of this, a conjugacy class only contains elements of a single order, so a subset whose elements have different orders cannot be a conjugacy class.
Knowing the order of a given group sometimes lets you identify it outright. Here are some examples:
Theorem: If ∣G∣ is prime, G is cyclic.
- Since ∣G∣ is prime, ∣G∣≥2 and therefore there is a non-identity element g with order >1.
- By Lagrange’s Theorem, the order of every element divides ∣G∣. Since ∣G∣ is prime and o(g)>1, the order of g can only be ∣G∣.
- But that means g generates G, therefore G is cyclic.
Theorem: If ∣G∣=p2 for some prime p, G is abelian.
- The center Z(G) must be a subgroup, which by Lagrange’s Theorem must have order equal to one of 1,p,p2.
- Since G is a p-group and therefore its center has order divisible by p, the center of G cannot be order 1.
- If the center is order p, then ∣G/Z(G)∣=p implies G/Z(G) is cyclic. But that means G is abelian.
- If the center is order p2, it can only be the entire group Z(G)=G which makes G abelian by definition.
Theorem: A group with order pq where p,q are primes is either abelian or centerless.
Recall that a group is abelian iff the quotient group G/Z(G) is cyclic, and that a group is cyclic if it has prime order. Thus if G/Z(G) has prime order, then G is abelian.
Again by Lagrange’s Theorem, the order of the center must divide the order of the group pq, and thus the center can only be of order 1,p,q,pq.
- If the center is order pq then the group is abelian by definition.
- If the center is order 1 then the group is centerless by definition.
- Otherwise, the center has prime order p or q, and therefore G/Z(G) has prime order pq/p=q or pq/q=p, which makes G abelian. (In fact this case cannot occur: if G were abelian then Z(G)=G would have order pq, not p or q.)
Theorem: If ∣G∣=2p and p an odd prime, then G is isomorphic to either C2p or Dp.
- By Lagrange’s Theorem, ∣G∣=2p implies any element g has order dividing 2p, i.e. o(g)∈{1,2,p,2p}.
- If every non-identity element were order 2, then G is abelian (since (gh)2=e implies gh=(gh)−1=h−1g−1=hg), and the set {1,g,h,gh}≅K4 for any non-identity g,h is closed and therefore a subgroup of G. By Lagrange, ∣{1,g,h,gh}∣=4 must divide ∣G∣=2p, a contradiction. So we must have some g of order p or order 2p.
- If g exists such that o(g)=2p, then g generates G, so G≅C2p.
- If g exists such that o(g)=p, then ⟨g⟩ is a subgroup of order p. We want to show G≅Dp.
- H=⟨g⟩ is the unique subgroup of order p. To see this, consider another subgroup K of order p. Since H and K are distinct prime order subgroups, their intersection is trivial. (proof) But then ∣HK∣=∣H∣∣K∣/∣H∩K∣=p2 (proof) which is greater than ∣G∣=2p, contradiction.
- Then for non-identity k∈G∖⟨g⟩, we know that o(k)≠p, since ⟨g⟩ is the unique subgroup of order p. But since k is not the identity (o(k)≠1), and k does not generate G (o(k)≠2p, which is the previous case), we must have o(k)=2 for every k not in ⟨g⟩.
- Since gk∉⟨g⟩ (otherwise k=g−1(gk) would be in ⟨g⟩), o(gk)=2 as well. This means both k and gk are self-inverse: gk=(gk)−1=k−1g−1=kg−1.
- The group G=⟨g,k:gp=k2=e,gk=kg−1⟩ is exactly (isomorphic to) the dihedral group Dp.
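We can sanity-check the order structure numerically; here is a Python sketch modeling D5 (p=5 is our choice) as pairs gikf with the relation gk=kg−1 derived above:

```python
n = 5  # an odd prime

def mul(a, b):
    """(g^i k^f)(g^j k^e): moving the flip k past g^j inverts the rotation."""
    (i, f), (j, e) = a, b
    j = j if f == 0 else -j
    return ((i + j) % n, (f + e) % 2)

def order(x):
    cur, k = x, 1
    while cur != (0, 0):
        cur = mul(cur, x)
        k += 1
    return k

# Every element outside <g> (those with a flip) has order 2, ...
assert all(order((i, 1)) == 2 for i in range(n))
# ...while the non-identity powers of g have order p = 5.
assert all(order((i, 0)) == 5 for i in range(1, n))
```

This matches the proof: the p rotations form the unique subgroup of order p, and all p elements outside it are self-inverse reflections.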
In this section, we introduce the exponent of a group.
The exponent exp(G) of a group G is the LCM of the order of every element of the group. The idea is that exp(G) is the smallest integer that is high enough so that gexp(G)=e for all g∈G. If there is no such integer (because some element g has infinite order), the exponent is infinity.
Let’s prove some useful properties of the exponent:
Theorem: The exponent of a cyclic group Cn is n.
The order of the generator of Cn must be n, so the exponent is at least n. By Lagrange’s theorem, the order of every element of Cn divides its order ∣Cn∣=n, so the exponent is at most n. Therefore the exponent is exactly n.
Theorem: The exponent of a direct product G=G1×G2×…×Gn is equal to the LCM of the exponents of each component Gi.
- Proving this for G=G1×G2 is enough to prove the theorem by induction.
- If either of G1 or G2 have infinite exponent, then LCM of their exponents is infinite, which is equal to the exponent of G (infinity), and we are done.
- Otherwise, assume G1 or G2 have finite exponents.
- For every (g1,g2) of G, by definition of exponent, we know that g1exp(G1)=eG1 and g2exp(G2)=eG2.
- exp(G) is the least k such that (g1,g2)k=(g1k,g2k)=(eG1,eG2)=eG. For the middle equality to hold, k must be a multiple of both exp(G1) and exp(G2). Since exp(G) needs to be the least such k, we have exp(G)=k=lcm(exp(G1),exp(G2)).
Corollary: The exponent of a direct product of cyclic groups G=Ck1×Ck2×…×Ckn is the LCM of all ki.This combines the previous two theorems. The exponent of a direct product is the LCM of the exponent of each Cki, which is ki, thus the above statement follows.
Corollary: Every direct product of cyclic groups G=Ck1×Ck2×…×Ckn contains an element whose order is equal to the exponent of G.If we take gi as the generator of each cyclic group, then (g1,g2,…,gn)∈G has order equal to the LCM of the orders of each gi, i.e. the LCM of all ki, which is exactly the exponent of G.
Corollary: A direct product of cyclic groups G=Ck1×Ck2×…×Ckn is cyclic iff the ki are pairwise coprime.
Since G is a direct product of cyclic groups each of order ki, we have ∣G∣=∏i=1nki.
- (→)
- Since G is cyclic, it has a generator g=(g1,g2,…,gn) of order ∣G∣, where each gi∈Cki generates Cki.
- Since the order of a direct product element (g1,g2,…,gn) must be the LCM of the order of each component gi, we have o(g)=∣G∣=lcm(k1,k2,…,kn).
- But then ∣G∣=∏i=1nki (above) implies ∏i=1nki=lcm(k1,k2,…,kn). Since the LCM of each ki is equal to the product of each ki, the ki must be pairwise coprime.
- (←)
- By the previous proof, there is an element g∈G whose order is equal to the exponent of G, i.e. the LCM of each ki. But since each ki are pairwise coprime, their LCM is equal to their product. That means o(g)=∏i=1nki=∣G∣, which implies g generates G, so G is cyclic.
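These corollaries are easy to check numerically (a Python sketch; the test tuples are arbitrary choices):

```python
from math import gcd, lcm
from functools import reduce
from itertools import combinations

def exponent(ks):
    """Exponent of C_{k1} x ... x C_{kn}: the lcm of the k_i."""
    return reduce(lcm, ks, 1)

def is_cyclic(ks):
    """The product is cyclic iff some element's order reaches the group
    order, i.e. iff the exponent equals prod(k_i)."""
    order = 1
    for k in ks:
        order *= k
    return exponent(ks) == order

# Cyclic exactly when the k_i are pairwise coprime.
for ks in [(2, 3), (2, 3, 5), (2, 4), (6, 10), (4, 9, 25)]:
    pairwise_coprime = all(gcd(a, b) == 1 for a, b in combinations(ks, 2))
    assert is_cyclic(ks) == pairwise_coprime
```

For example C2×C3 has exponent 6 = |G| and is cyclic, while C2×C4 has exponent 4 < 8 and is not.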
-
November 14, 2023.
Exploration 4: Homomorphisms
Questions:
- TODO
TODO normal subgroups are precisely the kernels of homomorphisms
In this section, we define group homomorphisms.
The map σ:G→H is a group homomorphism if it preserves the group product, i.e. σ(ab)=σ(a)σ(b). Preserving the product forces σ to preserve the identity (σ(e)σ(e)=σ(e) implies σ(e)=e) and inverses (σ(g)σ(g−1)=σ(e)=e implies σ(g−1)=σ(g)−1).
Here we’ll just use homomorphism or simply map to refer to group homomorphisms.
Theorem: Under a homomorphism σ:G→H, the order of σ(g) divides the order of g, for each g of finite order.
- Finite order means there’s some integer n such that gn=e.
- Then σ(g)n=σ(gn)=σ(e)=e. Since σ(g)n=e, the order of σ(g) must divide n.
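For example (a Python sketch; reduction mod 4 is our chosen homomorphism Z12→Z4 of additive groups):

```python
from math import gcd

def additive_order(x, m):
    """Order of x in Z_m under addition: m / gcd(x, m)."""
    return m // gcd(x, m)

# sigma: Z_12 -> Z_4, sigma(x) = x mod 4, preserves addition.
# The order of sigma(g) always divides the order of g.
for g in range(12):
    assert additive_order(g, 12) % additive_order(g % 4, 4) == 0
```

E.g. g=1 has order 12 in Z12, while σ(1)=1 has order 4 in Z4, and 4 divides 12.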
If every element of H is mapped to at most once, then σ is one-to-one, or an injective homomorphism.
Theorem: Injective homomorphisms σ:G→H preserve the order of each element.
- We prove the contrapositive: if σ doesn’t preserve the order of some g, then σ is not injective.
- Let m be the order of g∈G and n the order of σ(g)∈H. By the last theorem n divides m, so if the order is not preserved then n<m.
- Then σ(gn)=σ(g)n=e=σ(e), so gn and e both map to the identity element. But gn≠e, since n<m=o(g).
- Therefore σ is not injective.
If every element of H is mapped to at least once, then σ is onto, or a surjective homomorphism. Surjections have the property of preserving positive formulas – properties written without the use of ¬. An example is being abelian: ∀g,h∈G.gh=hg.
Theorem: Surjective homomorphisms σ:G→H preserve positive formulas ϕ.
- The proof is by induction on the positive formula ϕ. It can be one of three things:
- Equalities a=b, possibly composed together with ∧ and ∨ (but not ¬)
- An existential: ϕ=∃x.ψ(x) for some positive formula ψ(x)
- A universal: ϕ=∀x.ψ(x) for some positive formula ψ(x)
- The base case is when the positive formula is an equality a=b. The terms on both sides only feature group product and inverses, which are both preserved by all homomorphisms, so equalities are preserved.
- The first inductive case is when the positive formula is an existential: ϕ=∃x.ψ(x) for some positive formula ψ(x). If ϕ is true then ψ(g) is true for some element g∈G. By induction, σ preserves ψ(x), so ψ(σ(g)) is true in H as well. Then we know that there is some element σ(g)∈H that satisfies ϕ=∃x.ψ(x), so ϕ is preserved as well.
- The second inductive case is when the positive formula is a universal: ϕ=∀x.ψ(x) for some positive formula ψ(x). If ϕ is true then ψ(g) is true for all elements g∈G. By induction, σ preserves ψ(x), so ψ(σ(g)) is true in H as well. Since σ is surjective, we know that σ(g) can be every element in H, so every σ(g)∈H satisfies ϕ=∀x.ψ(x), so ϕ is preserved as well.
Corollary: Surjective homomorphisms preserve abelianness: ∀g,h∈G.gh=hg.
Corollary: Surjective homomorphisms preserve cyclicness: ∃g∈G.∀h∈G.∃n∈Z.h=gn.
Corollary: Surjective homomorphisms preserve the property that every element is self-inverse: ∀g∈G. g²=e.
If σ is both surjective and injective, then every element of H is mapped to exactly once, and σ is an isomorphism. We’ve already seen isomorphisms in the wild.
An isomorphism from a group G to itself is an automorphism of G. Essentially, we're renaming the elements of G in a way that respects the group operation – the result is the same group with relabeled elements. One example, in an abelian group, is the inverse map g↦g⁻¹; note that it is a homomorphism exactly when the group is abelian, since (ab)⁻¹=b⁻¹a⁻¹.
Since automorphisms are essentially permutations, they can form a group. Every automorphism of G forms AutG, the automorphism group of G under composition.
One very special automorphism is the one that maps each element to its conjugate (by some fixed element a∈G). We call this the inner automorphism: the map σa maps g↦aga⁻¹. All the σa (one exists for every a) form Inn G, the group of all inner automorphisms. We have Inn G ≤ Aut G ≤ S_G. - Finding Inn G is trivial (given G, Inn G={g↦aga⁻¹ ∣ a∈G}). Finding Aut G is not trivial, because it's hard to enumerate all the valid ways to rename elements under the operation.
Theorem: Automorphisms σ:G→G fix elements of unique order.Since automorphisms are injective, they only map elements to elements of the same order. In the case of unique order elements, those elements must map to themselves.
Theorem: The elements fixed by all inner automorphisms comprise the center.If g is fixed by all inner automorphisms σa, it means aga−1=g, or ag=ga, for all a. In other words, g commutes with all elements a and therefore is in the center.
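As a concrete check, here is a brute-force sketch in Python (my own example: K4 represented as Z2 × Z2 under componentwise addition mod 2). It enumerates all bijections that preserve the operation to find Aut(K4), and confirms that since K4 is abelian, every inner automorphism is the identity map:

```python
from itertools import permutations, product

# K4 as Z2 x Z2; the group operation is componentwise addition mod 2.
elems = [(0, 0), (0, 1), (1, 0), (1, 1)]
def op(a, b):
    return ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2)

# Brute-force Aut(K4): bijections G -> G that preserve the operation.
autos = []
for perm in permutations(elems):
    sigma = dict(zip(elems, perm))
    if all(sigma[op(a, b)] == op(sigma[a], sigma[b]) for a, b in product(elems, repeat=2)):
        autos.append(sigma)

print(len(autos))  # 6: Aut(K4) is isomorphic to S3

# Each a gives the inner automorphism g -> a g a^-1 (here a^-1 = a).
# K4 is abelian, so every inner automorphism is the identity: Inn(K4) is trivial.
inner = {tuple(sorted((g, op(op(a, g), a)) for g in elems)) for a in elems}
print(len(inner))  # 1
```

This also illustrates the gap mentioned above: Inn(K4) is trivial and easy to find, while Aut(K4) has 6 elements and required a search.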
In this section, we abstractly represent relationships between groups via homomorphisms.
Given a group homomorphism σ:G→H:
- The image of σ, im σ, is the subset of the codomain H mapped to from the domain G.
- The kernel of σ, ker σ, is the subset of the domain G that gets sent to the identity element eH in H.
Some common theorems:
Theorem: Elements are equal under a homomorphism σ:G→H iff they differ by ker σ.For g1,g2∈G to differ by the kernel of the homomorphism, we must have g1g2−1∈ker σ, which is equivalent to saying σ(g1g2−1)=eH, i.e. σ(g1)=σ(g2). Thus, two elements that differ by an element in ker σ are equal under σ.
Theorem: im σ=H iff σ:G→H is surjective.By definition. Since surjective means the whole codomain is mapped to, it’s the same as saying that the mapped subset of H is all of H.
Theorem: ker σ={e} iff σ:G→H is injective.- (→) If σ(a)=σ(b), then σ(ab−1)=eH implies ab−1=eG since only eG maps to eH. But then a=b, therefore σ is injective.
- (←) Since injective homomorphisms preserve the order of each element, and the only order 1 element in any group is e, only eG gets mapped to eH, so the kernel is trivial.
Theorem: For finite G, im σ≅G iff σ:G→H is injective.- (→) If im σ≅G, then ∣im σ∣=∣G∣. Since σ maps G onto im σ and the two sets have the same finite size, every element of im σ is mapped to exactly once, therefore σ is injective. (This direction can fail for infinite groups: a group can be isomorphic to a proper quotient of itself.)
- (←) The elements mapped to by σ are exactly im σ. And since σ is injective, they are mapped to exactly once, therefore G→im σ is an isomorphism.
Theorem: The kernel ker σ is always a normal subgroup.- Let k denote elements of the kernel ker σ.
- The kernel is a subgroup because:
- σ(k1k2)=σ(k1)σ(k2)=ee=e implies k1k2 is in the kernel, and
- σ(k−1)=σ(k)−1=e−1=e implies k−1 is in the kernel.
- The kernel is invariant under conjugation:
- σ(gkg−1)=σ(g)σ(k)σ(g−1)=σ(g)eσ(g−1)=e implies gkg−1 is in the kernel for arbitrary g in the group.
- So the kernel is invariant under conjugation, therefore normal.
Theorem: The image im σ is always a subgroup.- The image is closed under product and inverse, therefore a subgroup of the codomain:
- Product: σ(g1)σ(g2)=σ(g1g2)
- Inverse: σ(g)⁻¹=σ(g⁻¹)
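A small Python sketch makes the kernel and image concrete (my own example: the sign homomorphism S3 → {±1}, with permutations written as tuples of images). It computes ker σ and im σ and checks that the kernel is closed under conjugation:

```python
from itertools import permutations

# S3 as tuples p where p[i] is the image of i; composition (p∘q)(i) = p[q[i]].
S3 = list(permutations(range(3)))
def compose(p, q):
    return tuple(p[q[i]] for i in range(3))
def inverse(p):
    inv = [0] * 3
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

# The sign homomorphism sigma: S3 -> {1, -1} (a group under multiplication).
def sign(p):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

kernel = [p for p in S3 if sign(p) == 1]   # ker sigma = A3, the even permutations
image = {sign(p) for p in S3}              # im sigma = {1, -1}
print(len(kernel), sorted(image))          # 3 [-1, 1]

# The kernel is normal: closed under conjugation by any element of S3.
assert all(compose(compose(g, k), inverse(g)) in kernel for g in S3 for k in kernel)
```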
Universal property of the quotient: Suppose you have a quotient G/H (H normal) with projection π:G→G/H. Then any homomorphism ϕ:G→K whose kernel contains H factors through a unique homomorphism ϕ~:G/H→K such that ϕ~∘π=ϕ, and ϕ~ is injective exactly when ker ϕ=H.- Since ϕ~∘π(g)=ϕ(g) implies ϕ~([g])=ϕ(g), the homomorphism ϕ~ is forced to be the map [g]↦ϕ(g), and therefore this map is unique.
- ϕ~ is well-defined: we need to show that given [g]=[h], ϕ(g)=ϕ(h).
- From [g]=[h] we know [g⁻¹h]=[e], so g⁻¹h is in H. Since the kernel contains H, ϕ(g⁻¹h)=e.
- Then ϕ(g)=ϕ(g)ϕ(g⁻¹h)=ϕ(h).
- ϕ~ preserves product and inverse: they are basically inherited from ϕ.
- ϕ~([a][b])=ϕ~([ab])=ϕ(ab)=ϕ(a)ϕ(b)
- ϕ~([a]⁻¹)=ϕ~([a⁻¹])=ϕ(a⁻¹)=ϕ(a)⁻¹
- Therefore ϕ~ is a homomorphism.
- To show that ϕ~ is injective when ker ϕ=H, we use the same argument as the one used for well-definedness, but going in the opposite direction.
- Assuming ϕ~([g])=ϕ~([h]), i.e. ϕ(g)=ϕ(h), we know e=ϕ(g⁻¹h).
- This means g⁻¹h is in ker ϕ=H, so [g⁻¹h]=[e], which implies [g]=[h]. (If ker ϕ properly contains H, this step fails, and ϕ~ is not injective.)
- Therefore ϕ~ is a homomorphism such that ϕ~∘π=ϕ, and it is injective when ker ϕ=H.
First Isomorphism Theorem: Given σ:G→H, G/ker σ≅im σ.- First, we restrict the codomain of σ to its image im σ, obtaining a surjective homomorphism ϕ:G→im σ.
- Apply the universal property of the quotient to ϕ. Then there is an injective homomorphism ϕ~:G/ker ϕ→im ϕ, such that ϕ~∘π=ϕ.
- By definition, π and ϕ are both surjective. Because ϕ~∘π=ϕ, we know ϕ~ must be surjective as well.
- Since ϕ~ is both injective and surjective, it is an isomorphism between G/ker ϕ and im ϕ.
- Since ϕ is the same as σ but with a restricted codomain, they have the same kernel and image. Therefore G/ker σ≅im σ.
Because of the First Isomorphism Theorem, any homomorphism σ:G→H can be split into three parts: a surjective projection G→G/ker σ, a bijective renaming G/ker σ→im σ, and an injective inclusion map im σ→H.
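A numeric sketch of the theorem (my own example: the homomorphism "multiply by 3" on Z12, written additively):

```python
# sigma: Z12 -> Z12, x -> 3x mod 12, is a homomorphism of (Z12, +).
n = 12
G = range(n)
def sigma(x):
    return (3 * x) % n

kernel = [x for x in G if sigma(x) == 0]   # {0, 4, 8}
image = sorted({sigma(x) for x in G})      # {0, 3, 6, 9}
print(kernel, image)

# First Isomorphism Theorem, numerically: |G| / |ker sigma| = |im sigma|.
assert len(G) // len(kernel) == len(image)

# The projection step: G/ker sigma has exactly as many cosets as im sigma
# has elements, and the renaming between them is the bijective middle part.
cosets = {tuple(sorted((x + k) % n for k in kernel)) for x in G}
assert len(cosets) == len(image)
```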
In this section, we describe properties of homomorphisms with only diagrams.
Consider two homomorphisms σ:A→B, τ:B→C where the image of σ is the kernel of τ. In other words, everything σ maps to will get mapped to eC by τ. This property is called exactness: we say this sequence is exact at B, and any sequence of homomorphisms A →σ B →τ C →υ ⋯ with the property that "the image of one homomorphism is the kernel of the next" is called an exact sequence.
When A is {e}, the resulting exact sequence is a way to show τ is injective without explicitly writing “τ is injective”.
Theorem: The exact sequence {e} →σ B →τ C implies τ is injective.Since the image of σ is necessarily {e} (homomorphisms only map identity to identity) we know that the kernel of τ is {e} as well, and homomorphisms with trivial kernel are injective.
When C is {e}, the resulting exact sequence is a way to show σ is surjective without explicitly writing “σ is surjective”.
Theorem: The exact sequence A →σ B →τ {e} implies σ is surjective.Since the kernel of τ is necessarily all of B (because everything from B gets mapped to e) we know that the image of σ is B as well, implying σ is surjective.
When we combine the two, we get what is called a short exact sequence, which is an exact sequence of the form {e} → A →σ B →τ C → {e}. The existence of a short exact sequence implies σ is injective and τ is surjective, and also a slew of other facts about specific A,B,C:
Theorem: The short exact sequence {e} → A →σ B →τ C → {e} implies A is a normal subgroup of B.- Since σ is injective, A≅im σ. By exactness, we have im σ=ker τ, therefore A≅ker τ. The kernel of τ is a normal subgroup of its domain B.
Theorem: If σ is an inclusion map, the short exact sequence {e} → A →σ B →τ C → {e} implies C≅B/A.- From the previous proof we know that in a short exact sequence, ker τ≅A. If σ is an inclusion map, then ker τ=A.
- Recall the first isomorphism theorem for τ: im τ≅B/ker τ. In this case, we have:
- ker τ=A.
- im τ=C, by exactness at C (the last map sends everything to {e}, so its kernel, all of C, is the image of τ).
- Then C≅B/A.
If B≅A×C, then we say the above sequence splits, and is called a split exact sequence. Usually we’re trying to prove that a sequence is split in order to prove B≅A×C. To do so requires proving the existence of an ‘inverse’ to σ or τ:
Splitting lemma: The short exact sequence {e} → A →σ B →τ C → {e} is split if there is a left inverse homomorphism σ⁻¹:B→A, or if there is a right inverse homomorphism τ⁻¹:C→B.- Left inverse homomorphisms are homomorphisms σ⁻¹ such that σ⁻¹∘σ=idA.
- Right inverse homomorphisms are homomorphisms τ⁻¹ such that τ∘τ⁻¹=idC.
- If we have σ⁻¹:B→A, then:
- Elements of B are writable as the product of some element in ker σ⁻¹ and some element in im σ. Proof:
σ⁻¹(b(σσ⁻¹b)⁻¹) = σ⁻¹(b) σ⁻¹(σ((σ⁻¹b)⁻¹)) = σ⁻¹(b) (σ⁻¹b)⁻¹ = e
so we have b(σσ⁻¹b)⁻¹ ∈ ker σ⁻¹. Also, we have σσ⁻¹b ∈ im σ. The product of the two is b.
- This means we can write elements of B as a product of elements of ker σ−1 and im σ, but it’s only unique if the two subgroups share no elements in B (besides identity).
- Elements k∈ker σ−1 satisfy σ−1(k)=e. If k is also in the image of σ, then k=σ(a) for some a∈A, then σ−1(k)=e becomes σ−1(σ(a))=e which implies a=e and therefore k=σ(a)=e. So the only element shared by ker σ−1 and im σ is e.
- Therefore we can uniquely write elements of B as a product of elements of ker σ−1 and im σ, and thus B is a direct product of the two.
- By exactness, im σ is isomorphic to A. What’s left to prove is that ker σ−1 is isomorphic to C.
- By exactness, im σ=ker τ, and τ is surjective. So every c∈C equals τ(b) for some b∈B; writing b=ki with k∈ker σ⁻¹ and i∈im σ=ker τ, we get c=τ(ki)=τ(k)τ(i)=τ(k). Therefore every c=τ(k) for some k∈ker σ⁻¹, so τ restricted to ker σ⁻¹ is surjective onto C. It is also injective, since if τ(k1)=τ(k2) then k1k2⁻¹ lies in both ker τ=im σ and ker σ⁻¹, hence equals e. So ker σ⁻¹→C is an isomorphism.
- Since B is isomorphic to a direct product of A×C, the sequence is split.
- The argument for τ⁻¹:C→B is similar, but weaker for non-abelian groups: a right inverse of τ only exhibits B as a semidirect product A⋊C in general (e.g. S3 sits in such a sequence with A=A3, C=C2, but S3≇A3×C2). For abelian groups either condition gives B≅A×C.
In summary, a short exact sequence describes an embedding of A into B, and then C≅B/A captures the structure that A doesn’t account for. When σ or τ has an inverse, then the splitting lemma says you can directly piece together the structures A and C to obtain B, via A×C≅B.
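A concrete split sequence, sketched in Python (the particular maps σ, τ and their names are my own choices): {0} → Z3 →σ Z6 →τ Z2 → {0}, with σ(x)=2x mod 6 and τ(y)=y mod 2.

```python
# sigma embeds Z3 into Z6; tau projects Z6 onto Z2.
def sigma(x): return (2 * x) % 6
def tau(y): return y % 2

# Exactness at Z6: im sigma = ker tau = {0, 2, 4}.
assert {sigma(x) for x in range(3)} == {y for y in range(6) if tau(y) == 0}

# A left inverse of sigma: sigma_inv(y) = 2y mod 3, a homomorphism Z6 -> Z3,
# since sigma_inv(sigma(x)) = 4x mod 3 = x.
def sigma_inv(y): return (2 * y) % 3
assert all(sigma_inv(sigma(x)) == x for x in range(3))

# By the splitting lemma, Z6 is a direct product Z3 x Z2: the combined map
# y -> (sigma_inv(y), tau(y)) is a bijection (this is the CRT isomorphism).
images = {(sigma_inv(y), tau(y)) for y in range(6)}
print(len(images))  # 6
```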
todo
Homomorphisms can be identified by their effect on G's generating set
Characteristic subgroups: TODO the intuition. A subgroup H ≤ G where for every automorphism σ : G → G, σ(H) = H. We say H is characteristic in G. I think the intuition is "the subgroup is resistant to renaming".
- Example: Z(G) is characteristic in G
- This comes up more when we start discussing group actions
-
November 16, 2023.
Exploration 5: Permutation groups
Questions:
- TODO
Recall that a permutation of a set is a reordering of elements and is usually written like (1 4 2 8 5 7). Given an arbitrary set X, the permutations on the set are written SX.
This notation is called cycle notation; (1 4 2 8 5 7) is a cycle. More complex permutations are composed of cycles: the permutation (1 4)(2 3) maps 1↦4↦1 and 2↦3↦2. Ordering matters: the rightmost permutation is performed first.
The identity permutation (which does nothing to the set) can be written as a length 1 cycle, like (1).
Composing permutations
Consider the permutation (2 3)(3 1). This first swaps 3 and 1, and then swaps 2 and 3. Since the net effect is the same as the cycle (1 2 3), we consider them to be the same permutation. That is, the set {(2 3)(3 1),(1 2 3)} is a single-element set.
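Composition can be checked mechanically. A minimal Python sketch (representing cycles as dicts is my own choice) verifying that (2 3)(3 1), applied right-to-left, equals the single cycle (1 2 3):

```python
# A cycle like (2 3) maps 2->3 and 3->2; everything else is fixed.
def from_cycle(*cycle):
    return {a: cycle[(i + 1) % len(cycle)] for i, a in enumerate(cycle)}

def apply(perm, x):
    return perm.get(x, x)          # elements not in the cycle are fixed

def compose(p, q):                 # rightmost first: (p∘q)(x) = p(q(x))
    keys = set(p) | set(q)
    return {x: apply(p, apply(q, x)) for x in keys}

lhs = compose(from_cycle(2, 3), from_cycle(3, 1))   # (2 3)(3 1)
rhs = from_cycle(1, 2, 3)                           # (1 2 3)
print(all(apply(lhs, x) == apply(rhs, x) for x in (1, 2, 3)))  # True
```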
Permutations can be used to represent groups. Here’s how:
Theorem: For X a finite set of size n, any permutation group SX≅Sn, the symmetric group.- If X is finite, there is a bijection from X to the set {1…n} via renaming.
- We can imagine transforming every λ∈SX to be a permutation in Sn. Such a permutation would start from {1…n}, go to X via the bijection, apply λ, and go back to {1…n} via the bijection.
- Then SX≅Sn.
Cayley's Theorem: Every finite group G is isomorphic to some permutation group.- We need to find an injective homomorphism σ between a finite group G and SG. If we can, then G≅σ(G).
- σ(G) is a subgroup of SG, and SG≅Sn, so σ(G) corresponds to a subgroup of Sn, i.e. some permutation group, so we are done.
- Define σ(a)=g↦ag (left-product with a). Since this map is invertible (by left-product with a−1), it results in a permutation of elements, i.e. an element of SG.
- σ is a homomorphism, since applying left product of b then left product of a is the same as applying left product of ab.
- σ is obviously one-to-one from its definition, i.e. given two maps g↦ag and g↦bg, if they are equal we must have a=b.
- Then σ is an injective homomorphism, so we are done.
The implication of Cayley’s theorem is: to study the groups of order n, we need only study the symmetric group Sn.
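Cayley's construction can be carried out directly. A Python sketch (my own example: Z4 under addition mod 4, permutations written as tuples of images):

```python
from itertools import product

# Z4 = {0,1,2,3} under addition mod 4; Cayley's theorem embeds it into S_4.
G = [0, 1, 2, 3]
def op(a, b): return (a + b) % 4

# sigma(a) is the permutation "left-multiply by a", as a tuple of images.
def sigma(a):
    return tuple(op(a, g) for g in G)

perms = {a: sigma(a) for a in G}
print(perms[1])  # (1, 2, 3, 0): left-adding 1 cyclically shifts the elements

# sigma is a homomorphism: sigma(ab) = sigma(a) ∘ sigma(b) ...
def compose(p, q):
    return tuple(p[q[i]] for i in range(4))
assert all(sigma(op(a, b)) == compose(sigma(a), sigma(b)) for a, b in product(G, G))

# ... and injective, so Z4 is isomorphic to the permutation group {sigma(a)}.
assert len(set(perms.values())) == len(G)
```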
Motivating example: ∀g∈Cn. g^n=1. Proof. For every g^k∈Cn, (g^k)^n=(g^n)^k=1^k=1.
This proof is simple but required knowledge of cyclic groups, namely that every g∈Cn can be represented as g^k for some k. But using only the fact that Cn is finite, we can ignore everything about cyclic groups and use Cayley’s theorem to reduce the proof to one about permutation groups.
Proof. Since Cn is finite, by Cayley's theorem it is isomorphic to its image in the permutation group SCn, where each g acts by left multiplication. Left multiplication by g decomposes into disjoint cycles that all have the same length: each cycle steps through x, gx, g²x, …, so every cycle has length equal to the order of g. Since these equal-length cycles partition the n elements, that length divides n, and so σ^n=e for every permutation σ in the image. Taking this fact back through the isomorphism gives us ∀g∈Cn. g^n=1.
What kinds of problems can we reduce to permutation group problems? Any problem involving finite groups! This may not always be a good move, since the permutation group can have up to ∣Sn∣=n! elements (a lot). But in general, we can now use theorems about permutation groups in any problem about finite groups.
In this section, we show how we can use groups to permute sets.
Given a group G and a set X, a left action of G on X is a product G×X→X (written gx) satisfying ex=x and (gh)x=g(hx). It takes an element g and maps an element x of X to another element of X.
The idea behind these group actions is that each element of G permutes the elements of X in some way, so that the product of two elements of G acts like doing one permutation after the other, with e being the identity permutation.
For example, let’s have S3 act on elements of the set {1,2,3}. The table might look like this:
        | 1 2 3
(1)     | 1 2 3
(1 2)   | 2 1 3
(1 3)   | 3 2 1
(2 3)   | 1 3 2
(1 2 3) | 2 3 1
(1 3 2) | 3 1 2
Here I picked an ordering of the elements in the set {1,2,3} and arbitrarily named the permutations of S3.
Every (finite) group action is just a natural group action of permutation groups
In the abstract, where each element actually gets permuted to is immaterial – a group action just needs to behave like a permutation (identity, composition, inverses).
Let’s expand on this. Given an arbitrary group G acting on an arbitrary set X (both finite), we can use the table representation above (with elements of G as rows and elements of X as columns) to show that the action of G on X naturally turns out to be permutations, if successive group actions are to follow the composition and inverse laws.
TODO mention cayley tables somewhere with the following facts
- if the cayley table is symmetric about the main diagonal, then the group is abelian
- TODO
Example: G={e,a,b,c}≅K4 and X={e,g,g²}≅C3. We can let the action be anything (even trivial, fixing all of X) but let's explore our choices. Draw the empty action table, knowing that e must act as the identity:
   | e  g  g²
e  | e  g  g²
a  |
b  |
c  |
Each row must contain a permutation of e,g,g2. This is a natural consequence of the inverse law: every permutation must be reversible, so you can’t lose information by dropping one of those elements after a permutation.
In our example, since each element a,b,c in K4 is self-inverse (order 2), we can only treat them as permutations looking like this:
()    | 1 2 3
(1 2) | 2 1 3
(1 3) | 3 2 1
(2 3) | 1 3 2
after bijectively mapping each element of X to a natural number, and mapping {a,b,c} to permutations drawn from this list. Note that matching orders is necessary but not sufficient: the assignment must also respect composition. Since ab=c in K4 but the product of two distinct transpositions is a 3-cycle, we cannot send a, b, c to three distinct transpositions; a valid choice is a↦(1 2), b↦(1 2), c↦().
We only need to define the action on some set of generators of G, then the other rows can be derived by composition. Thus, the process for defining a group action works in general like this:
- Map each element x∈X to a natural number
- Map each generating element g∈G to a permutation whose order divides the order of g, in a way that respects the relations between the generators
This constraint falls out naturally because the action must follow the composition and inverse laws.
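The composition constraint is easy to check by machine. A Python sketch (representations and names are my own) testing whether an assignment of S3 permutations to K4's elements really is an action, i.e. a homomorphism K4 → S3:

```python
# K4 with a*b=c etc., represented as pairs over Z2.
K4 = {'e': (0, 0), 'a': (0, 1), 'b': (1, 0), 'c': (1, 1)}
def op(x, y): return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)
names = {v: k for k, v in K4.items()}          # reverse lookup: pair -> name

def compose(p, q):                             # permutations as tuples of images of 0,1,2
    return tuple(p[q[i]] for i in range(3))

def is_action(assign):
    """True iff g -> assign[g] is a homomorphism K4 -> S3, i.e. a group action."""
    return all(assign[names[op(x, y)]] == compose(assign[g], assign[h])
               for (g, x) in K4.items() for (h, y) in K4.items())

identity, t12, t13, t23 = (0, 1, 2), (1, 0, 2), (2, 1, 0), (0, 2, 1)

# Three *distinct* transpositions fail: (1 2)(1 3) is a 3-cycle, not (2 3).
print(is_action({'e': identity, 'a': t12, 'b': t13, 'c': t23}))        # False
# Respecting ab = c works: a, b -> (1 2) and c -> identity.
print(is_action({'e': identity, 'a': t12, 'b': t12, 'c': identity}))   # True
```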
In this section, we visit the properties of group actions.
Recall the natural group action of S3 acting on elements of the set {1,2,3}.
        | 1 2 3
(1)     | 1 2 3
(1 2)   | 2 1 3
(1 3)   | 3 2 1
(2 3)   | 1 3 2
(1 2 3) | 2 3 1
(1 3 2) | 3 1 2
The permutations {(1),(2 3)} fix 1, the permutations {(1),(1 3)} fix 2, and {(1),(1 2)} fix 3. Call these sets G1, G2, G3 — they are the stabilizers of 1, 2, 3 respectively. Concretely, the stabilizer of an element x∈X is the set of all elements g∈G that fix x.
The orbit of x (denoted Ox, sometimes Gx) is the set of all elements x can get mapped to. Here, each of 1, 2, 3 is able to reach all of 1, 2, 3, so for all x, the orbit Ox=X={1,2,3}. Since everything in X can reach everything in X, there is only one orbit. When an action can take every element to every element like this, and therefore there is one orbit, we call it transitive.
For a non-transitive example, consider the action of S3 on the power set P(X) of X={1,2,3}:
        | {} {1} {2} {3} {1,2} {1,3} {2,3} {1,2,3}
(1)     | {} {1} {2} {3} {1,2} {1,3} {2,3} {1,2,3}
(1 2)   | {} {2} {1} {3} {1,2} {2,3} {1,3} {1,2,3}
(1 3)   | {} {3} {2} {1} {2,3} {1,3} {1,2} {1,2,3}
(2 3)   | {} {1} {3} {2} {1,3} {1,2} {2,3} {1,2,3}
(1 2 3) | {} {2} {3} {1} {2,3} {1,2} {1,3} {1,2,3}
(1 3 2) | {} {3} {1} {2} {1,3} {2,3} {1,2} {1,2,3}
You can see that {1} maps only to {{1},{2},{3}}, etc. In this example, the orbits of each element in P(X) consist of all the sets with the same number of elements.
Theorem: Orbits of a group action of G on X form equivalence classes in X.- Define the relation ∼ on X where x∼y whenever x,y are in the same orbit. This is an equivalence relation: reflexivity comes from ex=x, symmetry from y=gx ⟺ x=g⁻¹y, and transitivity from composing the two group elements involved.
- Since ∼ is an equivalence relation, it partitions X into equivalence classes.
Corollary: Elements of G act on X by permuting each orbit separately.
Theorem: A group action of G on X is transitive iff for every x,y∈X, y=gx for some g∈G.When y=gx for every pair x,y∈X, it is the same as saying that every x can reach every y via G, i.e. the action is transitive.
Theorem: For a transitive group action on X, the action of a normal subgroup N on X gives orbits that all have the same cardinality.- Since the group action is transitive, let y=gx for some fixed x,y∈X. Then:
Ny = Ngx   (because y=gx)
   = gNx   (because N is normal)
   = g(Nx)
shows that left-multiplication by g is a map between arbitrary orbits Ny and Nx. It is a bijection since the inverse is left-multiplication by g⁻¹.
- Since there is a bijection between arbitrary orbits Nx,Ny they must all have the same cardinality.
The stabilizer of some x∈X (denoted Gx) is a subgroup of G composed of every g that permutes that x to itself. Similarly, the normalizer of a subgroup H≤G (denoted N_G(H)) is the subgroup of G composed of every g that conjugates H to itself.
Theorem: For a transitive group action on X, a subgroup H acts transitively on X iff G=HGx for some x∈X.- (→)
- If a subgroup acts transitively on X it means it sends every element of X to every element of X. That is, X=Hx for every x∈X.
- Fix some x∈X and some g∈G, giving us an element y=gx. Since H acts transitively on X, we may write y=hx for some h∈H. Then h⁻¹gx=x, so h⁻¹g∈Gx, and therefore g=h(h⁻¹g)∈HGx. Since this is true for every g, we know that G=HGx.
- (←)
- To prove H acts transitively on X, we need only prove X=Hx for arbitrary x∈X.
- Since the G-action is transitive, X={gx ∣ g∈G}. But we are given G=HGx, so X={hsx ∣ h∈H, s∈Gx}={hx ∣ h∈H}=Hx, since stabilizer elements s fix x.
Corollary: Proper subgroups that contain the stabilizer of any element cannot act transitively.- Intuitively this makes sense – stabilizer elements contribute nothing toward moving x, so enlarging a subgroup only by elements that fix x gains it no new ability to reach the rest of X.
- Note that HGx=H when Gx⊆H. So if H acts transitively on X and contains the stabilizer Gx for some x∈X, then G=HGx=H: the only subgroup containing a stabilizer and acting transitively on X is G itself.
Orbit-Stabilizer Theorem: Given x∈X, ∣Ox∣=∣G∣/∣Gx∣.- Fix some x∈X. To find the size of the orbit Ox, we note that
gx=hx ⟺ x=g⁻¹hx ⟺ g⁻¹h∈Gx
- Define an equivalence relation where g∼h iff g−1h∈Gx. Then gx=hx whenever g,h are in the same equivalence class under ∼. This means each equivalence class corresponds to a distinct element gx that x maps to, i.e. an element of its orbit Ox. Then the number of equivalence classes is the number of elements in the orbit, ∣Ox∣.
- This equivalence relation ∼ is the same as used in the proof of Lagrange’s theorem, so reusing the same argument in that proof, the equivalence classes all have the same size ∣Gx∣ which divides ∣G∣. Therefore the number of equivalence classes is ∣G∣/∣Gx∣.
- Therefore, number of equivalence classes =∣Ox∣=∣G∣/∣Gx∣.
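The theorem is easy to verify numerically. A Python sketch (my own setup: S3 acting naturally on X={0,1,2}, permutations as tuples of images) checking ∣Ox∣·∣Gx∣=∣G∣ for every x:

```python
from itertools import permutations

# S3 acting naturally on X = {0, 1, 2}.
G = list(permutations(range(3)))
X = range(3)

for x in X:
    orbit = {g[x] for g in G}                 # all elements x can reach
    stabilizer = [g for g in G if g[x] == x]  # all g fixing x
    assert len(orbit) * len(stabilizer) == len(G)
    print(x, len(orbit), len(stabilizer))     # each x: orbit size 3, stabilizer size 2
```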
In this section, we study two possible actions of groups on themselves.
Consider a group acting on itself: (g,h)↦gh.
Theorem: The action of a group on itself is transitive.An action of G on X is transitive iff for every x,y∈X, y=gx for some g∈G. In this case both x,y∈G, so we can let g=yx−1 to make the above equation always true.
Theorem: The action of a group on itself has trivial stabilizers.- An element g is in the stabilizer of x iff gx=x.
- In this case, x is an element in G, so we right-multiply by the inverse x−1 to get g=e.
- Then the only stabilizer of any x∈G is the identity element, so all stabilizers are trivial.
Now consider a finite group G acting on itself by conjugation: (g,h)↦ghg⁻¹. This is different from the left-multiplication action we had earlier. In particular, this action need not be faithful: an action is faithful when e is the only element acting as the identity permutation – equivalently, when the intersection of all the stabilizers Gx for x∈X is {e}.
Theorem: The action of a group on itself via conjugation is faithful iff the center is trivial.- For a conjugation group action of G on itself, an element g is in the intersection of all stabilizers iff for every element h∈G, ghg⁻¹=h. In other words, g commutes with every element h, i.e. g∈Z(G).
- So the intersection of all stabilizers is exactly the center, and the action is faithful iff Z(G)={e}.
Theorem: The orbits of a group acting on itself by conjugation are exactly its conjugacy classes.The orbits of a conjugation group action of G on itself are, for each element h∈G, all the elements ghg−1 for g∈G. But this is exactly the definition of the conjugacy classes of h.
A centralizer of g (denoted CG(g)) is the subgroup of elements of G that commute with g.
Theorem: The stabilizer of x in a group acting on itself by conjugation is the centralizer of x.For a conjugation group action, an element g is in the stabilizer of x iff gxg⁻¹=x, i.e. g commutes with x. But this is exactly the definition of the centralizer of x.
Theorem: The singleton orbits of a group acting on itself by conjugation are exactly the central elements of the group.- (→): if b is a conjugate of a then b=gag−1 for some g. But if either a or b is in the center (say ag=ga), then b=agg−1=a. Since b=a when either is in the center, elements in the center end up in singleton conjugacy classes.
- (←): if a is in a singleton conjugacy class, gag−1=a for all g, which means ga=ag for all g, so a is in the center.
Theorem: The action of a group on itself by conjugation is not transitive, unless the group is trivial.A group action is transitive if there is only one orbit. But since identity is always central and therefore has a singleton orbit, there are orbits outside the identity orbit, so long as the group is not trivial.
We can now define the class equation of a group action. For a group action of G on X, given each xi as a representative of each non-singleton orbit Oxi, and ∣F∣ the number of fixpoints (singleton orbits), we have
∣X∣ = ∣F∣ + Σᵢ ∣G∣/∣Gxᵢ∣
Here’s how we arrive at the class equation:
- If you take the union of all orbits, you get the set X.
- Orbits are disjoint, so you can split X into the set of singleton orbits (fixpoints) F and all non-singleton orbits Oxi to get ∣X∣=∣F∣+∑i∣Oxi∣.
- By orbit-stabilizer theorem, we know ∣Ox∣=∣G∣/∣Gx∣.
- Therefore, ∣X∣=∣F∣+∑i∣G∣/∣Gxi∣.
In particular, if the group action in question is the conjugation action of a group on itself, we can recover the original class equation of a group:
∣G∣ = ∣Z(G)∣ + Σᵢ ∣G∣/∣CG(gᵢ)∣
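The class equation can be computed directly. A Python sketch (my own example: S3, whose classes are {e}, the three transpositions, and the two 3-cycles) checking 6 = 1 + 3 + 2:

```python
from itertools import permutations

# The class equation of S3, via the conjugation action on itself.
G = list(permutations(range(3)))
def compose(p, q): return tuple(p[q[i]] for i in range(3))
def inverse(p):
    inv = [0] * 3
    for i, v in enumerate(p): inv[v] = i
    return tuple(inv)

def conj_class(x):
    return frozenset(compose(compose(g, x), inverse(g)) for g in G)

classes = {conj_class(x) for x in G}
center = [x for x in G if conj_class(x) == frozenset([x])]   # singleton orbits
noncentral = sorted(len(c) for c in classes if len(c) > 1)

print(len(center), noncentral)   # 1 [2, 3]
# |G| = |Z(G)| + sum of non-singleton class sizes: 6 = 1 + 2 + 3.
assert len(G) == len(center) + sum(noncentral)
```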
misc notes
Class Equation. Let G be a finite group and let G act on itself by conjugation: y·x := yxy⁻¹.
- the orbits are the conjugacy classes
- the stabilizers are centralizers Gx = {y ∈ G ∣ yxy⁻¹ = x} = Z_G(x)
- the singleton orbits are the elements of the center Z(G) = {z ∈ G ∣ zg = gz for all g ∈ G}
Class Eqn: If xᵢ are representatives of all non-central conjugacy classes in G, then ∣G∣ = ∣Z(G)∣ + Σᵢ [G : Z_G(xᵢ)]
Theorem: In Sn, conjugating a permutation σ by another permutation τ is the same as decomposing σ into cycles and applying τ to each cycle element.Consider an arbitrary part of a cycle (… a b …) in σ — in other words, σ(a)=b. But then τστ−1(τ(a))=τσ(a)=τ(b) shows that the corresponding part of the cycle in τστ−1 is (… τ(a) τ(b) …).
Corollary: Permutations of the same cycle type are in the same conjugacy class. Permutations of different cycle types are in different conjugacy classes.
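This corollary can be verified by brute force. A Python sketch (my own example: S4, permutations as tuples of images) checking that conjugation preserves cycle type, and that cycle types carve S4 into its 5 conjugacy classes:

```python
from itertools import permutations
from collections import Counter

n = 4
G = list(permutations(range(n)))
def compose(p, q): return tuple(p[q[i]] for i in range(n))
def inverse(p):
    inv = [0] * n
    for i, v in enumerate(p): inv[v] = i
    return tuple(inv)

def cycle_type(p):
    seen, lengths = set(), []
    for start in range(n):
        if start in seen: continue
        length, x = 0, start
        while x not in seen:
            seen.add(x); x = p[x]; length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

# Every conjugate tau sigma tau^-1 has the same cycle type as sigma.
assert all(cycle_type(compose(compose(t, s), inverse(t))) == cycle_type(s)
           for s in G for t in G)

# Cycle types partition S4 into 5 conjugacy classes of sizes 1, 3, 6, 6, 8.
print(sorted(Counter(cycle_type(p) for p in G).values()))  # [1, 3, 6, 6, 8]
```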
Theorem: The conjugacy classes of Sn may split into 2 conjugacy classes in its subgroup An.- Consider the conjugation action on Sn. By the orbit-stabilizer theorem, we have ∣Oσ∣=∣Sn∣/∣(Sn)σ∣ for an arbitrary permutation σ in Sn, where the stabilizer (Sn)σ is the centralizer of σ.
- The stabilizer (Sn)σ may or may not contain odd permutations.
- If it contains no odd permutations, then it is completely contained in An, so (An)σ=(Sn)σ, and we have ∣Oσ∣=∣An∣/∣(An)σ∣=(∣Sn∣/2)/∣(Sn)σ∣, halving the orbit (the conjugacy class) of σ.
- If it contains some odd permutation τ₀, then the even permutations (An)σ=An∩(Sn)σ form an index-2 subgroup of (Sn)σ (the odd ones are exactly the coset τ₀(An)σ), so ∣(An)σ∣=∣(Sn)σ∣/2. Then ∣Oσ∣=∣An∣/∣(An)σ∣=(∣Sn∣/2)/(∣(Sn)σ∣/2)=∣Sn∣/∣(Sn)σ∣, meaning the orbit (the conjugacy class) stays the same.
- Therefore, conjugacy classes in Sn may or may not split into 2 conjugacy classes in An.
Theorem: A5 is simple.- A5 consists of even permutations (formed by an even number of transpositions.) Besides the identity, such permutations are either a 3-cycle (a b c), a 5-cycle (a b c d e), or a product of two disjoint 2-cycles (a b)(c d).
- To count the conjugacy classes in A5, we first count these permutations' conjugacy classes in S5, since in Sn permutations of the same cycle type are conjugate to each other.
- 3-cycles: two ways to arrange three chosen elements into a cycle, with C(5,3)=10 choices of elements: 2·10=20.
- 5-cycles: 5! orderings of five elements, where the 5 rotations of a cycle give the same cycle: 5!/5=4!=24.
- Cycle type [2,2,1]: C(5,2)=10 choices for the first 2-cycle, C(3,2)=3 for the second, divided by 2 since the order of the two 2-cycles doesn't matter: (10·3)/2=15.
- Identity: The identity element is always in its own conjugacy class.
- This accounts for all of the 5!/2=60 permutations of A5: 1+20+24+15=60. Now in A5 some conjugacy classes may split into two, because we now lack odd permutations to conjugate by. Let's assume the worst case, that the class sizes 20,24,15 split into 10,10,12,12,15. (In reality, only the size 24 class of 5-cycles splits, into two classes of 12.)
- Normal subgroups are unions of conjugacy classes that include the identity permutation. Given non-identity class sizes at worst 10,10,12,12,15, the possible normal subgroup orders are 1, 11, 13, 16, 21, 23, 25, 26, 28, 33, 35, 36, 38, 40, 45, 48, 50, 60.
- By Lagrange's theorem, the order of a subgroup must divide the order of the group. Since ∣A5∣=5!/2=60, the only possible subgroup orders are its divisors 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60.
- The only common orders of the above two lists are order 1 (the trivial group) and 60 (A5 itself). So they are the only normal subgroups, and thus A5 is simple.
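The whole argument can be double-checked numerically. A Python sketch (permutations as tuples, all names mine) that computes A5's actual conjugacy class sizes and confirms that no union of classes containing the identity has an order dividing 60, other than 1 and 60:

```python
from itertools import permutations, combinations

n = 5
def compose(p, q): return tuple(p[q[i]] for i in range(n))
def inverse(p):
    inv = [0] * n
    for i, v in enumerate(p): inv[v] = i
    return tuple(inv)
def sign(p):
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

A5 = [p for p in permutations(range(n)) if sign(p) == 1]
assert len(A5) == 60

# Conjugacy classes of A5 (conjugating only by even permutations).
classes = {frozenset(compose(compose(g, x), inverse(g)) for g in A5) for x in A5}
sizes = sorted(len(c) for c in classes)
print(sizes)  # [1, 12, 12, 15, 20]: the 5-cycles split, the 3-cycles don't

# Candidate normal subgroup orders: 1 + any sum of non-identity class sizes.
nontrivial = [len(c) for c in classes if len(c) > 1]
possible = {1 + sum(s) for r in range(len(nontrivial) + 1)
            for s in combinations(nontrivial, r)}
print(sorted(o for o in possible if 60 % o == 0))  # [1, 60]: A5 is simple
```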
-
November 17, 2023.
Exploration 6: Groups as matrices
Questions:
- TODO
Group representation examples
Recall that we can represent the complex numbers C with matrices:
1 = ⎡1 0⎤    i = ⎡0 -1⎤    a+bi = ⎡a -b⎤
    ⎣0 1⎦        ⎣1  0⎦           ⎣b  a⎦
This works because i²=−1 holds for these matrices, and the matrices for 1 and i are linearly independent, spanning a copy of the plane R². It is one of many possible matrix representations of the complex numbers.
In more formal terms, this defines a homomorphism R:C→GL2(R) which we call the matrix representation of C on R2. In this case the dimension of the representation is 2 – it takes 2 by 2 matrices for this representation to represent the complex numbers.
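A quick check that the representation really is a homomorphism, sketched in Python (the helper names are my own):

```python
# The matrix representation of C: a+bi -> [[a, -b], [b, a]].
def to_matrix(a, b):
    return [[a, -b], [b, a]]

def mat_mul(m, p):
    return [[sum(m[i][k] * p[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# i^2 = -1 holds for the matrix of i.
i = to_matrix(0, 1)
print(mat_mul(i, i))  # [[-1, 0], [0, -1]], the matrix of -1

# Matrix multiplication agrees with complex multiplication:
# (1+2i)(3+4i) = 3 + 4i + 6i + 8i^2 = -5 + 10i.
assert mat_mul(to_matrix(1, 2), to_matrix(3, 4)) == to_matrix(-5, 10)
```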
Another example: the defining representation R:Dn→GL2(C) of the dihedral group Dn sends
rotation ↦ ⎡ζ 0  ⎤    reflection ↦ ⎡0 1⎤
           ⎣0 ζ⁻¹⎦                 ⎣1 0⎦
where ζ=exp(2πi/n).
Another: the sign representation sign : Sₙ → GL₁(F) of the symmetric group Sₙ takes the sign of each permutation. Since the dimension is 1, it is a linear representation.
Another: All complex linear representations Cₙ → GL₁(ℂ) = ℂˣ of the cyclic group Cₙ can be given as some mapping from Cₙ to one of {ζ⁰,ζ¹,…,ζⁿ⁻¹}. In fact, there are n of them.
Another: One way to represent ℤ is as matrices [1 n; 0 1], so the product of two matrices is equal to addition on the top right. It’s a representation ℤ → GL₂(F).
⎡1 n⎤⎡1 m⎤ = ⎡1 n+m⎤ ⎣ 1⎦⎣ 1⎦ ⎣ 1⎦
Last one:
The obvious way for a group G to act on a vector space V is through a permutation action: view G as a bunch of permutations (see the group actions exploration), let it permute a basis of V, and extend linearly, so that each g∈G becomes a permutation matrix. An action of G on a finite set X and a representation of G on the vector space with basis X carry the same information. This is the permutation representation of G.
Matrix representations are invariant under conjugation
It is a fact that a change of basis leaves the underlying linear transformation unchanged: two matrices A and B represent the same transformation in different bases exactly when they are similar, B = PAP⁻¹ for some invertible change-of-basis matrix P. What happens if we do this to a representation? Conjugating every matrix Rg by the same P yields matrices PRgP⁻¹ that multiply the same way as before, so they still have all the structure needed to represent the group G, just in a different basis.
So matrix representations are invariant under conjugation.
Representations in general
The general construction uses ρ : G → GL(V) to denote a representation of G on V. This generalizes ℂ to some group G, and ℝⁿ to some arbitrary vector space V. Given a representation ρ, let ρ_g = ρ(g) be the image of g over ρ (so ρ₁ = I always, as homomorphisms preserve identity).
A matrix representation is just one where V is the space of column vectors Fⁿ. All representations of G on finite dimensional V can be reduced to matrix representations by choosing a basis for the space. However, representations that differ only by conjugation (change of basis) are equivalent, so we want to study representations without regard to a specific basis. That's why the other definition is used.
Isomorphic representations
The map χ: G → F given by χ(g) = tr(Rg) just sums the diagonal of the matrix for Rg; it is called the character of the representation. Since trace is invariant under conjugation, isomorphic (conjugate) representations have the same character.
Properties of representations
Every matrix representation R of a finite group G is conjugate to a unitary representation.
This is because the image R(G) is a finite subgroup of GLₙ(ℂ), and every finite subgroup of GLₙ(ℂ) is conjugate to a subgroup of the unitary group Uₙ: average the standard inner product over the group to obtain a G-invariant inner product, then change to a basis orthonormal with respect to it.
-
November 28, 2023.
Exploration 7: Sylow theory
Questions:
- TODO
Recall Cauchy’s theorem, which tells you that each prime divisor of ∣G∣ corresponds to a subgroup of that order. But those are far from the only subgroups of G. How do we get at the other subgroups of G? That’s where Sylow theory comes in.
Recall that a p-subgroup is a subgroup of prime power order pn. Sylow’s theorems are all about identifying the p-subgroups of a given group. Here is the first theorem, which can be seen as a generalization of Cauchy’s theorem:
First Sylow Theorem: A finite group G contains a p-subgroup of order pk for every prime power divisor pk of ∣G∣.We proceed by strong induction on ∣G∣. The base case is simple: if ∣G∣=p, then G itself is our required subgroup of order p1.
Otherwise, given that pk divides ∣G∣, we must show that G contains a subgroup of order pk. The inductive hypothesis will let us claim that any group smaller than G whose order is divisible by some prime power pj, has a subgroup of order pj.
First, recall that all groups satisfy the class equation:
∣G∣ = ∣Z(G)∣ + Σᵢ ∣G∣/∣CG(gᵢ)∣
Given that p^k divides the LHS, ∣G∣, it must divide the RHS as well. Then p divides the RHS, and by properties of divisibility, p either divides both ∣Z(G)∣ and the sum Σᵢ ∣G∣/∣CG(gᵢ)∣, or it divides neither of them.
If p divides ∣Z(G)∣, then by Cauchy’s theorem, Z(G) contains an element a of order p. This element a∈Z(G), being central, must generate a normal subgroup of G. That means we can consider the quotient G/⟨a⟩.
- Since pᵏ divides ∣G∣ and ∣⟨a⟩∣=p, it follows that pᵏ⁻¹ divides ∣G/⟨a⟩∣. By the inductive hypothesis, G/⟨a⟩ must contain a subgroup H of order pᵏ⁻¹.
- But since every coset in G/⟨a⟩ contains the same number of elements (namely ∣⟨a⟩∣=p elements), the union of the pᵏ⁻¹ cosets that make up H forms a subset of G of size pᵏ.
- Since this subset is the preimage of the subgroup H under the quotient map, it is itself a subgroup: the required subgroup of order pᵏ.
Otherwise, if p fails to divide ∑ᵢ ∣G∣/∣CG(gᵢ)∣, then at least one term ∣G∣/∣CG(gᵢ)∣ in the sum must be indivisible by p.
- If ∣G∣ is divisible by pᵏ but ∣G∣/∣CG(gᵢ)∣ is indivisible by p, it follows that ∣CG(gᵢ)∣ contains all the factors of p in ∣G∣ and is therefore divisible by pᵏ.
- Since the order of CG(gᵢ) is divisible by pᵏ, we’d like to use the inductive hypothesis on CG(gᵢ) (because centralizers are always subgroups of G) to immediately show that it must contain a p-subgroup of order pᵏ, as required. But to use the inductive hypothesis, we need ∣CG(gᵢ)∣<∣G∣.
- We can show ∣CG(gᵢ)∣<∣G∣ by showing that ∣G∣/∣CG(gᵢ)∣>1. We do this by ruling out ∣G∣/∣CG(gᵢ)∣=1: that would imply a singleton conjugacy class, but in the class equation, singleton conjugacy classes are counted in ∣Z(G)∣, not the sum ∑ᵢ ∣G∣/∣CG(gᵢ)∣. Thus ∣G∣/∣CG(gᵢ)∣≠1, so ∣G∣/∣CG(gᵢ)∣>1.
In summary, if there isn’t a central element of order p that lets us inductively find a subgroup of order pᵏ, there must be a centralizer that contains a subgroup of order pᵏ.
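The class equation that drives this induction can be checked directly on a small group. A sketch for S₃ (by orbit-stabilizer, each conjugacy class has size ∣G∣/∣CG(gᵢ)∣, so the class sizes below are exactly those quotients):

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

G = list(permutations(range(3)))   # S3, order 6
center = [z for z in G if all(compose(z, g) == compose(g, z) for g in G)]

# Conjugacy classes {h g h^-1 : h in G} of the non-central elements.
classes = {frozenset(compose(compose(h, g), inverse(h)) for h in G)
           for g in G if g not in center}

sizes = sorted(len(c) for c in classes)
assert len(G) == len(center) + sum(sizes)   # the class equation: 6 = 1 + 2 + 3
print(len(center), sizes)   # 1 [2, 3]
```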
If there is a p-subgroup for every prime power pᵏ that divides ∣G∣, it follows that there is a maximal p-subgroup, in the sense that no other p-subgroup properly contains it. The order of this maximal subgroup, known as a Sylow p-subgroup, must be the largest power pᵏ that divides ∣G∣. Sylow p-subgroups need not be proper (a Sylow p-subgroup can be the whole group). The key property of Sylow p-subgroups that we like to use is the following:
Theorem: Any p-subgroup containing a Sylow p-subgroup P must be equal to P.
By definition, no p-subgroup properly contains a Sylow p-subgroup. Thus if a p-subgroup contains a Sylow p-subgroup, the containment must be improper, i.e. the two subgroups are equal.
Sylow p-subgroups need not be unique: there can be many different Sylow p-subgroups in a given group. The first Sylow theorem implies that Sylow p-subgroups always exist. So how many are there? This leads us to the second Sylow theorem:
Second Sylow Theorem: The number of Sylow p-subgroups contained in a finite group G is congruent to 1 (mod p).
Lemma: Let P be a Sylow p-subgroup, and let g∈G be an element of order pᵏ. Then gPg⁻¹=P implies g∈P.
Assuming gPg⁻¹=P (i.e. g is in the normalizer of P), P is invariant under conjugation by g. Since P is already invariant under conjugation by its own elements, P is invariant under conjugation by both g and P, making it a normal subgroup of R=⟨g,P⟩. Then the quotient R/P, being generated by gP, must be cyclic, and it must be a p-group since g has order pᵏ. Then by ∣R∣=∣R/P∣⋅∣P∣, the order of R is the product of the orders of two p-groups, making R a p-group itself. But by containing the Sylow p-subgroup P, R must equal P. This implies R=⟨g,P⟩=P, and therefore g∈P.
Now let Sylp(G) denote the set of all Sylow p-subgroups of G, and consider the conjugation action of G on Sylp(G). That is to say, for every g∈G and P∈Sylp(G), gPg⁻¹ is another Sylow p-subgroup in Sylp(G). Let P,Q be two Sylow p-subgroups in Sylp(G). Consider the P-orbit OP of Q under this action (all the subgroups obtainable by conjugating Q by elements of P). By the orbit-stabilizer theorem, we have: ∣OP∣ = ∣P∣/∣StabP(Q)∣ Here, ∣OP∣ can only be a power of p, since its factors come from the order of the p-subgroup ∣P∣=pᵏ. If ∣OP∣=p⁰=1, the orbit is {Q} and Q is invariant under conjugation by all elements g∈P. But by the lemma, if Q is invariant under conjugation by an element g of p-power order, then g∈Q. Since this holds for all g∈P, we have P⊆Q, and since p-subgroups containing Sylow p-subgroups are equal to them, P=Q. Therefore the only singleton P-orbit is the orbit {P} of P itself, and all the others have order a positive power of p.
Since all Sylow p-subgroups are conjugate (the conjugation action of G on Sylp(G) is transitive), every Sylow p-subgroup in Sylp(G) can be obtained by conjugating Q by elements of G. Thus Sylp(G) is partitioned into P-orbits. Again, exactly one of these orbits has order 1 while all the others have order a positive power of p, so we have ∣Sylp(G)∣ ≡ 1 (mod p).
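The congruence n_p ≡ 1 (mod p) can be verified by brute force on a small group such as S₄, of order 24 = 2³·3. A sketch (the closure-based subgroup search is my own, assuming nothing beyond the definitions):

```python
from itertools import permutations, combinations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

e = tuple(range(4))
G = list(permutations(range(4)))            # S4, order 24 = 2^3 * 3

def closure(gens):
    """Subgroup generated by gens: brute-force closure under products."""
    H = {e, *gens}
    while True:
        new = {compose(a, b) for a in H for b in H} - H
        if not new:
            return frozenset(H)
        H |= new

# Sylow 3-subgroups have order 3, hence are cyclic: one generator suffices.
syl3 = {c for g in G if len(c := closure([g])) == 3}
# Sylow 2-subgroups have order 8; two generators suffice (they are dihedral).
syl2 = {c for a, b in combinations(G, 2) if len(c := closure([a, b])) == 8}

print(len(syl3) % 3, len(syl2) % 2)   # 1 1  (n_3 = 4, n_2 = 3)
```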
Recall the Fundamental Theorem of Finite Abelian Groups, which states that every finite abelian group G can be decomposed into a direct product of p-groups: cyclic groups of prime power order.
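For instance, ℤ/12 decomposes as ℤ/4 × ℤ/3. A one-line sanity check that the natural map x ↦ (x mod 4, x mod 3) is a bijection (it clearly respects addition, so being a bijection makes it an isomorphism):

```python
# x -> (x mod 4, x mod 3) on Z/12; 12 distinct images means it's a bijection.
images = {(x % 4, x % 3) for x in range(12)}
assert len(images) == 12
print(len(images))   # 12
```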
Now consider the non-abelian groups.
In this section, we describe the third Sylow theorem.
Sylow theory deals with the structure of finite groups, and hinges on the Sylow theorems, which help immensely with proving groups solvable.
Define a Sylow p-subgroup as a maximal p-subgroup of G (it has prime power order and is not contained in any other p-subgroup).
Third Sylow Theorem: The number of Sylow p-subgroups np is congruent to 1 (mod p). If the group has order pⁿq, then np divides q.
- First part:
- Let S be the set of all Sylow p-subgroups of G, which necessarily have the same order pⁿ since they are maximal.
- The conjugate of a subgroup is a subgroup of the same order. Then conjugating the Sylow p-subgroups in S permutes that set.
- Conjugating the subgroups in S by g will fix the Sylow p-subgroups that contain g, because subgroups are closed under group product.
- Conjugating the subgroups in S by elements of P will fix only P itself: if every element of P fixed some other Q∈S under conjugation, then P would normalize Q, and maximality (via the lemma above) would force P=Q. This means every other Sylow p-subgroup is permuted to another Sylow p-subgroup.
- But then conjugating by P permutes S∖{P} with no fixed points.
- Each element of P generates a cyclic subgroup H whose order, by Lagrange’s Theorem, divides ∣P∣=pⁿ.
- Furthermore, the length k of each cycle of this permutation must divide the order of that cyclic subgroup, because conjugating by H k times must lead you back to the same subgroup. Therefore, every cycle length in this permutation must be a power of p.
- The fact that there are no fixed points implies that there are no cycles of length 1, and therefore every cycle length in the permutation is divisible by p.
- If a permutation on a set has only cycles of length divisible by p, then the size of the set is divisible by p. This means ∣S∖{P}∣=∣S∣−1 is divisible by p, which implies ∣S∣ ≡ 1 (mod p).
- Second part:
- Conjugation by an arbitrary element g∈G is a permutation of S.
- Define the homomorphism f: G → Sym(S), which maps each element of G to the permutation of S it induces by conjugation.
- The kernel K of this homomorphism is a normal subgroup of G containing all the elements g that fix every subgroup in S under conjugation (i.e. they map to the identity permutation of S). The elements of G/K then represent the distinct permutations of S arising from conjugation.
- Since all ∣S∣ Sylow p-subgroups are conjugate, this action on S is transitive, so by the orbit-stabilizer theorem ∣S∣ divides ∣G∣=pⁿq.
- Since ∣S∣ ≡ 1 (mod p), ∣S∣ shares no factor with pⁿ, therefore ∣S∣ divides q.
An immediate application arises from the Third Sylow Theorem. First, np divides ∣G∣/pⁿ where pⁿ is the highest power of p dividing ∣G∣. If you look at the divisors of ∣G∣/pⁿ and find that 1 is the only divisor congruent to 1 (mod p), then np=1, so for the given p there is only one Sylow p-subgroup P. But a subgroup that is unique for its order is normal, meaning you can take G/P to reduce the order of the group, and if both P and the resulting group G/P are solvable then G is solvable.
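This pruning of divisors is mechanical enough to script. A sketch (the function name is mine): given ∣G∣ and p, list the candidate values of n_p.

```python
def sylow_counts(n, p):
    """Candidate values of n_p for a group of order n: divisors of
    n / p^k (p^k the highest power of p dividing n) that are ≡ 1 (mod p)."""
    m = n
    while m % p == 0:
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

print(sylow_counts(20, 5))   # [1] -> a unique, hence normal, Sylow 5-subgroup
print(sylow_counts(12, 3))   # [1, 4] -> no conclusion from counting alone
```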
Here’s some specific applications of that idea:
Theorem: Every group with order pq (p,q primes) is solvable.
- Assume that p>q. (If p=q then the order is a prime power, and prime power groups are solvable.)
- Third Sylow Theorem: The number of Sylow p-subgroups np is congruent to 1 (mod p). If the group has order pⁿq, then np divides q.
- In this case, since np divides the prime q it must be either 1 or q. It cannot be q: np ≡ 1 (mod p), but 1<q<p means q ≢ 1 (mod p). Therefore np=1.
- Let G1 be that single Sylow p-subgroup. Lagrange’s Theorem says the order of every subgroup divides the order of the group, pq. Since p-subgroups have prime power order pᵏ and p² does not divide pq, the order of G1 must be p.
- G1 is the only Sylow p-subgroup, and since every subgroup of order p is a Sylow p-subgroup here, it is the only subgroup of order p. Recall that a subgroup that is unique for its order is normal. Therefore G1⊲G.
- Since p-groups like G1 are solvable, and the quotient G/G1 has prime order q (hence is cyclic and solvable), G1⊲G implies that G is solvable.
Theorem: Every group G with order pⁿqᵐ (p,q primes) where none of q,q²,…,qᵐ are congruent to 1 (mod p) is solvable.
- Third Sylow Theorem (in its generalized form): the number of Sylow p-subgroups np is congruent to 1 (mod p) and divides the index qᵐ.
- Since np divides qᵐ it must be a power of q between 1 and qᵐ.
- But np is congruent to 1 mod p.
- Since none of q,q²,…,qᵐ are congruent to 1 (mod p), np must be 1: there is only one Sylow p-subgroup, P.
- Recall that a subgroup that is unique for its order is normal. Therefore P is normal. P has prime power order pⁿ, so it is solvable, and the quotient G/P has prime power order qᵐ, so it is solvable too. With P⊲G and both P and G/P solvable, G is solvable.
And here’s the general application:
Theorem: A group G has a normal Sylow p-subgroup when ∣G∣/pⁿ (pⁿ being the highest power of p dividing ∣G∣) has no divisors besides 1 congruent to 1 (mod p).
- TODO
Theorem: Every group with order < 60 is solvable.
Every integer between 1 and 59 is either prime, a prime power, or the product of two primes, except for the following:
┌──┬─────────┐
│# │primes   │
├──┼─────────┤
│12│2 2 3    │
│18│2 3 3    │
│20│2 2 5    │
│24│2 2 2 3  │
│28│2 2 7    │
│30│2 3 5    │
│36│2 2 3 3  │
│40│2 2 2 5  │
│42│2 3 7    │
│44│2 2 11   │
│45│3 3 5    │
│48│2 2 2 2 3│
│50│2 5 5    │
│52│2 2 13   │
│54│2 3 3 3  │
│56│2 2 2 7  │
└──┴─────────┘
Most of these are of order pⁿqᵐ for primes p,q, and so we need only prove either q,q²,…,qᵐ ≢ 1 (mod p) or p,p²,…,pⁿ ≢ 1 (mod q) to show that they are solvable:
┌──┬─────────┬────────────────────┐
│# │primes   │                    │
├──┼─────────┼────────────────────┤
│12│2 2 3    │3 ≢ 1 (mod 4)       │
│18│2 3 3    │2 ≢ 1 (mod 3)       │
│20│2 2 5    │2,4 ≢ 1 (mod 5)     │
│24│2 2 2 3  │?                   │
│28│2 2 7    │2,4 ≢ 1 (mod 7)     │
│30│2 3 5    │--                  │
│36│2 2 3 3  │?                   │
│40│2 2 2 5  │2,4,8 ≢ 1 (mod 5)   │
│42│2 3 7    │--                  │
│44│2 2 11   │2,4 ≢ 1 (mod 11)    │
│45│3 3 5    │3,9 ≢ 1 (mod 5)     │
│48│2 2 2 2 3│?                   │
│50│2 5 5    │2 ≢ 1 (mod 5)       │
│52│2 2 13   │2,4 ≢ 1 (mod 13)    │
│54│2 3 3 3  │2 ≢ 1 (mod 3)       │
│56│2 2 2 7  │?                   │
└──┴─────────┴────────────────────┘
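The counting criterion can be automated. This sketch (function names are mine) lists, for each order in the table, the primes p for which counting alone forces n_p = 1; orders where the list comes back empty need a different argument:

```python
def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    fs, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            fs.add(d)
            n //= d
        d += 1
    if n > 1:
        fs.add(n)
    return sorted(fs)

def forced_normal_sylow_primes(n):
    """Primes p dividing n for which the Sylow count n_p must equal 1:
    the only divisor of n / p^k that is ≡ 1 (mod p) is 1 itself."""
    forced = []
    for p in prime_factors(n):
        m = n
        while m % p == 0:
            m //= p
        counts = [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]
        if counts == [1]:
            forced.append(p)
    return forced

for n in [12, 18, 20, 24, 28, 30, 36, 40, 42, 44, 45, 48, 50, 52, 54, 56]:
    print(n, forced_normal_sylow_primes(n) or "needs another argument")
```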
TODO