models
What model theory studies is all of these structures – groups, rings, fields, etc. – from a bird’s eye view. It lets us answer questions like:
- What kinds of properties are preserved by homomorphisms on any structure?
- Given a property, what does it take to show that a structure exists that satisfies the property?
- When are two structures equivalent in the sense that they satisfy the same set of properties?
Here are my notes on model theory.
-
November 2, 2023.
Intro to model theory
Model theory is all about figuring out how certain structures (like groups, rings, and fields) satisfy certain properties (like commutativity, “x is prime”, and other statements).
To do this, we need to rigorously define what a property is. We start with atomic formulas. An atomic formula is a relation – either it’s equality:
- ϕ(x,y,z)≡x+y=z
- ϕ(x)≡x+x=0
- ϕ(x,y)≡x+y=y+x
or some other relation like ≤,⊆:
- ϕ(x,y,z)≡x+y≤z
- ϕ(X,Y,Z)≡X∪Y⊆Z
- ϕ(xˉ)≡R(xˉ) (for some relation R)
where all the symbols like +,∪,≤,⊆ come from some language L of symbols. A language consists of constants (like 0), function symbols (like +,∪), and relation symbols (like ≤,⊆,R).
For now, let’s only deal with atomic formulas, which means we won’t be using the symbols ∨,∧,¬,∃,∀.
Here we define A⊨ϕ(aˉ) (for atomic formulas ϕ(aˉ)).
One of the most important relations in model theory is A⊨ϕ(aˉ), read “ϕ(xˉ) is true of aˉ in A” or “aˉ satisfies ϕ(xˉ) in A” or simply “A models ϕ(aˉ)”. Here, A is an L-structure (it assigns meaning to symbols in L). For instance, we might say that (Z,+,=) is a structure that interprets the constant symbols 0,1,2, etc. as integers, the function symbol + as integer addition, and the relation symbol = as integer equality.
One way to see A⊨ϕ(aˉ) is:
- ϕ(xˉ) is some relation constructed purely of meaningless symbols in a language L
- The L-structure A interprets the language L so that the atomic formula ϕ(aˉ) is satisfied,
- for a given assignment of variables aˉ.
For instance, given ϕ(x,y)≡x+y=3, we have (Z,+,=)⊨ϕ(1,2).
To precisely define the relation ⊨ for atomic formulas, we describe two cases:
- Either ϕ is equality, t1=t2,
- or ϕ is some other relation symbol in L, R(t1,…,tn). For example, ⊆ is a relation symbol, in the sense that we can write ⊆(t1,t2) or more commonly, t1⊆t2.
where each ti(xˉ) is some term in our language L. Then, if we let tiA(xˉ) be A’s interpretation of the term ti(xˉ), A⊨ϕ(xˉ) is the same as saying
- (if ϕ is equality) that t1A(xˉ)=t2A(xˉ) (they are interpreted to be equivalent).
- (if ϕ is R(t1,…,tn)) that (t1A(xˉ),…,tnA(xˉ))∈RA, where RA is A’s interpretation of the relation R.
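The definition above can be sketched as a small evaluator. This is only a toy encoding (structures as dicts, terms as nested tuples); all the names here are made up for illustration, not standard notation.

```python
def eval_term(term, structure, assignment):
    """Interpret a term in a structure A under a variable assignment aˉ."""
    if isinstance(term, tuple):                    # function application F(t1, ..., tn)
        f = structure["functions"][term[0]]
        return f(*(eval_term(t, structure, assignment) for t in term[1:]))
    if term in assignment:                         # variable
        return assignment[term]
    return structure["constants"][term]            # constant symbol

def models(structure, formula, assignment):
    """A ⊨ φ(aˉ) for an atomic formula, encoded as (rel_symbol, t1, ..., tn)."""
    rel, *terms = formula
    values = tuple(eval_term(t, structure, assignment) for t in terms)
    if rel == "=":                                 # case 1: equality
        return values[0] == values[1]
    return values in structure["relations"][rel]   # case 2: some other relation R

# (Z, +, =): interprets + as integer addition, constants as integers.
Z = {"constants": {"0": 0, "3": 3},
     "functions": {"+": lambda a, b: a + b},
     "relations": {}}

# φ(x, y) ≡ x + y = 3, checked at aˉ = (1, 2):
phi = ("=", ("+", "x", "y"), "3")
print(models(Z, phi, {"x": 1, "y": 2}))  # True
```

A ⊨ T then amounts to running `models` over every sentence in T.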
Here we define A⊨T (for theories T).
A theory is a set of sentences, i.e. formulas with no parameters. Under a given model, a sentence is either true or false outright, unlike a formula, which can be true for some parameters and false for others. A theory is inconsistent when it contains two sentences that cannot logically be true at the same time (ϕ and ¬ϕ for some ϕ), and consistent otherwise.
We can extend the definition of ⊨ to theories T. For now, assume that the theory T is a set of atomic sentences. Then if A⊨ every atomic sentence in T, we can write A⊨T. Note that inconsistent theories have no model, since it is impossible to satisfy both ϕ and ¬ϕ no matter the model.
An important property is that if A⊨T, then A⊨ every subset of T as well. We’ll see later that we often prove A⊨T by extending T to some T+ where A⊨T+ is easy to prove.
Summary
We say A⊨ϕ(aˉ), or “A models ϕ(aˉ)” when the L-structure A interprets the symbols in the atomic formula ϕ(aˉ) so that it is true. By “true” we mean that:
- if ϕ(xˉ) is an equality relation, that A interprets both sides of the equality as the same thing,
- if ϕ(xˉ) is some other relation R(tˉ) in L, that the assignment of aˉ to variables xˉ makes R(tˉ) true in A’s interpretation of the relation R.
⊨ can be extended to work on sets of sentences, or theories. In particular, if A⊨T, then A⊨ every subset of T as well.
-
November 2, 2023.
Exploration 1: The canonical model of a theory
The goal of this exploration is to show what it takes for a theory T to have a model.
Each section will prove that T has a model for increasingly wider classes of T, starting with T only containing atomic sentences, and ending with T containing first-order sentences.
In this section, we construct the canonical model for a set of atomic sentences T.
For now, we will only discuss theories T where T is a set of atomic sentences (atomic formulas with no variables xˉ). We’ll prove that there is always a model for T, the canonical model of T.
Outline:
- First, we simplify the problem to finding a model of T when T is “=-closed”.
- Later, we define how to make T =-closed by extending T to its “=-closure”.
- The canonical model of T is the model of the =-closure of T, which also models the subset T.
Let’s start. We’d like to extend our theory T to give it some properties that make it easier to find a model for it. Here are two such properties:
- (reflexivity) t=t is in T, for all closed terms t of L, and
- (congruence) s=t in T implies ϕ(s)∈T iff ϕ(t)∈T, for all closed terms s,t of L and atomic formulas ϕ(x) of L.
Intuitively, (congruence) says if two terms are equal, then you should be able to swap them in any sentence in T and the result will still be in T.
If a set of atomic sentences T has both (reflexivity) and (congruence), then it is =-closed (in L). So we’ll go ahead and extend our set of atomic sentences to its =-closure, i.e. its minimal =-closed extension.
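As a concrete (and finite) sketch, here is a fixpoint computation of the =-closure of a small set of atomic sentences. The tuple encoding and function names are ad hoc, and (reflexivity) is restricted to the closed terms that actually occur in T, since the full set of closed terms of L may be infinite.

```python
def subterms(sentence):
    """All closed terms occurring in a sentence's arguments, recursively."""
    out = set()
    for t in sentence[1:]:
        out.add(t)
        if isinstance(t, tuple):
            out |= subterms(t)
    return out

def one_subst(obj, s, t):
    """All ways to replace exactly one occurrence of the term s by t in obj."""
    if obj == s:
        yield t
    if isinstance(obj, tuple):
        for i in range(1, len(obj)):           # obj[0] is a symbol, not a term
            for repl in one_subst(obj[i], s, t):
                yield obj[:i] + (repl,) + obj[i + 1:]

def eq_closure(T):
    """Least extension of T satisfying (reflexivity) and (congruence)."""
    T = set(T)
    while True:
        new = set(T)
        for phi in T:
            new |= {("=", t, t) for t in subterms(phi)}   # (reflexivity)
        for phi in T:
            for eq in T:
                if eq[0] == "=":                          # (congruence): swap s and t
                    _, s, t = eq
                    new |= set(one_subst(phi, s, t))
                    new |= set(one_subst(phi, t, s))
        if new == T:
            return T
        T = new

T = {("=", "a", "b"), ("R", "a")}
closed = eq_closure(T)
print(("R", "b") in closed)       # True: swap a for b inside R(a)
print(("=", "b", "a") in closed)  # True: symmetry falls out of the two properties
```

The loop terminates because substitution never creates new terms, so only finitely many sentences can appear.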
Now let’s use these properties to find a model for T.
Building the canonical model
First of all, what is the underlying set? By definition of ⊨, we know that to model equalities in T, we must interpret both sides of the equality to be the same. An easy way to do this is to have the elements of A be equivalence classes where t1=t2∈T puts t1 and t2 in the same equivalence class.
We prove that t1=t2∈T defines an equivalence relation on closed terms. By (reflexivity) we already have reflexivity. For symmetry: if t1=t2∈T, then since t1=t1∈T, (congruence) lets us swap t1 for t2 on the left to get t2=t1∈T. A similar swap gives transitivity. So t1=t2∈T is a valid equivalence relation.
Therefore, the underlying set for our model shall be equivalence classes of closed terms, where t1 and t2 are equivalent if t1=t2∈T.
Since elements are now equivalence classes of terms, our constant symbols must be interpreted as equivalence classes, and our function symbols and relation symbols must operate on equivalence classes. We do this by lifting the interpretations of the symbols in L to take equivalence classes instead of terms.
Let’s show that this definition preserves the truth of relations in T. This requires showing that in L, whenever t1 and t2 are terms in the same equivalence class (of equality in T), R(t1) is the same as R(t2). But this is true because of (congruence). More generally, we must show R(sˉ)=R(tˉ) for all length-n sequences of closed terms sˉ and tˉ where corresponding si and ti are in the same equivalence class. But this is true by applying (congruence) n times. Thus, lifting the relation symbols from L to apply to equivalence classes is well-defined, and gives us a set of relation symbols for A.
What about the constants and function symbols of A? The constant elements c and function symbols F of L can be lifted from L into A by having them return equivalence classes of terms rather than single terms. We again apply (congruence) recursively to show F(tˉ1) and F(tˉ2) are in the same equivalence class, and so this is well-defined.
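The quotient-by-equality step can be sketched with union-find over a finite toy theory. This is only an illustration of the construction, with made-up names; the real definition works over all closed terms of L.

```python
# Underlying set of the canonical model: equivalence classes of closed
# terms, merged according to the equations in T.
parent = {}

def find(t):
    """Representative of t's equivalence class."""
    parent.setdefault(t, t)
    while parent[t] != t:
        t = parent[t]
    return t

def union(s, t):
    """Record that s = t is in T."""
    parent[find(s)] = find(t)

# A toy =-closed theory: a = b, b = c, and the atomic fact R(a).
equations = [("a", "b"), ("b", "c")]
relation_R = {"a"}

for s, t in equations:
    union(s, t)

# Lift R to equivalence classes: [t] ∈ R^A iff R(t') ∈ T for some t' equal
# to t in T. This lifting is well-defined by (congruence).
R_A = {find(t) for t in relation_R}
print(find("c") in R_A)   # True: a ~ b ~ c and R(a) ∈ T
```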
Summary
In summary, by defining this L-structure A on the =-closure of T, we’ve built the canonical model of T that, by construction, models our given set of atomic sentences T.
In this section, we prove that every model of T is isomorphic to the canonical model, assuming T is composed of atomic sentences.
First we define homomorphisms between L-structures. The idea of a homomorphism is a map that sends every element of an L-structure A to a corresponding element of another L-structure B, in a way that preserves the interpretation of every symbol of L (distinct elements of A may map to the same element of B).
An isomorphism is an invertible homomorphism. This necessarily means each element of A maps to a unique corresponding element of B and vice versa, i.e. it’s a renaming of the elements.
We’d like to prove that the canonical model is unique – that any two canonical models M, N are equal. But models generally aren’t unique since you can always swap around the names of its elements to get an equivalent model. So we prove instead that the canonical model is unique up to isomorphism – any two canonical models are renamings of each other.
Suppose that M is the canonical model for T and N is some other arbitrary model of T. Define the homomorphism f:M→N that maps equivalence classes [t]∈M to the element tN∈N.
- We show that f is a homomorphism:
- (congruence) means all constants and functions are preserved under f.
- M and N both being models of T means all relations are preserved under f.
- Therefore the interpretation of each symbol in L is preserved under f.
- To show that f is well-defined, we need to show that whenever t1,t2 are in the same equivalence class, N must interpret t1 and t2 as the same element (t1N=t2N). But this is true because being in the same equivalence class means t1=t2∈T, so N⊨T implies N⊨t1=t2 implies t1N=t2N by definition of ⊨.
- The reverse shows that f is injective: whenever t1N=t2N, t1=t2∈T must be true by (reflexivity), and therefore t1,t2 are in the same equivalence class.
- f is also surjective since every element n∈N is an interpretation of some term t1 (since N⊨T), and therefore is mapped to by [t1]∈M.
Therefore f is an isomorphism. Since we’ve constructed an isomorphism from the canonical model M to an arbitrary model N of T, this means every model of T is isomorphic to the canonical model, i.e. the canonical model is unique up to isomorphism.
In this section, we show certain first-order theories are also modelled by the canonical model.
The language L we work with is often a subset of L∞ω (“infinitary logic”), an extension of first-order logic. The first symbol ∞ means that conjunctions ⋀ and disjunctions ⋁ in formulas may be infinite. The second symbol ω means that each formula may only quantify over finitely many variables at a time.
By taking all first-order sentences true in a structure A, we obtain its first-order theory Th(A). So every structure has a first-order theory. Conversely, does every first-order theory have a model, like we showed before for atomic sentence theories? The answer turns out to be no: not every first-order theory has a model.
But by extending some atomic sentence theory T with certain first-order sentences, it’s possible to still have the same canonical model be a model of the extended T, as long as the resulting first-order theory maintains the following consistency properties:
- Consistency: ϕ∈T implies ¬ϕ∉T for atomic sentences ϕ
- Congruence: ϕ(s)∈T iff ϕ(t)∈T for atomic formulas ϕ and closed terms s,t of L∞ω, whenever s=t∈T
- Reflexivity: (t=t)∈T for every closed term t of L∞ω
- Double negation: ¬¬ϕ∈T implies ϕ∈T
- Closure under ∧: ⋀Φ∈T implies Φ⊆T
- Its inverse: ¬⋀Φ∈T implies ∃ψ∈Φ.¬ψ∈T
- Closure under ∨: ⋁Φ∈T implies ∃ψ∈Φ.ψ∈T
- Its inverse: ¬⋁Φ∈T implies ∀ψ∈Φ.¬ψ∈T
- Closure under ∀: ∀x.ϕ(x)∈T implies ϕ(t)∈T for every closed term t of L∞ω
- Its inverse: ¬∀x.ϕ(x)∈T implies ¬ϕ(t)∈T for some closed term t of L∞ω
- Closure under ∃: ∃x.ϕ(x)∈T implies ϕ(t)∈T for some closed term t of L∞ω
- Its inverse: ¬∃x.ϕ(x)∈T implies ¬ϕ(t)∈T for every closed term t of L∞ω
The motivation for these properties: #1 and #4 encode negation, #2 and #3 encode =-closure, #5/#6 and #7/#8 encode ∧ and ∨, #9/#10 and #11/#12 encode ∀ and ∃. These properties are known as consistency properties of T, because they ensure that T remains consistent under the various properties of our logical connectives. Such theories T satisfying the consistency properties are called Hintikka sets (for L∞ω).
The 11th property, closure under ∃ (∃x.ϕ(x)∈T implies ϕ(t)∈T for some closed term t), is more commonly known as the Henkin property, which is special for reasons we’ll get to soon.
But why do we care about this specific assortment of properties? And why are these properties enough to encode everything we need about their respective operations?
- To answer the first question: recall that we want the canonical model to fit T, and so we need to have properties on T that make sure that the canonical model remains a model of T after adding non-atomic sentences. Non-atomic sentences in the language L∞ω are precisely atomic sentences that are bound together by the logical connectives ¬,∨,∧,∃,∀. So we need to separately ensure that the properties of ¬,∨,∧,∃,∀ are held in the =-closure of T, recalling that the =-closure of T is the actual theory that we construct the canonical model on; see the previous section on the canonical model for atomic sentences.
- To answer the second question: recall that for the canonical model A to model T, we need to ensure that every sentence in T is in RA. When T had only atomic sentences (i.e. single relations), it was easy to map each to the corresponding relation in RA. But now that T has non-atomic sentences, we need to ensure that what we know in first-order logic holds in T as well. For example, if A models T and T contains ϕ and ψ, we know logically that A⊨ϕ∧ψ by definition of ⊨, but that’s not necessarily true in T unless we make it true by adding the corresponding property. By encoding the axioms of our logical connectives ¬,∨,∧,∃,∀ in this manner, we arrive at the consistency properties above. (The two properties for =-closedness, #2 and #3, correspond to the fact that T must be a =-closure, as mentioned in the previous bulletpoint.)
Hopefully that clears up the motivation for Hintikka sets; now we can say:
(Theorem 2.3.3) If T is a Hintikka set for L, the canonical model for (atomic sentences in T) models T.
Theorem 2.3.3 implies that any theory T that can be extended to a Hintikka set has a model (the canonical model). “Just extend it to a Hintikka set” is a lot to ask, since there are so many properties to fulfill. Luckily, all the properties above hold if the following smaller set of properties holds:
(Theorem 2.3.4) If
- (finite satisfiability) every finite subset of T has a model,
- (completeness) ϕ or ¬ϕ is in T but not both, for every sentence ϕ in L
- (Henkin property): every ∃x.ϕ(x) in T implies ϕ(t) in T for some closed term t of L
…then T is a Hintikka set for L.
If a theory T can be extended to get all three properties, then the extension is a Hintikka set, and therefore T has a model. Proof that these three properties imply a Hintikka set:
Two of the consistency properties follow immediately:
- (Completeness) implies (#1) consistency: if ϕ∈T then ¬ϕ∉T for atomic sentences ϕ.
- The (Henkin property) is exactly (#11) closure under ∃: ∃x.ϕ(x)∈T implies ϕ(t)∈T for some closed term t.
To derive the remaining consistency properties quickly, we first prove three short lemmas:
- (finite consistency) Every finite subset of T is consistent (no contradictions).
- Proof: Every finite subset of T has a model (finite satisfiability), and inconsistent theories have no model, so every finite subset of T cannot be inconsistent.
- (finite strong completeness) Every sentence logically derivable (⊢) from a finite subset of T is in T.
- Proof: Let U be a finite subset of T. If U⊢ϕ, then U∪{¬ϕ} is finite but inconsistent, and therefore fails to be a subset of T (finite consistency). This means ¬ϕ∉T, which by (completeness) implies ϕ∈T.
- (finite strong completeness corollary): if {ϕ}⊢ψ, then ϕ∈T→ψ∈T.
- That is, if ψ follows from ϕ via deductions on logical operators, then we know that the property ϕ∈T→ψ∈T holds on T.
- Proof: if ϕ∈T then {ϕ} is a finite subset of T, so ψ∈T immediately follows from (finite strong completeness).
- This trivially extends for multiple hypotheses, {ϕ1,ϕ2,…}⊢ψ, as well as none, {}⊢ψ.
Since the majority of our consistency properties are in the form ϕ∈T→ψ∈T where {ϕ}⊢ψ, we can use the (finite strong completeness corollary) to immediately prove them.
- {}⊢(t=t) proves reflexivity: (t=t)∈T for every closed term t.
- (Reflexivity is an axiom in our language, so we can always derive it from nothing.)
- {¬¬ϕ}⊢ϕ (for every sentence ϕ) proves double negation: ¬¬ϕ∈T implies ϕ∈T.
- {∧Φ}⊢ϕ (for every ϕ in Φ) proves closure under ∧: ⋀Φ∈T implies Φ⊆T.
- {¬∨Φ}⊢¬ψ (for every ψ in Φ) proves inverse of closure under ∨: ¬⋁Φ∈T implies ∀ψ∈Φ.¬ψ∈T.
- {∀x.ϕ(x)}⊢ϕ(t) (for every closed term t) proves closure under ∀: ∀x.ϕ(x)∈T implies ϕ(t)∈T for every closed term t.
- {¬∃x.ϕ(x)}⊢¬ϕ(t) (for every closed term t) proves inverse of closure under ∃: ¬∃x.ϕ(x)∈T implies ¬ϕ(t)∈T for every closed term t.
- {¬∀x.ϕ(x)}⊢∃x.¬ϕ(x) proves ∃x.¬ϕ(x)∈T. Then, using the (Henkin property) on ∃x.¬ϕ(x), we get inverse of closure under ∀: ¬∀x.ϕ(x)∈T implies ¬ϕ(t)∈T for some closed term t.
- {s=t,ϕ(s)}⊢ϕ(t) and {s=t,ϕ(t)}⊢ϕ(s) (for every atomic formula ϕ(x)) prove both sides of congruence: ϕ(s)∈T iff ϕ(t)∈T for atomic formulas ϕ(x) and closed terms s,t, whenever (s=t)∈T.
We’ve proved 10 out of 12 properties; the remaining ones are closure under ∨ and the inverse of closure under ∧. They both require showing some ψ∈Φ exists in T without specifying which ψ. Since we can’t derive (⊢) this unknown ψ, we can’t use (finite strong completeness) here. Instead, we’ll prove by contradiction – not having the property lets us construct an inconsistent finite subset of T, violating (finite consistency).
- Closure under ∨: ⋁Φ∈T implies ∃ψ∈Φ.ψ∈T.
- Towards contradiction, assume ⋁Φ∈T but ¬∃ψ∈Φ.ψ∈T. Using (completeness), this implies ∀ψ∈Φ.¬ψ∈T.
- Φ is finite since we only allow finite conjunction/disjunction in first-order logic.
- But then T has an inconsistent finite subset {⋁Φ,¬ψ0,¬ψ1,…}, which contradicts (finite consistency). This finishes the proof.
- Inverse of closure under ∧: ¬⋀Φ∈T implies ∃ψ∈Φ.¬ψ∈T.
- Let ¬Φ denote “Φ but with ¬ prepended to each ψ∈Φ”.
- {¬⋀Φ}⊢⋁¬Φ proves ¬⋀Φ∈T implies ⋁¬Φ∈T, by way of (finite strong completeness).
- Then, closure under ∨ gives ∃ψ∈Φ.¬ψ∈T, as required.
Theorem 2.3.4 tells you a set of minimum requirements to make an extension of T into a Hintikka set: every finite subset must have a model, it must be a complete theory, and it must be closed under ∃ (Henkin property). If you can extend T with these properties, you have a model for T. We will use this in the next section.
In this section, we show the absolute minimum requirement for a first-order theory to have a canonical model. (Compactness theorem)
Previously, we found that not all first-order theories have a model, but certain extensions of atomic sentences as first-order theories do have a model (the canonical model) if they can be extended to get a Hintikka set: specifically, if (1) every finite subset of T has a model, and T is (2) complete and (3) closed under ∃, then it has a model.
The following theorem shows that (2) and (3) are unnecessary; (1) alone is enough to find a model for first-order T, again by taking the canonical model of a larger (cleverly-crafted) theory. This fundamental result of model theory is known as the compactness theorem for first-order logic, and is the more general way to find when a given first-order theory has a model.
The compactness theorem: If every finite subset of T has a model, then T has a model. (where T is a theory in a first-order language L)
Proof.
- Given that every finite subset of T has a (non-empty) model, we want to show that T has a model.
- The first step is to extend T to a Hintikka set T+. We will do so by extending the language L to a larger first-order language L+, and then later taking the L-reduct of the model of T+ to get a model of the original theory T in L.
- To extend T into a Hintikka set T+, we should satisfy the three requirements from the previous section:
- every finite subset of T+ has a model
- every sentence ϕ of L+ appears in T+ as either ϕ or ¬ϕ
- every sentence ∃x.ψ(x) in T+ implies ψ(t) in T+, for some closed term t
- The plan for satisfying these requirements is twofold. The first step is extending L, as discussed. We add a set of κ new unique constants ci to L, to be used as the closed terms t for the third condition above. To ensure there are enough constants for each ∃ in T+, κ must be at least the cardinality of L plus the cardinality of T (so there’s at least one constant per sentence). The resulting language L+ has cardinality κ. Let (ϕi) be an enumeration of the sentences of L+.
- The second step towards our Hintikka set is to define a chain of theories Ti in L+, where, starting with T0=T, each theory Ti+1 attempts to add one more sentence ϕi to the previous theory Ti in a way that continues to satisfy the three conditions above and is therefore a Hintikka set. Then the union of the chain Tκ=⋃Ti (essentially the “final” theory after all sentences ϕi have been added) is a Hintikka set. The precise mechanism of adding each ϕi will be described soon.
Then Tκ (the theory obtained after considering every sentence ϕi) will achieve the above three requirements of being a Hintikka set by ensuring
- every finite subset of each Ti has a model. Then Tκ satisfies the first requirement. We can ensure this if, when we try to add each ϕi, we only actually add ϕi if every finite subset of the resulting Ti+1 has a model. Otherwise, Ti+1=Ti.
- either ϕi or ¬ϕi is in Tκ.
- We can ensure this by tweaking the way we add ϕi. When we find that ϕi cannot be added since some finite subset Ui of Ti∪{ϕi} has no model (i.e. Ui∖{ϕi} can derive ¬ϕi), we’ll add ¬ϕi instead.
- In this second case we know that all finite subsets Uj⊆Ti∪{¬ϕi} have a model, because if some finite subset Uj had no model, then Uj∖{¬ϕi} could derive ϕi. But Ui∖{ϕi} and Uj∖{¬ϕi} are both subsets of Ti, so their (finite) union has a model, which is impossible if one part derives ¬ϕi and the other derives ϕi.
- Therefore we’re still preserving (1), and since we’re either adding ϕi or ¬ϕi for every ϕi in the language, we ensure completeness by construction.
- Note: if we’ve already added ϕi in the form ¬ϕj for some ϕj, we’ll just skip adding ϕi (so Ti+1=Ti).
- every ∃x.ψ(x) in Tκ implies ψ(t) in Tκ for some t in L+
- This is the reason we extended L to L+ – the constants ci we add will be used for t.
- We’ll again modify the mechanism of adding each ϕi: whenever we add some ϕi that happens to be in the form ∃x.ψ(x), we additionally add ψ(ci), where ci is a constant reserved for ϕi and therefore unused up to now. Note that ϕi cannot equal some ¬ϕj we added earlier, since ϕi is in the form ∃x.ψ(x) rather than a negation. This completes the proof of (3), since we explicitly ensure ψ(t) exists for every ∃x.ψ(x).
- However, by modifying the mechanism of adding each ϕi, we will have to prove (1) and (2) again. We will only prove (1), since (1) implies (2) using the same argument before.
- (1 again) every finite subset of our new Ti (= the old Ti with ψ(ci) added) has a model.
- In the case where ϕi is not in the form ∃x.ψ(x), we don’t add ψ(ci), so by the old argument (1) still holds.
- In the case where ϕi is in the form ∃x.ψ(x), we add ψ(ci) as well. This changes things – we must show that the new Ti+1=Ti∪{∃x.ψ(x),ψ(ci)} has a model.
- From the old argument, we know that every finite subset of the intermediate Ti∪{∃x.ψ(x)} has a model.
- Let U be one of those finite subsets that contains ∃x.ψ(x), with a model A⊨U.
- The plan is to define a new L+-structure B modelling the same theory U such that B⊨ψ(a) means the same thing as B⊨ψ(ci), thus showing that every subset containing ψ(ci) of our new Ti+1=Ti∪{∃x.ψ(x),ψ(ci)} has a model B.
- Since A⊨∃x.ψ(x), there is some element a∈A such that A⊨ψ(a). We can define B as identical to A, except that it interprets the constant symbol ci as a. In other words, B⊨ψ(a) means the same thing as B⊨ψ(ci). We ensured that ci is a fresh constant, since this is the first time we’re considering the corresponding ϕi, so no sentence in U constrains ci, making B still a model of U.
- Since A⊨ψ(a) by definition of A, we have B⊨ψ(a), which means the same thing as B⊨ψ(ci), thus showing that every subset containing ψ(ci) of our new Ti+1=Ti∪{∃x.ψ(x),ψ(ci)} has a model B.
Finally, we’ve constructed some Tκ that is a Hintikka set of L+ by virtue of satisfying the three conditions from the previous section. Therefore it has a model (the canonical model of its atomic sentences). Since Tκ includes T by construction (it is a chain built on top of T), we can take the L-reduct of the model of Tκ to get a model of the original theory T in L.
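The completeness half of this construction can be sketched in a propositional analogue: walk an enumeration of sentences and add ϕi whenever the result stays satisfiable, otherwise add ¬ϕi. The witness step for ∃ (adding ψ(ci)) is omitted, and satisfiability is decided by brute force over a fixed finite set of atoms, so this is only a cartoon of the real proof; every name below is invented for illustration.

```python
from itertools import product

ATOMS = ["p", "q"]

def holds(phi, v):
    """Evaluate a formula (atom string, ("not", φ), or ("and", φ, ψ))."""
    if isinstance(phi, str):
        return v[phi]
    if phi[0] == "not":
        return not holds(phi[1], v)
    if phi[0] == "and":
        return holds(phi[1], v) and holds(phi[2], v)

def satisfiable(T):
    """Brute-force check: some truth assignment over ATOMS satisfies all of T."""
    return any(all(holds(phi, dict(zip(ATOMS, bits))) for phi in T)
               for bits in product([False, True], repeat=len(ATOMS)))

def extend(T, enumeration):
    """Lindenbaum-style step: add φ if consistent with T so far, else ¬φ."""
    T = list(T)
    for phi in enumeration:
        if satisfiable(T + [phi]):
            T.append(phi)
        else:
            T.append(("not", phi))
    return T

T0 = [("and", "p", "q")]
enumeration = ["p", "q", ("not", "p")]
T_plus = extend(T0, enumeration)
print(satisfiable(T_plus))   # True: the completed extension stays consistent
```

Note how ("not", "p") conflicts with T0, so its negation is added instead, which is exactly the completeness-by-construction argument above.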
There are many many proofs of the compactness theorem, and this is just one of them, called the Henkin construction. For more proofs of the compactness theorem, see: https://mathoverflow.net/a/45501
The converse of the compactness theorem is quite trivial. If a theory T has a model M, then M models every finite subset of T since they are subsets of T.
“T is satisfiable” is another way to say “T has a model”. So the compactness theorem can also be stated “T is satisfiable if and only if T is finitely satisfiable.”
In the end we’ve found the minimum requirement for T to have a canonical model – we just need to prove that every finite subset of T has a model, which is often done by construction.
-
November 3, 2023.
Exploration 2: Types
The goal of this exploration is to show how properties are expressed in model theory, and how we can derive the compactness theorem using such properties.
Currently we consider models to be some unordered jumble of elements that give meaning to constants, functions, and relations in a language L.
But sometimes it’s useful to talk about a specific subset of those elements. For instance, if we have a model of the theory of integers, we might want to single out the elements of the model that signify prime integers.
Typically we do this by defining some formulas that fully describe said elements. In the prime number example, we might have ϕ(p)≡¬∃xy.x≥2∧y≥2∧x×y=p, “a number is prime if there is no product of two integers ≥2 equal to it”. Then the elements a in our model that satisfy ϕ(a) are exactly the prime numbers.
In general, given A⊨T we can write a set of formulas tpA(xˉ) consistent with the given theory T, called a type over T. This type singles out the tuples aˉ of elements of A that satisfy every formula in it; we say that such aˉ realize tpA.
Recall that theories are made up of sentences, which are formulas without parameters. Intuitively, a type over T is a parameterized theory, with the additional condition that it must be consistent with some existing theory T (in the above case, the theory of integers). Types can also be unparameterized (if the set of parameters is empty) but typically they are parameterized.
In this section, we show how to compare two types.
Every element of a model has its own type. If we consider a single element a of a model A of some theory T, we can define the type of a over T as the set of all formulas ϕ(x) (with one free variable x) where ϕ(a) is true in A. This type is realized by a by construction, so every element of a model has its own type tp(a).
Sometimes a type is a superset of another type. For example, anything that is true about the set of prime numbers is true about the prime integer 5 in particular. That means the set of formulas true about the integer 5, tp(5), contains the set of formulas true about the prime numbers, tpprime. We say tp(5) is stronger than tpprime, or tpprime is weaker than tp(5), because tpprime⊆tp(5). So we have a partial ordering of types based on inclusion.
By definition of tp(5), there aren’t any more formulas true about 5 that aren’t already in tp(5). Much like theories can be complete (having ϕ or ¬ϕ for every sentence ϕ in the language), tp(5) is a complete type (having ϕ(xˉ) or ¬ϕ(xˉ) for every formula ϕ(xˉ) in the language). Complete types are by definition the strongest types. We can’t add any more formulas ϕ(xˉ) to it without making it inconsistent with the given theory.
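The ordering tpprime ⊆ tp(5) can be made concrete in a toy where formulas are a handful of named predicates over a finite slice of the integers. The predicate names and the restriction to four formulas are invented purely for illustration.

```python
def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, n))

# A tiny "language" of formulas φ(x), as named predicates.
FORMULAS = {
    "prime": is_prime,
    "odd": lambda n: n % 2 == 1,
    "equals_5": lambda n: n == 5,
    "positive": lambda n: n > 0,
}

def tp(a):
    """The set of (named) formulas true of the single element a."""
    return {name for name, phi in FORMULAS.items() if phi(a)}

# tp_prime: the formulas true of every prime in this finite slice of Z.
primes = [n for n in range(2, 50) if is_prime(n)]
tp_prime = set.intersection(*(tp(p) for p in primes))

print(tp_prime <= tp(5))   # True: tp(5) is stronger (a superset of tp_prime)
```

Here tp_prime works out to {"prime", "positive"} ("odd" drops out because of the prime 2), while tp(5) contains all four formulas.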
In this section, we define the model of a theory by its strongest types.
Now let’s model the theory of integers with this concept of strongest types (complete types).
Getting the strongest types
The first step is to find all of the strongest types. Let’s start from an example type, like tpprime, which encodes all facts about the prime numbers. Given such a type, we’d like to get all types stronger than it, such as tp(5). Define the set containing tpprime as well as every type stronger than tpprime. We call this set the principal filter on tpprime. In our case, the principal filter on tpprime includes tpprime, tp(5), tp(3), in fact all the tp(prime number), and in fact every type that singles out any subset of the prime numbers.
The principal filter on complete types like tp(5) contains only itself, {tp(5)}. When a non-empty filter, like {tp(5)}, can’t be extended with additional stronger types without making two types inconsistent with each other, we call it an ultrafilter, meaning it’s a filter that already contains all the strongest types.
We care about ultrafilters because while a filter might contain types that conflict (are inconsistent with each other), an ultrafilter necessarily cannot have conflicting types. By taking all the ultrafilters of the theory, we essentially have access to all the strongest types that don’t conflict with each other.
Showing that there is an ultrafilter for every filter
Since the set of all types is typically infinite, to take all the ultrafilters in the space of types requires applying the Axiom of Choice (in the form of Zorn’s Lemma).
To this end we prove the Ultrafilter lemma: every filter F is contained in some ultrafilter. Proof:
- Form a partial ordering M of filters that contain F.
- Zorn’s lemma: M has a maximal element (= our ultrafilter) if we can prove that every non-empty chain in M has an upper bound in M.
- Take an arbitrary non-empty chain in M.
- Every union of a chain of filters is a filter. So take the union of this chain to get a filter U. U contains every filter in the chain, and therefore is an upper bound of this chain.
- To show U is in M, we just need to show that U contains F, since every filter containing F is in M. But U is the union of some filters from M that contain F by definition, which means U contains F too.
- By Zorn’s Lemma, there is a maximal filter in M, the ultrafilter containing F.
This lemma shows that we can extend any filter to an ultrafilter.
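Here is a finite sketch of the lemma, using filters of subsets of a small set X (ordered by inclusion) rather than filters of types. On a finite set no Zorn’s lemma is needed: any point in the intersection of the filter generates a principal ultrafilter extending it. All names are illustrative.

```python
from itertools import combinations

X = {1, 2, 3}

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

def principal_ultrafilter(point):
    """All subsets of X containing the point: a maximal filter."""
    return {S for S in powerset(X) if point in S}

# The filter generated by {1, 2}: every superset of {1, 2}.
F = {S for S in powerset(X) if S >= {1, 2}}

# Any point of the intersection of F gives an ultrafilter extending F.
point = min(frozenset.intersection(*F))
U = principal_ultrafilter(point)

print(F <= U)                                            # True: U extends F
print(all(S in U or (X - S) in U for S in powerset(X)))  # True: U is maximal
```

The second check is the "ultra" condition: for every set, either it or its complement is in U, so no strictly larger filter exists.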
Getting every ultrafilter via finite subsets of the theory
Next, we require every finite subset of T to have a model. Sound familiar? It’s the same requirement as the Compactness Theorem, which we are essentially proving by constructing a model of an arbitrary theory.
Now create a filter F containing each finite subset of T as a type. Extend this filter with every type that is stronger than some type already in F, so that the resulting filter F+ contains all the strongest types (complete types) consistent with T. This might add types that are inconsistent with each other, so we use the Ultrafilter lemma to extend F+ to an ultrafilter U, ensuring that all types within U agree on every formula ϕ in the language.
Defining a model via an ultrafilter of types
Then, every complete 1-type (a type with a single free variable) p(x) in the ultrafilter U defines some element ap in our model A, so that ap realizes p(x) in our model.
Since our domain consists of elements that correspond to types, we can identify every term in the language with a type in order to construct our model. In particular, we identify every term t with the complete 1-type in U containing ϕ(x)≡x=t. The ultrafilter U ensures that we can’t have two complete 1-types that disagree on what element t is equal to, so this is well-defined.
This gives an interpretation to all the constant symbols (when t is some constant c) and implicitly requires us to lift all the function symbols to operate on our elements ap. This is well-defined, because our complete types in U define the outcome of every possible f(xˉ) via formulas like ϕ(y)≡y=f(xˉ). We lift all the relation symbols similarly.
Because this construction is based on complete types consistent with T, the resulting elements ap are defined so that every sentence in T is true.
In the end we have essentially proved the compactness theorem again — we constructed a model for T directly by treating finite subsets of T as types.
-
November 6, 2023.
Exploration 3: Homomorphism preservation theorems
The goal of this exploration is to show which formulas are preserved by which homomorphisms.
Each section introduces a different kind of formula, and for each we show
- a certain class of homomorphisms preserves this kind of formula
- the homomorphisms that preserve this kind of formula are exactly this class of homomorphisms
and therefore, homomorphisms can be classified by the kinds of formulas they preserve.
In this section, we show that homomorphisms are the maps that preserve atomic formulas.
Recall that homomorphisms are maps between L-structures preserving a structure’s constant elements, function symbols, and relation symbols.
Homomorphisms are equivalently defined as the maps that preserve atomic formulas ϕ(xˉ). That is, the homomorphisms are exactly those maps f:A→B for which A⊨ϕ(aˉ) implies B⊨ϕ(faˉ).
This equivalence can be proved by a sequence of equivalences:
f is a homomorphism
⟺ tˉ(aˉ)∈RA implies ftˉ(aˉ)∈RB  (f preserves relations between terms t(xˉ) in L)
⟺ tˉ(aˉ)∈RA implies tˉ(faˉ)∈RB  (preserving relations also means ftˉ(aˉ)=tˉ(faˉ))
⟺ A⊨R(tˉ(aˉ)) implies B⊨R(tˉ(faˉ))  (tˉ∈RA means A⊨R(tˉ), by definition of ⊨)
⟺ A⊨ϕ(aˉ) implies B⊨ϕ(faˉ)  (R(tˉ(xˉ)) defines an arbitrary atomic formula ϕ(xˉ))
⟺ f preserves atomic formulas  (definition of preserving atomic formulas ϕ(xˉ))
where aˉ is an arbitrary sequence of elements of A. The step claiming that preserving relations also means preserving terms (ftˉ(aˉ)=tˉ(faˉ)) requires some explaining – the idea is that if you preserve the relation G(x,y,z)⟹G(fx,fy,fz), then you preserve the function g(x,y)=z⟹g(fx,fy)=fz, and vice versa.
So we’ve shown that if f is a homomorphism, it preserves atomic formulas.
An embedding is a homomorphism where relations are preserved in both directions; that is, aˉ∈RA⟺faˉ∈RB. Then you may replace “implies” with “iff” in each line of the proof above to show a corollary: that the embeddings are precisely the maps f:A→B where A⊨ϕ(aˉ) iff B⊨ϕ(faˉ) for all atomic formulas ϕ. (Theorem 1.3.1c)
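To make the distinction concrete, here’s a quick Python check (the structures ℤ/6 and ℤ/3 with the mod-3 map are my own example, not from the text): the map preserves the atomic formula x+y=z, but fails to be an embedding because equality is not preserved in the reverse direction.

```python
# Z/6 -> Z/3 via x mod 3: a homomorphism of additive groups.
A, B = range(6), range(3)
addA = lambda x, y: (x + y) % 6
addB = lambda x, y: (x + y) % 3
f = lambda x: x % 3

# f preserves the atomic formula phi(x,y,z) = "x + y = z":
# whenever A |= phi(x,y,z), we get B |= phi(fx,fy,fz).
for x in A:
    for y in A:
        z = addA(x, y)
        assert addB(f(x), f(y)) == f(z)

# But f is NOT an embedding: equality is also an atomic formula, and it is
# not preserved in both directions, since 3 != 0 in Z/6 yet f(3) = f(0).
assert 3 != 0 and f(3) == f(0)
```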
In this section, we define precisely what is meant by A⊨ϕ (for all first-order formulas in L∞ω).
Previously we defined A⊨ϕ only for atomic formulas ϕ, since that’s all we needed to define canonical models. (We used Hintikka sets to help us define models for non-atomic formulas, but the proof did not require the definition of A⊨ϕ for non-atomic ϕ.)
Now we will be diving into such non-atomic formulas ϕ in L∞ω, and as such we require a precise definition of ⊨ on non-atomic formulas. Here is a definition of ⊨ by induction on first-order formulas ϕ, for which there are six cases:
- ϕ(aˉ) is atomic, say R(t1,…,tn): then A⊨ϕ(aˉ) iff (t1A(aˉ),…,tnA(aˉ))∈RA. (as defined earlier)
- A⊨¬ψ(aˉ) iff A⊭ψ(aˉ) (that is, iff A⊨ψ(aˉ) fails).
- A⊨⋀Ψ(aˉ) iff A⊨ψ(aˉ) for every formula ψ∈Ψ.
- A⊨⋁Ψ(aˉ) iff A⊨ψ(aˉ) for some formula ψ∈Ψ.
- A⊨∀y.ψ(y,aˉ) iff A⊨ψ(c,aˉ) for every element c∈A.
- A⊨∃y.ψ(y,aˉ) iff A⊨ψ(c,aˉ) for some element c∈A.
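This inductive definition translates directly into code. Here’s a brute-force Python sketch of ⊨ over a finite structure (the tuple encoding of formulas and the relational encoding of + are my own conventions; terms are restricted to plain variables for simplicity):

```python
def models(domain, rels, phi, env):
    """A |= phi under the variable assignment `env`, by induction on phi."""
    kind = phi[0]
    if kind == "rel":        # atomic: R(x1, ..., xn), variables looked up in env
        _, R, xs = phi
        return tuple(env[x] for x in xs) in rels[R]
    if kind == "not":        # A |= ¬psi  iff  A |= psi fails
        return not models(domain, rels, phi[1], env)
    if kind == "and":        # A |= ⋀Psi  iff  A |= psi for every psi in Psi
        return all(models(domain, rels, p, env) for p in phi[1])
    if kind == "or":         # A |= ⋁Psi  iff  A |= psi for some psi in Psi
        return any(models(domain, rels, p, env) for p in phi[1])
    if kind == "forall":     # A |= ∀y.psi  iff  psi holds of every element
        _, y, psi = phi
        return all(models(domain, rels, psi, {**env, y: c}) for c in domain)
    if kind == "exists":     # A |= ∃y.psi  iff  psi holds of some element
        _, y, psi = phi
        return any(models(domain, rels, psi, {**env, y: c}) for c in domain)
    raise ValueError(f"unknown formula kind: {kind}")

# The structure (Z/4, +), with + encoded as the ternary relation "x + y = z".
domain = range(4)
rels = {"plus": {(x, y, (x + y) % 4) for x in domain for y in domain}}

# "Every element has an additive inverse": ∀x. ∃y. x + y = 0,
# with 0 supplied as the parameter z.
inverse = ("forall", "x", ("exists", "y", ("rel", "plus", ("x", "y", "z"))))
assert models(domain, rels, inverse, {"z": 0})
assert not models(domain, rels, ("rel", "plus", ("x", "y", "z")),
                  {"x": 1, "y": 1, "z": 3})   # 1 + 1 = 3 fails in Z/4
```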
In this section, we show that embeddings are the maps that preserve existential formulas.
Not only do embeddings preserve atomic formulas in both directions, they preserve quantifier-free ones too.
Quantifier-free formulas (atomic formulas composed with just ∨, ∧, and ¬) are preserved in both directions by embeddings. This amounts to proving A⊨ϕ(aˉ)⟺B⊨ϕ(faˉ), which is done by induction on quantifier-free ϕ (4 cases).
Proof: by induction on ϕ.
- ϕ(aˉ) is atomic: embeddings preserve atomic formulas in both directions, as discussed previously.
- The remaining cases follow from the IH applied to the subformulas ψ: A⊨ψ(aˉ)⟺B⊨ψ(faˉ).
- ϕ(aˉ) is ¬ψ(aˉ): then A⊨¬ψ(aˉ)⟺B⊨¬ψ(faˉ), by negating both sides of the IH.
- ϕ(aˉ) is ⋀Ψ(aˉ): then A⊨⋀Ψ(aˉ)⟺B⊨⋀Ψ(faˉ), by applying the IH to each ψ∈Ψ.
- ϕ(aˉ) is ⋁Ψ(aˉ): then A⊨⋁Ψ(aˉ)⟺B⊨⋁Ψ(faˉ), likewise by the IH.
In this section, we show existential formulas are preserved by embeddings and that universal formulas are preserved in substructures.
While all homomorphisms preserve quantifier-free formulas, certain homomorphisms will preserve quantified formulas ϕ as well.
Recall we proved that the embeddings are precisely the maps f:A→B where A⊨ϕ(aˉ) iff B⊨ϕ(faˉ) for all atomic formulas ϕ; and that they are the homomorphisms that preserve relations in both directions, that is, aˉ∈RA⟺faˉ∈RB.
For an embedding f, we’ve already proved that f preserves ϕ, i.e. A⊨ϕ(aˉ)⟺B⊨ϕ(faˉ), in the case where ϕ is atomic. In the previous section, we saw that preserving atomic ϕ in both directions extends to quantifier-free ϕ. We will now quickly show that embeddings also preserve ∃1-formulas (existential formulas) – formulas consisting of a quantifier-free part with any number of ∃ in front (so they look like ∃x0∃x1∃x2.ψ where ψ is quantifier-free).
Theorem: Embeddings f:A→B preserve ∃1-formulas.
Proof: by induction on the ∃1-formula ϕ(aˉ).
- If ϕ is quantifier-free: already proved above.
- If ϕ is ∃y.ψ(y,aˉ), we have a chain of equivalences:
A⊨ϕ(aˉ)
⟺ A⊨∃y.ψ(y,aˉ)  (since ϕ is ∃ with a quantifier-free part ψ)
⟺ A⊨ψ(c,aˉ) for some c in A  (by definition of ⊨)
⟺ B⊨ψ(fc,faˉ)  (the embedding f preserves quantifier-free ψ in both directions)
⟺ B⊨∃z.ψ(z,faˉ)  (by definition of ⊨)
⟺ B⊨ϕ(faˉ)  (since ϕ is ∃ with a quantifier-free part ψ)
So indeed, embeddings preserve some formulas with quantifiers – the ∃1-formulas. In fact, since the proof works in reverse, we find that the embeddings are the homomorphisms that preserve ∃1-formulas in both directions.
What about the ∀1-formulas? Just like ∃1-formulas, ∀1-formulas (universal formulas) are quantifier-free formulas ϕ(aˉ) with any number of ∀ in front (so they look like ∀x0∀x1∀x2ψ where ψ is quantifier-free). There is a dual theorem regarding ∀1-formulas, and the proof relies on the fact that ∀ is logically equivalent to ¬∃¬.
Theorem: Substructures preserve ∀1-formulas.
Proof:
- Any ∀1-formula can be written as ¬ϕ, where ϕ is an ∃1-formula (push the ¬ past the ∀s).
- WTS: B⊨¬ϕ(faˉ)⟺A⊨¬ϕ(aˉ), given that A is a substructure of B.
- Whenever A is a substructure of B, we have an embedding f:A→B (the inclusion map).
- Since ϕ is an ∃1-formula, it’s preserved by the embedding f: A⊨ϕ(aˉ)⟺B⊨ϕ(faˉ).
- Negating both sides gives ¬(A⊨ϕ(aˉ))⟺¬(B⊨ϕ(faˉ)).
- By definition of ⊨, this is A⊨¬ϕ(aˉ)⟺B⊨¬ϕ(faˉ).
- So substructures preserve the ∀1-formula ¬ϕ.
This proves that passing to a substructure (the “reverse” of an embedding) preserves ∀1-formulas. Since the proof works in reverse as well, we can characterize substructures as exactly the subsets that preserve ∀1-formulas.
In summary, we have shown that embeddings are the homomorphisms that preserve ∃1-formulas, and restricting into a substructure preserves ∀1-formulas.
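A tiny brute-force example (the linear orders {0,1} ⊆ {0,1,2} are my own choice, not from the text) illustrates both directions of the summary: an ∃1-sentence travels up the inclusion embedding, while a ∀1-sentence travels down to the substructure.

```python
def sat(domain, prefix, matrix):
    """Evaluate Q1 x1 ... Qn xn. matrix(x1, ..., xn) by brute force."""
    if not prefix:
        return matrix()
    q, rest = prefix[0], prefix[1:]
    # Bind the outermost variable to each element c of the domain in turn.
    results = [sat(domain, rest, lambda *xs, c=c: matrix(c, *xs)) for c in domain]
    return all(results) if q == "forall" else any(results)

A, B = [0, 1], [0, 1, 2]   # A is a substructure of B under <=

# E1-sentence "∃y. 1 <= y and y != 1" goes UP the embedding (and only up):
phi = lambda y: 1 <= y and y != 1
assert not sat(A, ["exists"], phi)   # false in the substructure A
assert sat(B, ["exists"], phi)       # true in B, with witness y = 2

# A1-sentence "∀y. y <= 1" goes DOWN to the substructure (and only down):
psi = lambda y: y <= 1
assert sat(A, ["forall"], psi)       # true in A
assert not sat(B, ["forall"], psi)   # false in B (y = 2 is a counterexample)
```

The failing directions are exactly the point: truth of the ∃1-sentence in B does not descend to A, and truth of the ∀1-sentence in A does not ascend to B.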
In this section, we show positive formulas are the ones preserved by surjective homomorphisms.
A positive formula is one that has no ¬ remaining after rewriting it in terms of ¬,∨,∧,∃,∀ and pushing all the ¬ inward past the quantifiers.
After seeing the previous preservation theorems, you might guess that there is a certain kind of homomorphism that preserves positive formulas:
Theorem: Surjective homomorphisms f:A→B preserve positive formulas ϕ.
Proof: by induction on positive formulas ϕ.
- If ϕ is quantifier-free: the atomic case was proved previously, and the ∧ and ∨ cases follow by applying the IH to each subformula. (A positive formula has no ¬ case.)
- If ϕ is ∃y.ψ(y,aˉ) for some positive formula ψ, then assume that A⊨ϕ(aˉ). By definition of ⊨, this means there is some c such that A models the positive formula ψ(c,aˉ). By IH, f preserves this positive formula and therefore B⊨ψ(fc,faˉ). Since f is surjective, we know that fc∈B, and therefore B⊨∃y.ψ(y,faˉ), i.e. B⊨ϕ(faˉ), proving that f preserves ϕ.
- If ϕ is ∀y.ψ(y,aˉ) for some positive formula ψ, we proceed by a similar argument. By definition of ⊨ we have A⊨ψ(c,aˉ) for every c∈A. By IH, f preserves this positive formula, so B⊨ψ(fc,faˉ) for every c. Since f is surjective, every element of B is of the form fc, and therefore B⊨∀y.ψ(y,faˉ), i.e. B⊨ϕ(faˉ), proving that f preserves ϕ.
By proving the converse, we get Lyndon’s Positivity Theorem: the homomorphisms that preserve positive formulas are the surjections. The proof of the converse typically goes like this:
Theorem: If a homomorphism f:A→B preserves positive formulas ϕ, it is surjective.
Proof:
- If f preserves positive formulas, it preserves the positive formula ϕ(y)≡∃x.x=y.
- For an arbitrary b∈B we have B⊨ϕ(b), since b=b holds for every b. For f to preserve the positive formula ϕ at b, there must be a preimage a∈A with A⊨ϕ(a) (which again holds, since a=a).
- This establishes that the preimage a exists in A for an arbitrary element b∈B. Therefore, f is surjective.
So we have proved that surjections are exactly the homomorphisms that preserve positive formulas.
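Two toy examples (my own, not from the text) show why surjectivity is the crucial hypothesis: a non-surjective embedding can break a positive ∀-sentence by landing inside a larger structure, while a surjection leaves ∀ nothing new to range over.

```python
# The inclusion of the trivial group {0} into Z/2 is an embedding but not a
# surjection. The positive sentence "∀x. x = 0" is true in {0} but false in
# Z/2: the ∀ in Z/2 also ranges over the element 1, outside the image.
trivial, z2 = [0], [0, 1]
assert all(x == 0 for x in trivial)
assert not all(x == 0 for x in z2)

# A surjection leaves no element out of range. For f: Z/6 -> Z/3, x mod 3,
# the positive sentence "∀x. ∃y. x + y = 0" (inverses exist) transfers:
# every element of Z/3 is f(c) for some c, and an inverse of c maps to an
# inverse of f(c).
z6, z3 = range(6), range(3)
f = lambda x: x % 3
assert all(any((x + y) % 6 == 0 for y in z6) for x in z6)        # true in Z/6
assert all(any((f(x) + f(y)) % 3 == 0 for y in z6) for x in z6)  # images inherit it
assert all(any((x + y) % 3 == 0 for y in z3) for x in z3)        # hence true in Z/3
```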
In this section, we pinpoint the kind of formulas that are preserved by all homomorphisms.
Every formula is preserved by an isomorphism, but we’ve also explored above that certain homomorphisms preserve certain formulas. To summarize, we have learned
- The Łoś-Tarski preservation theorem:
- ∀1-formulas are preserved by substructures.
- ∃1-formulas are preserved by embeddings.
- Lyndon’s positivity theorem:
- Positive formulas are preserved by surjections.
Meanwhile, we’ve been building up the class of formulas that are preserved by all homomorphisms. We first proved that atomic formulas are preserved by all homomorphisms, and the same induction shows that positive quantifier-free formulas (those built with ∧ and ∨ but no ¬) are too, since the ∧ and ∨ cases only need the forward direction.
Finally, we can combine the last two results in the list above: ∃1+-formulas (positive existential formulas) are preserved by both surjective homomorphisms and embeddings, so they must be preserved by all homomorphisms. This relies on the fact that every homomorphism f:A→B factors as a surjection A→im(f) followed by an inclusion embedding im(f)→B. Conversely, if a formula is preserved by every homomorphism, it must be (equivalent to) an ∃1+-formula, since the surjection part of f can only be relied on to preserve positive formulas and the inclusion part only existential ones.
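The factorization argument can be sketched in Python (the particular homomorphism f(x)=2x on ℤ/6 is my own example, not from the text):

```python
# f: Z/6 -> Z/6, f(x) = 2x mod 6, is a (non-surjective) homomorphism.
A = list(range(6))
f = lambda x: (2 * x) % 6
image = sorted({f(x) for x in A})       # im(f) = [0, 2, 4]
assert image == [0, 2, 4]

# Factor f as a surjection onto its image followed by an inclusion embedding.
surj = lambda x: f(x)                   # A -> im(f), surjective by construction
incl = lambda y: y                      # im(f) -> B, the inclusion map
assert all(incl(surj(x)) == f(x) for x in A)

# An E1+ (positive existential) sentence survives both legs: here,
# "∃y. y + y = x" transfers from x to f(x) along the homomorphism.
for x in A:
    if any((y + y) % 6 == x for y in A):            # A |= ∃y. y + y = x
        assert any((y + y) % 6 == f(x) for y in A)  # B |= ∃y. y + y = f(x)
```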
-
November 7, 2023.
Exploration 4: Diagrams
The goal of this exploration is to show that all properties of a model M can be captured by a certain theory (its diagram). In particular, two models are elementarily equivalent exactly when they have the same elementary diagram.
In this section, we show that every embedding can be written equivalently as a model. (Diagram lemma)
Recall that embeddings are precisely maps which preserve existential formulas: A⊨ϕ(aˉ) iff B⊨ϕ(faˉ) for all ∃1-formulas ϕ.
How do we construct an embedding f:A→B? It would mean finding an image in B for every element of A, such that every ∃1-formula ϕ is preserved. This seems hard to construct and verify! With all the tools we have in model theory, there must be an easier way.
We’ll generalize the above procedure. We name every element of A by extending the language with a constant ca for every element a∈A. We define the diagram of A, diag(A), as the theory where for every atomic sentence ϕ(aˉ) or its negation ¬ϕ(aˉ) true in A for elements aˉ, diag(A) contains the sentence ϕ(cˉa) or ¬ϕ(cˉa) for the corresponding cˉa.
Note that by construction, every finite subset of diag(A) is modelled by A, so by compactness, diag(A) has a model. Let B be any model of diag(A). Then there is an embedding f:A→B where each element a of A is mapped to the interpretation of the corresponding ca in B. And f is indeed an embedding: it preserves atomic formulas in both directions, since diag(A) contains every atomic sentence true in A together with the negation of every atomic sentence false in A, and B models all of them.
This means that A embeds into any model of diag(A). Conversely, for every embedding f:A→B, interpreting each ca as fa expands B into a model of diag(A)!
This is called the diagram lemma, which states that the following are equivalent:
- B ⊨ diag(A).
- A (uniquely up to isomorphism) embeds into B.
- f : A → B is an embedding.
Usually the first criterion is written in an even weaker form:
- B can be expanded to a model of diag(A).
This is because we can use the diagram lemma above to obtain the embedding f : A → B′ into the expansion B′, and then pass to the reduct of B′ (forgetting the new constants) to land back in B. The reverse direction is easy: if A embeds into B via f, then interpreting each ca as fa expands B into a model of diag(A).
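Here is the diagram lemma in miniature (the structures and the tuple encoding of diag(A) are my own illustrative choices, not from the text): we record every atomic sentence about A or its negation, check that an expansion of B models them all, and read off an embedding.

```python
A = [0, 1]
# diag(A): for each pair of constants c_a, c_b, record the atomic sentence
# "c_a <= c_b" if it's true in A, and its negation otherwise.
diag = {("le", a, b, a <= b) for a in A for b in A}

# Expand B = ({0,1,2}, <=) by interpreting each constant c_a as a itself.
B = [0, 1, 2]
interp = {a: a for a in A}

# The expansion models diag(A): each recorded truth value is reproduced.
assert all((interp[a] <= interp[b]) == truth for (_, a, b, truth) in diag)

# The induced map f(a) = interp[a] preserves <= in both directions — it is
# an embedding, precisely because diag(A) records negations as well as truths.
f = lambda a: interp[a]
assert all((a <= b) == (f(a) <= f(b)) for a in A for b in A)
```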
In this section, we show that every elementary embedding can be written equivalently as a model. (Elementary diagram lemma)
There is a simple extension of the diagram lemma to elementary embeddings (embeddings that preserve all first-order sentences). Take the elementary diagram of A, eldiag(A), as the theory where for every first-order sentence ϕ(a̅) true in A for elements a̅, eldiag(A) contains the sentence ϕ(c̅ₐ) for the corresponding c̅ₐ.
By the same argument as in the previous section, A elementarily embeds into any model of eldiag(A), via the unique (up to isomorphism) mapping of a to the model’s interpretation of cₐ.
This is the elementary diagram lemma, stating the following are equivalent:
- B can be expanded to a model of eldiag(A).
- A (uniquely) elementarily embeds into B.
- f : A → B is an elementary embedding.
We can even extend this to a theorem showing when two models are elementarily equivalent (i.e. model the same first-order sentences). One way to show two models are elementarily equivalent is to show that they elementarily embed into each other. By the elementary diagram lemma above, we have:
- A ≡ B when:
- A can expand to a model of eldiag(B), and
- B can expand to a model of eldiag(A).
The criterion can be made even simpler. Since we always have A ⊨ eldiag(A) and B ⊨ eldiag(B), we only need eldiag(A) ⊆ eldiag(B), and vice versa. Therefore,
- A ≡ B when eldiag(A) = eldiag(B).
That is, if A and B share the same elementary diagram, then they are elementarily equivalent!
Now we’ll show the converse: when two models are elementarily equivalent, they share the same elementary diagram.
- If A ≡ B, then any sentence ϕ in L is either in eldiag(A) or not.
- If ϕ ∈ eldiag(A) then A ⊨ ϕ.
- Since A ≡ B, B ⊨ ϕ, and therefore ϕ ∈ eldiag(B).
- So eldiag(A) ⊆ eldiag(B).
- The same argument can be used in the other direction, so eldiag(B) ⊆ eldiag(A).
- Therefore the two elementary diagrams are the same.
So in the end we have:
- A ≡ B iff eldiag(A) = eldiag(B).