Exploration 1: The canonical model of a theory
November 2, 2023.The goal of this exploration is to show what it takes for a theory to have a model.
Each section will prove that has a model for increasingly wider classes of , starting with only containing atomic sentences, and ending with containing first-order sentences.
In this section, we construct the canonical model for a set of atomic sentences .
For now, we will only discuss theories where is a set of atomic sentences (atomic formulas with no variables ). We’ll prove that there is always a model for , the canonical model of .
Outline:
- First, we simplify the problem to finding a model of when is “-closed”.
- Later, we define how to make -closed by extending to its “-closure”.
- The canonical model of is the model of the -closure of , which also models the subset .
Let’s start. We’d like to extend our theory to give it some properties that make it easier to find a model for it. Here are two such properties:
- (reflexivity) is in , for all closed terms of , and
- (congruence) in implies iff , for all closed terms of and atomic formulas (x) of .
Intuitively, (congruence) says if two terms are equal, then you should be able to swap them in any sentence in and the result will still be in .
If a set of atomic sentences has both (reflexivity) and (congruence), then it is -closed (in ). So we’ll go ahead and extend our set of atomic sentences to its -closure, i.e. its minimal -closed extension.
Now let’s use these properties to find a model for .
Building the canonical model
First of all, what is the underlying set? By definition of , we know that to model equalities in , we must interpret both sides of the equality to be the same. An easy way to do this is to have the elements of be equivalence classes where puts and in the same equivalence class.
We prove is an equivalence relation on . By (reflexivity) we already have reflexivity. By (congruence) if is in then we can swap the terms to get also in , so we have symmetry, and by a similar argument we have transitivity. So is a valid equivalence relation.
Therefore, the underlying set for our model shall be equivalence classes of closed terms, where and are equivalent if .
Since terms are now equivalence classes, our constant symbols must also be equivalence classes, and our function symbols and relation symbols all must operate on equivalence classes. We do this by lifting the relations in to take equivalence classes instead of terms.
Let’s show that this definition preserves the truth of relations in . This requires showing that in , whenever and are terms in the same equivalence class (of equality in T), is the same as . But this is true because of (congruence). More generally, we must show for all length- sequences of closed terms and where corresponding and are in the same equivalence class. But this is true by applying (congruence) times. Thus, lifting the relation symbols from to apply to equivalence classes is well-defined, and gives us a set of relation symbols for .
What about the constants and function symbols of ? The constant elements and function symbols of can be lifted from into by having them return equivalence classes of terms rather than single terms. We again apply (congruence) recursively to show and are in the same equivalence class, and so this is well-defined.
Summary
In summary, by defining this -structure on the -closure of , we’ve built the canonical model of that, by construction, models our given set of atomic sentences .
In this section, we prove that every model of is isomorphic to the canonical model, assuming is composed of atomic sentences.
First we define homomorphisms between -structures. The idea of a homomorphism is that it set every symbol in an -structure to a corresponding symbol in another -structure (symbols may overlap).
An isomorphism is an invertible homomorphism. This necessarily means each unique symbol in maps to a corresponding unique symbol in , i.e. it’s a renaming.
We’d like to prove that the canonical model is unique – that any two canonical models , are equal. But models generally aren’t unique since you can always swap around the names of its elements to get an equivalent model. So we prove instead that the canonical model is unique up to isomorphism – any two canonical models are renamings of each other.
Suppose that is the canonical model for and is some other arbitrary model of . Define the homomorphism that maps equivalence classes to the element .
- We show that
is a homomorphism:
- (congruence) means all constants and functions are preserved under .
- and both being models of means all relations are preserved under .
- Therefore the interpretation of each symbol in is preserved under .
- To show that is well-defined, we need to show that whenever are in the same equivalence class, must interpret and as the same element (). But this is true because being in the same equivalence class means , so implies implies by definition of .
- The reverse shows that is injective: whenever , must be true by (reflexivity), and therefore are in the same equivalence class.
- is also surjective since every element is an interpretation of some term (since ), and therefore is mapped to by .
Therefore is an isomorphism. Since we’ve constructed an isomorphism from the canonical model to an arbitrary model of , this means every model of is isomorphic to the canonical model, i.e. the canonical model is unique up to isomorphism.
In this section, we show certain first-order theories are also modelled by the canonical model.
The language we work with is often a subset of (“infinitary logic”), an extension of first-order logic. The first symbol means that the maximum size of conjunctions and disjunctions in formulas is infinite. The second symbol means that the maximum number of quantifiers can be large but finite.
By taking all sentences in true for a structure , we obtain its first-order theory . So every structure has a first-order theory. Conversely, does every first-order theory have a model, like we showed before for atomic sentence theories? The answer turns out to be no, not every first-order theory has a model.
But by extending some atomic sentence theory with certain first-order sentences, it’s possible to still have the same canonical model be a model of the extended , as long as the resulting first-order theory maintains the following consistency properties:
- Consistency: implies for atomic sentences
- Congruence: iff for atomic formulas and closed terms of , whenever
- Reflexivity: for every closed term of
- Double negation: implies
- Closure under : implies
- Its inverse: implies
- Closure under : implies
- Its inverse: implies
- Closure under : implies for every closed term of
- Its inverse: implies for some closed term of
- Closure under : implies for some closed term of
- Its inverse: implies for every closed term of
The motivation for these properties: #1 and #4 encode negation, #2 and #3 encode -closure, #5/#6 and #7/#8 encode and , #9/#10 and #11/#12 encode and . These properties are known as consistency properties of , because they ensure that is consistent even under the various properties of our logical connectives. Such theories satisfying consistency properties are called Hintikka sets (for ).
The 8th property ( implies ) is more commonly known as the Henkin property, which is special for reasons we’ll get to soon.
But why do we care about this specific assortment of properties? And why are these properties enough to encode everything we need about their respective operations?
- To answer the first question: recall that we want the canonical model to fit , and so we need to have properties on that make sure that the canonical model remains a model of after adding non-atomic sentences. Non-atomic sentences in the language are precisely atomic sentences that are bound together by the logical connectives . So we need to separately ensure that the properties of are held in the -closure of , recalling that the -closure of is the actual theory that we construct the canonical model on; see the previous section on the canonical model for atomic sentences.
- To answer the second question: recall that for the canonical model to model , we need to ensure that every sentence in is in . When had only atomic sentences (i.e. single relations), it was easy to map each to the corresponding relation in . But now that has non-atomic sentences, we need to ensure that what we know in first-order logic holds in as well. For example, if models and contains and , we know logically that by definition of , but that’s not necessarily true in unless we make it true by adding the corresponding property. By encoding the axioms of our logical connectives in this manner, we arrive at the consistency properties above. (The two properties for -closedness, #2 and #3, correspond to the fact that must be a -closure, as mentioned in the previous bulletpoint.)
Hopefully that clears up the motivation for Hintikka sets; now we can say:
(Theorem 2.3.3) If is a Hintikka set for , the canonical model for (atomic sentences in ) models .
Theorem 2.3.3 implies that any theory that can be extended to a Hintikka set has a model (the canonical model). “Just extend it to a Hintikka set” is a lot to ask since there’s so many properties to fulfill. Luckily, all the properties above are true if the following smaller set of properties are true:
(Theorem 2.3.4) If
- (finite satisfiability) every finite subset of has a model,
- (completeness) or is in but not both, for every sentence in
- (Henkin property): every in implies in for some closed term of
…then is a Hintikka set for .
If a theory can be extended to get all three properties, then the extension is a Hintikka set, and therefore has a model. Proof that these three properties imply a Hintikka set:
Two of the consistency properties follow immediately:
- (Completeness) implies (#1) consistency: implies for atomic sentences .
- The (Henkin property) is exactly (#8) closure under : implies for some closed term .
To derive the remaining consistency properties quickly, we first prove three short lemmas:
- (finite consistency) Every finite subset of
is consistent (no contradictions).
- Proof: Every finite subset of has a model (finite satisfiability), and inconsistent theories have no model, so every finite subset of cannot be inconsistent.
- (finite strong completeness) Every sentence logically
derivable
()
from a finite subset of
is in
.
- Proof: Let be a finite subset of . If , then is finite but inconsistent, and therefore fails to be a subset of (finite consistency). This means , which by (completeness) implies .
- (finite strong completeness corollary): if
,
then
.
- That is, if follows from via deductions on logical operators, then we know that the property holds on .
- Proof: if then is a finite subset of , so immediately follows from (finite strong completeness).
- This trivially extends for multiple hypotheses, , as well as none, .
Since the majority of our consistency properties are in the form where , we can use the (finite strong completeness corollary) to immediately prove them.
-
proves reflexivity:
for every closed term
.
- (Reflexivity is an axiom in our language, so we can always derive it from nothing.)
- (for every sentence ) proves double negation: implies .
- (for every in ) proves closure under : implies .
- (for every in ) proves inverse of closure under : implies .
- (for every closed term ) proves closure under : implies for every closed term .
- (for every closed term ) proves inverse of closure under : implies for every closed term .
- proves . Then, using the (Henkin property) on , we get inverse of closure under : implies for some closed term .
- and (for every atomic formula ) prove both sides of congruence: iff for atomic formulas and closed terms , whenever .
We’ve proved 10 out of 12 properties; the remaining ones are closure under and the inverse of closure under . They both require showing some exists in without specifying which . Since we can’t derive () this unknown , we can’t use (finite strong completeness) here. Instead, we’ll prove by contradiction – not having the property lets us construct an inconsistent finite subset of , violating (finite consistency).
- Closure under
:
implies
.
- Towards contradiction, assume but . Using (completeness), this implies .
- is finite since we only allow finite conjunction/disjunction in first-order logic.
- But then has an inconsistent finite subset , which contradicts (finite consistency). This finishes the proof.
- Inverse of closure under
:
implies
.
- Let denote “ but with prepended to each ”.
- proves implies , by way of (finite strong completeness).
- Then, closure under gives , as required.
Theorem 2.3.4 tells you a set of minimum requirements to make an extension of into a Hintikka set: every finite subset must have a model, it must be a complete theory, and it must be closed under (Henkin property). If you can extend with these properties, you have a model for . We will use this in the next section.
In this section, we show the absolute minimum requirement for a first-order theory to have a canonical model. (Compactness theorem)
Previously, we found that not all first-order theories have a model, but certain extensions of atomic sentences as first-order theories do have a model (the canonical model) if they can be extended to get a Hintikka set: specifically, if (1) every finite subset of has a model, and is (2) complete and (3) closed under , then it has a model.
The following theorem shows that (2) and (3) are unnecessary; (1) alone is enough to find a model for first-order , again by taking the canonical model of a larger (cleverly-crafted) theory. This fundamental result of model theory is known as the compactness theorem for first-order logic, and is the more general way to find when a given first-order theory has a model.
The compactness theorem: If every finite subset of has a model, then has a model. (where is a theory in a first-order language )
Proof.
- Given that every finite subset of has a non-empty model, WTS has a model.
- The first step is to extend to a Hintikka set . We will do so by extending the language to a larger first-order language , and then later taking the -reduct of the model of to get a model of the original theory in .
- To extend
into a Hintikka set
,
we should satisfy the three requirements from the previous section:
- every finite subset of has a model
- every sentence of appears in as either or
- every sentence in implies in , for some closed term of
- The plan for satisfying these requirements is twofold. The first step is extending , as discussed. We add an infinite set of new unique constants to , to be used as the closed terms for the third condition above. To ensure there are enough constants for each in , must be at least the cardinality of (so there’s at least one per term ) plus the cardinality of (so there’s at least one per sentence ). The resulting language has cardinality . Let represent the sentences of .
- The second step towards our Hintikka set is to define a chain of theories in , where, starting with , each theory attempts to add one more sentence to the previous theory in a way that continues to satisfy the three conditions above and is therefore a Hintikka set. Then the union of the chain (essentially the “final” theory after all sentences have been added) is a Hintikka set. The precise mechanism of adding each will be described soon.
Then (the theory obtained after considering every sentence ) will achieve the above three requirements of being a Hintikka set by ensuring
- every finite subset of each has a model. Then satisfies the first requirement. We can ensure this if, when we try to add each , we only actually add if every finite subset of the resulting has a model. Otherwise, .
- either or is in (where ).
- We can ensure this by tweaking the way we add . When we find that cannot be added since some finite subset of has no model (i.e. can derive ), we’ll add instead.
- In this second case we know that all finite subsets have a model, because if some finite subset doesn’t have a model, it means can derive . And since are subsets of , their union must have a model, which is impossible if derives and derives .
- Therefore we’re still preserving (1), and since we’re either adding or for every in the language, we ensure completeness by construction.
- Note: if we’ve already added in the form for some , we’ll just skip adding (so ).
- every in implies in for some in
- This is the reason we extended to – the constants we add will be used for .
- We’ll again modify the mechanism of adding each : whenever we add some that happens to be in the form , we additionally add where is the constant added for and therefore has not been used up to now. Note that it’s not possible that = some we added earlier since we know is in the form . This trivially completes the proof of (3) since we explicitly made it so exists for every .
- However, by modifying the mechanism of adding each , we will have to prove (1) and (2) again. We will only prove (1), since (1) implies (2) using the same argument before.
- (1 again) every finite subset of our new
(= old
with
added) has a model.
- In the case where is not in the form , we don’t add , so by the old argument (1) still holds.
- In the case where is in the form , we add as well. This changes things – we must show that the new has a model.
- From the old argument, we know that every arbitrary subset of the resulting has a model .
- Let be one of those subsets that contains , which has a model .
- The plan is to define a new -structure modelling the same theory such that means the same thing as , thus showing that every subset containing of our new has a model .
- Since , there is some element such that . We can define as identical to , except it interprets the constant symbol as . In other words, means the same thing as . We ensured that is a fresh variable since this is the first time we’re considering adding the corresponding . So any previous interpretation of is irrelevant, making still a model of
- Since by definition of , we have , which means the same thing as , thus showing that every subset containing of our new has a model .
Finally, we’ve constructed some that is a Hintikka set of by virtue of satsifying the three conditions from the previous section. Therefore it has a model (the canonical model of its atomic sentences). Then includes by its construction (a chain on top of ), so we can take the -reduct of the model of to get a model of the original theory in .
There are many many proofs of the compactness theorem, and this is just one of them, called the Henkin construction. For more proofs of the compactness theorem, see: https://mathoverflow.net/a/45501
The converse of the compactness theorem is quite trivial. If a theory has a model , then models every finite subset of since they are subsets of .
“ is satisfiable” is another way to say “ has a model”. So the compactness theorem can also be stated “ is satisfiable if and only if is finitely satisfiable.”
In the end we’ve found the minimum requirement for to have a canonical model – we just need to prove that every finite subset of has a model, which is often done by construction.
< Back to category Exploration 1: The canonical model of a theory (permalink)Intro to model theory