Before we start formalizing proofs about languages, we first need to be more precise about what we mean by inference rules. (Winskel goes into much more detail than I do here.) Inference rules let us define sets in an inductive fashion. Last time, we defined our small-step transition relation, which is a set of pairs of configurations, using inference rules. We defined our abstract syntax for the language using BNF, but this is really short-hand for inference rules as well. For example, we can define the set of all expressions in IMP using inference rules as follows:

           i in Z
    (1)   --------
          i in Exp

           x in Id
    (2)   --------
          x in Exp

          e1 in Exp   bop in Binop   e2 in Exp
    (3)   -----------------------------------
                  (e1 bop e2) in Exp

The stuff above the line is called the antecedent, and the stuff below the line is called the conclusion. We read a rule like (3) as saying: if e1 is in Exp, bop is in Binop, and e2 is in Exp, then (e1 bop e2) is in Exp.

Note that in these inference rules, we're using meta-level variables such as e, i, x, and bop. An *instance* of an inference rule is obtained by substituting object-level terms for these variables. For instance, we can substitute 3 for i in rule (1) to get the instance:

           3 in Z
          --------
          3 in Exp

A single rule typically has many instances; rule (1) has one for each integer.

A *derivation* D is a (finite) non-empty tree of instances of the rules. By finite, I mean that the tree has finite height and finite width. A *valid derivation* is one where the root of the tree is an instance of the form:

          A1  A2  ...  An
          ---------------
                 C

and for each Ai, there is a valid sub-derivation whose conclusion is Ai. The leaves of the tree are thus axioms (instances of rules with no antecedents). We write |- C (C is provable) to denote that there exists a valid derivation D with conclusion C.

In our expression rules above, there are no axioms, so we need to add some way to derive that an integer is in Z and an identifier is in Id. Formally, I need to define Z and Id, just as I did for Exp.
For instance, I could define Z as follows:

          n in PosNat
          -----------
            n in Z

          n in PosNat
          -----------
          neg(n) in Z

          ---------
          zero in Z

          -------------
          one in PosNat

             n in PosNat
          -----------------
          succ(n) in PosNat

Usually, we don't go into this level of detail and instead assume we have some pre-existing sets lying around that we can use to terminate the derivations.

One more detail about the inference rules is crucial: when we say Exp is generated by the inference rules (1), (2), and (3), we mean that Exp is the *smallest* set of e's such that |- e in Exp. That is, the meaning of a set of inference rules is the smallest set of C's such that |- C. In other words, if e in Exp, then it *has* to have been generated from one of the rules (1), (2), or (3) in a finite derivation. It's important to note the "smallest" bit: that's what's going to give us the leverage to prove something about expressions using induction and to "run the rules backwards", so to speak, in order to tear apart a term.

Consider for instance the definitions:

          -----------
          zero in Nat

             n in Nat
          --------------
          succ(n) in Nat

Suppose we want to prove some property P holds for all natural numbers. We can use (weak) induction: show that P(zero) holds and that whenever P(n) holds, P(succ(n)) holds. We claim this is sufficient to show that P(n) holds for all n.

Suppose not. Then among the n for which P(n) fails, pick one with the smallest derivation D whose conclusion is n in Nat. That derivation can't end with an instance of the zero in Nat axiom, because then n = zero and we know that P(zero) holds. So, the derivation must end with an instance of the other rule. That means that n = succ(n') for some n'. Furthermore, we know that P(n') must hold, because the derivation of n' in Nat is a proper sub-derivation of D, and D is the smallest derivation whose conclusion does not satisfy P. But if P(n') holds, then we can conclude that P(succ(n')) holds, and thus P(n) holds, which contradicts our choice of n.
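To make "running the rules backwards" concrete, here is a minimal Python sketch for the Exp rules (1), (2), and (3). The encoding is an assumption made for illustration and is not part of the notes: integer literals are Python ints, identifiers are strings, (e1 bop e2) is a 3-tuple, and Binop is assumed to contain +, -, and *. As in the notes, Z and Id are treated as pre-existing sets, so derivations stop at rules (1) and (2).

```python
# Hypothetical encoding (not from the notes): an integer literal is a Python
# int, an identifier is a str, and (e1 bop e2) is the tuple (e1, bop, e2).
BINOPS = {"+", "-", "*"}  # assumed contents of Binop


def derive(term):
    """Run rules (1)-(3) backwards; return a derivation tree or None.

    A derivation is (rule_name, conclusion_term, list_of_sub_derivations).
    """
    # Rule (1): i in Z  =>  i in Exp  (bool is excluded: it subclasses int)
    if isinstance(term, int) and not isinstance(term, bool):
        return ("rule1", term, [])
    # Rule (2): x in Id  =>  x in Exp
    if isinstance(term, str):
        return ("rule2", term, [])
    # Rule (3): e1 in Exp, bop in Binop, e2 in Exp  =>  (e1 bop e2) in Exp
    if (isinstance(term, tuple) and len(term) == 3
            and isinstance(term[1], str) and term[1] in BINOPS):
        left, right = derive(term[0]), derive(term[2])
        if left is not None and right is not None:
            return ("rule3", term, [left, right])
    # No rule applies. Because Exp is the smallest set closed under the
    # rules, the term is not in Exp.
    return None


def height(derivation):
    """Height of a derivation tree; a rule with no sub-derivations has height 1."""
    _, _, subs = derivation
    return 1 + max((height(d) for d in subs), default=0)
```

For example, derive((3, "+", ("x", "*", 4))) yields a tree rooted at an instance of rule (3), while derive((3, "?", 4)) returns None because no rule has a conclusion of that shape. The case analysis inside derive is exactly the case analysis the induction proof relies on: there are no other rules to consider.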
The key step in those induction proofs is that if something is defined inductively, then there's some smallest derivation that puts it in the set. The other key thing is that we can do case analysis, since we know there are no other rules that could possibly influence membership in Nat.

An alternative way to look at all of this is to think of the collection of inference rules as a function from sets to sets. For instance, the Exp rules can be formulated as the following function:

    F(S) = S U Z U Id U { (e1 bop e2) | e1, e2 in S, bop in Binop }

We are then defining Exp to be the smallest set that is closed with respect to F. That is, we want:

    Exp = F(Exp)
    for all S. S = F(S) => Exp subset of S

Of course, such an equation isn't a definition any more than saying x = y + 3 defines x. The equation might have many solutions, or it might have no solutions. So, what we're going to do is define Exp by iteration. Let us define:

    Exp_0     = {}
    Exp_1     = F(Exp_0)
    Exp_2     = F(Exp_1)
    ...
    Exp_(i+1) = F(Exp_i)

Then we want to make Exp the union of all the Exp_i's:

    Exp = U_{i >= 0} F^i({})

Now we can see that Exp_1 corresponds to those expressions of height 1, Exp_2 corresponds to those expressions of height <= 2, Exp_3 to those expressions of height <= 3, etc. We're taking Exp to then be the union of all of these sets. Furthermore, each set that we defined was built out of *pre-existing* sets. We get into trouble when we start to try to tie knots in set definitions. Witness:

    R = { S | S not in R }

If we want to prove a property P holds for all expressions, we can do so by induction on i, the stage at which an expression first shows up. In practice, when I have a proof to do, it really means showing that the property I'm trying to prove holds for the axioms, and that whenever it holds for the antecedents, I can derive that it holds for the conclusion.

If we have mutually inductive definitions, then things get a little trickier because we have to build up the sets together (at the same time). That can be accomplished by building up a pair of the objects in question.
For instance, if I have:

    e ::= x | let d in e | e1 e2
    d ::= x = e ; d | nil

then we can formulate an operator on sets of pairs <d,e>:

    F : Set(<d,e>) -> Set(<d,e>)

    F(S) = S U { <d,e> | (e == x, or
                          e == let d' in e' and <d',e'> in S, or
                          e == e1 e2 and <_,e1> in S and <_,e2> in S)
                     and (d == nil, or
                          d == (x = e' ; d') and <d',e'> in S) }

and define Exp and Decl simultaneously.
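The iteration F^0({}), F^1({}), F^2({}), ... can be carried out mechanically for this pair operator, in the same way as for the Exp_i's earlier. The following Python sketch makes several assumptions that are not in the notes: the identifier set is the single name x (so every stage is finite), terms are encoded as tagged tuples, the expression cases include the variable case e == x from the grammar, and the declaration case is read as x = e' ; d' to match the grammar for d.

```python
from itertools import product

IDS = ["x"]  # assumed finite identifier set, so that every stage is finite


def F(S):
    """One application of the operator on sets of (d, e) pairs.

    Assumed encoding: nil is the string "nil", the declaration x = e ; d is
    ("decl", x, e, d), let d in e is ("let", d, e), and e1 e2 is
    ("app", e1, e2).
    """
    exps_in_S = {e for (_, e) in S}
    # Expressions obtainable in one step from S.
    exps = set(IDS)                                   # e == x
    exps |= {("let", d, e) for (d, e) in S}           # e == let d' in e'
    exps |= {("app", e1, e2)                          # e == e1 e2
             for e1, e2 in product(exps_in_S, repeat=2)}
    # Declarations obtainable in one step from S.
    decls = {"nil"}                                   # d == nil
    decls |= {("decl", x, e, d)                       # d == (x = e' ; d')
              for x in IDS for (d, e) in S}
    return set(S) | set(product(decls, exps))


def stages(n):
    """Return the list [F^0({}), F^1({}), ..., F^n({})]."""
    out = [set()]
    for _ in range(n):
        out.append(F(out[-1]))
    return out
```

Under these assumptions, F^1({}) is the single pair <nil, x>, and F^2({}) contains 6 pairs: the declarations nil and x = x ; nil paired with the expressions x, let nil in x, and x x. Each stage contains the previous one, so projecting the union of the stages onto its first and second components gives candidate definitions of Decl and Exp.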