Before we start formalizing proofs about languages, we first need to
better formalize what we mean by inference rules. (Winskel goes into
much more detail than I do here.)
Inference rules let us define sets in an inductive fashion. Last
time, we defined our small-step transition relation, which is a
set of pairs of configurations, using inference rules. We defined
our abstract syntax for the language using BNF, but this is really
short-hand for inference rules as well. For example, we can define
the set of all expressions in IMP using inference rules as follows:

             i in Z
    (1)     --------
            i in Exp

             x in Id
    (2)     --------
            x in Exp

            e1 in Exp    bop in Binop    e2 in Exp
    (3)     -----------------------------------------
                       (e1 bop e2) in Exp

The stuff above the line is called the antecedent, and the stuff
below the line is called the conclusion. We read a rule like (3)
as saying:

    If e1 is in Exp, bop is in Binop, and e2 is in Exp,
    then (e1 bop e2) is in Exp.

Note that in these inference rules, we're using meta-level variables
such as e, i, x, and bop. An *instance* of an inference rule is obtained
by substituting object-level terms for these variables. For example,
we can substitute 3 for i in rule (1) to get the instance:

             3 in Z
            --------
            3 in Exp

Typically, a rule has many instances: rule (1), for example, has one
for every integer.
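
To make the idea of an instance concrete, here is a minimal sketch that
treats a rule as a template and an instance as the result of filling in
its meta-variables. The string encoding and the names RULE_1, RULE_3,
and instantiate are my own illustration, not part of the formal
development.

```python
# A rule is a pair (premises, conclusion). Judgments are plain strings in
# which meta-variables are written in braces. This encoding is only an
# illustration; the names below are hypothetical.
RULE_1 = (["{i} in Z"], "{i} in Exp")
RULE_3 = (["{e1} in Exp", "{bop} in Binop", "{e2} in Exp"],
          "({e1} {bop} {e2}) in Exp")


def instantiate(rule, **subst):
    """Substitute object-level terms for the rule's meta-variables."""
    premises, conclusion = rule
    return ([p.format(**subst) for p in premises],
            conclusion.format(**subst))


# The instance of rule (1) obtained by substituting 3 for i.
print(instantiate(RULE_1, i=3))
# One instance of rule (3).
print(instantiate(RULE_3, e1="3", bop="+", e2="x"))
```

Each different substitution yields a different instance of the same
rule.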
A *derivation* D is a (finite) non-empty tree of instances of the rules.
By finite, I mean that the tree has finite height and finite width.
A *valid derivation* is one where the root of the tree is an instance
of the form:

            A1  A2  ...  An
            ---------------
                   C

and for each Ai, there is a valid sub-derivation whose conclusion is Ai.
The leaves of the tree are thus axioms (instances of rules with no
antecedents). We write |- C (C is provable) to denote that there
exists a valid derivation D with conclusion C.
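
The definition of a valid derivation can be sketched as a recursive
check over a tree. The encoding below is an assumption of mine:
judgments are pairs (term, set_name), integers and identifiers are
treated as axioms of pre-existing sets Z and Id, and Binop is taken to
be {+, -, *} purely for illustration.

```python
from dataclasses import dataclass, field

BINOPS = {"+", "-", "*"}  # assumed; the notes do not list Binop


@dataclass
class Derivation:
    conclusion: tuple                      # (term, set_name), e.g. (3, "Exp")
    subderivations: list = field(default_factory=list)


def rule_premises(conclusion):
    """Yield the premise list of every rule instance with this conclusion."""
    term, set_name = conclusion
    if set_name == "Z" and isinstance(term, int):
        yield []                           # assumed axiom: term in Z
    elif set_name == "Id" and isinstance(term, str) and term.isidentifier():
        yield []                           # assumed axiom: term in Id
    elif set_name == "Binop" and isinstance(term, str) and term in BINOPS:
        yield []                           # assumed axiom: term in Binop
    elif set_name == "Exp":
        yield [(term, "Z")]                # rule (1)
        yield [(term, "Id")]               # rule (2)
        if isinstance(term, tuple) and len(term) == 3:
            e1, bop, e2 = term
            yield [(e1, "Exp"), (bop, "Binop"), (e2, "Exp")]  # rule (3)


def is_valid(d):
    """The root must be a rule instance and every subtree must be valid."""
    premises = [s.conclusion for s in d.subderivations]
    root_ok = any(premises == p for p in rule_premises(d.conclusion))
    return root_ok and all(is_valid(s) for s in d.subderivations)
```

For example, Derivation((3, "Exp"), [Derivation((3, "Z"))]) passes the
check, while Derivation((3, "Exp")) with no sub-derivations does not,
since no Exp rule is an axiom.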
In our expression rules above, there are no axioms, so we need to add
some way to derive that an integer is in Z and an identifier is in Id.
Formally, I need to define Z and Id, just as I did for Exp. For
instance, I could define Z as follows:

            n in PosNat
            -----------
              n in Z

            n in PosNat
            ------------
            neg(n) in Z

            ---------
            zero in Z

            -------------
            one in PosNat

               n in PosNat
            -----------------
            succ(n) in PosNat

Usually, we don't go into this level of detail and instead assume we
have some pre-existing sets lying around that we can use to terminate
the derivations.
One more detail about the inference rules is crucial: When we say Exp
is generated by the inference rules (1), (2), and (3), we mean that
Exp is the *smallest* set closed under those rules, which is exactly
the set of e's such that |- e in Exp. In general, the meaning of a set
of inference rules is the smallest set of C's such that |- C. In other
words, if e in Exp, then it *has* to have been generated from one of
the rules (1), (2), or (3) in a finite derivation.
It's important to note the "smallest" bit -- that's what's going to
give us the leverage to prove something about expressions using
induction and to "run the rules backwards" so to speak in order to
tear apart a term. Consider for instance the definitions:

            -----------
            zero in Nat

               n in Nat
            --------------
            succ(n) in Nat

Suppose we want to prove some property P holds for all natural
numbers. Using (weak) induction, we show that P(zero) holds
and that whenever P(n) holds, P(succ(n)) holds. We claim this
is sufficient to show that P(n) holds for all n in Nat.
Suppose not. Then among the n for which P(n) fails, pick one whose
derivation D of n in Nat is smallest. That derivation can't end with an
instance of the zero in Nat axiom, because then n = zero and we know
that P(zero) holds. So, the derivation must end with an instance of
the other rule. That means that n = succ(n') for some n'.
Furthermore, we know that P(n') must hold because the derivation of
n' in Nat is a proper sub-derivation of D, and D is the smallest
derivation whose conclusion does not satisfy P. But if P(n') holds,
then we can conclude that P(succ(n')) holds, and thus P(n) holds,
which contradicts our choice of n.
The key step in such induction proofs is that if something
is defined inductively, then there's some smallest derivation
that puts it in the set. The other key thing is that we can
do case analysis since we know there are no other rules that
could possibly influence membership in Nat.
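
"Running the rules backwards" can be sketched as a function that does
case analysis on which rule must have produced a term. The encoding of
Nat terms as the string "zero" and tuples ("succ", n), and the names
ZERO, succ, and to_int, are assumptions made for this example.

```python
ZERO = "zero"


def succ(n):
    """Build the term succ(n)."""
    return ("succ", n)


def to_int(n):
    """Interpret a Nat term by case analysis on the last rule used."""
    if n == ZERO:
        # The derivation ends with the axiom: n = zero.
        return 0
    if isinstance(n, tuple) and len(n) == 2 and n[0] == "succ":
        # The derivation ends with the succ rule: n = succ(n'), and the
        # recursive call works on the strictly smaller sub-derivation.
        return 1 + to_int(n[1])
    # No other rule can put a term in Nat, so anything else is rejected.
    raise ValueError("not in Nat: no rule has this conclusion")
```

The recursion terminates for the same reason the induction argument
works: every term in Nat has a finite derivation, and each call moves to
a proper sub-derivation.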
An alternative way to look at all of this is to think of the
collection of inference rules as a function from sets to sets. For
instance, the Exp rules can be formulated as the following function:

    F(S) = S U Z U Id U { (e1 bop e2) | e1,e2 in S, bop in Binop }

We are then defining Exp to be the smallest set that is closed with
respect to F. That is, we want:

    Exp = F(Exp)
    for all S. S = F(S) => Exp subset of S

Of course, such an equation isn't a definition any more than saying
x = y + 3 defines x. The equation might have many solutions,
or it might have no solutions. So, what we're going to do is define
Exp by iteration. Let us define:

    Exp0     = {}
    Exp1     = F(Exp0)
    Exp2     = F(Exp1)
    ...
    Exp(i+1) = F(Expi)

Then we want to make Exp the union of all Expi's:

    Exp = U { F^i({}) | i >= 0 }

Now we can see that Exp1 corresponds to those expressions of height 1,
Exp2 corresponds to those expressions of height <= 2, Exp3 is those
expressions of height <= 3, etc. We're taking Exp to then be the union
of all of these sets. Furthermore, each set that we defined was built
out of *pre-existing* sets. We get into trouble when we start to try
to tie knots in set definitions. Witness:

    R = { S | S not in R }

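
The iteration Exp0, Exp1, Exp2, ... can be run directly if we replace Z,
Id, and Binop by small finite stand-ins. The particular sets below are
assumptions chosen so the stages stay small enough to print; the real Z
and Id are infinite.

```python
# Finite stand-ins (assumed for illustration only).
Z = {0, 1}
ID = {"x"}
BINOP = {"+"}


def F(S):
    """One application of the Exp operator to the set S."""
    compound = {(e1, bop, e2) for e1 in S for bop in BINOP for e2 in S}
    return S | Z | ID | compound


def exp_level(i):
    """Compute Exp_i = F^i({})."""
    S = set()
    for _ in range(i):
        S = F(S)
    return S


for i in range(4):
    print(i, len(exp_level(i)))
```

With these stand-ins the stages have 0, 3, 12, and 147 elements, and
each stage contains the previous one, which is why taking the union of
all stages makes sense.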
If we want to prove a property P holds for all expressions, we can
do so by induction on i. In practice, when I have a proof to do,
it really means showing that the property I'm trying to prove holds
for the axioms, and whenever it holds for the antecedents of a rule,
I can derive that it holds for the conclusion.
If we have mutually inductive definitions, then things get a little
trickier because we have to build up the sets together (at the same
time). That can be accomplished by building up a pair of the objects
in question. For instance, if I have:

    e ::= x | let d in e | e1 e2
    d ::= x = e ; d | nil

then we can formulate an operator:

    F : Set(<d,e>) -> Set(<d,e>)

    F(S) = S U { <d,e> | (e = x
                          or
                          e = let d' in e' and <d',e'> in S
                          or
                          e = e1 e2 and <_,e1> in S and <_,e2> in S)
                         and
                         (d = nil
                          or
                          d = (x = e' ; d') and <d',e'> in S) }

and define Exp and Decl simultaneously.
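
One way to see the simultaneous construction run is to iterate an
operator on a pair of sets (Exp, Decl). This is a variation I chose for
readability, not the set-of-pairs operator itself, and the finite
identifier set and the tags "let", "app", "bind", and "nil" are
assumptions of the sketch.

```python
ID = {"x"}       # assumed finite stand-in for the identifiers
NIL = ("nil",)   # the empty declaration list


def G(pair):
    """One step that grows Exp and Decl together."""
    E, D = pair
    lets = {("let", d, e) for d in D for e in E}                   # let d in e
    apps = {("app", e1, e2) for e1 in E for e2 in E}               # e1 e2
    binds = {("bind", x, e, d) for x in ID for e in E for d in D}  # x = e ; d
    return (E | ID | lets | apps, D | {NIL} | binds)


def iterate(n):
    """Apply G n times starting from the pair of empty sets."""
    pair = (set(), set())
    for _ in range(n):
        pair = G(pair)
    return pair


E2, D2 = iterate(2)
print(len(E2), len(D2))
```

Neither set can be finished before the other: new expressions need
declarations from the previous stage, and new declarations need
expressions from the previous stage.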