Denotational Semantics and Equational Theory of the S.T.L.C.:
Let's begin by giving a denotational semantics to the simply-typed,
call-by-value lambda calculus, where we interpret types in a
set-theoretic sense. In particular let us define:
V[int] = Z
V[t1 -> t2] = V[t1] -> C[t2]
C[t] = V[t]
Here, V[] is a *value* interpretation for types, whereas C[]
is a *computational* interpretation of types. The distinction
isn't important for the STLC (since every computation terminates
with a value) but will become important when we start adding in
effects and so forth. The equations above define the value meaning
of int (V[int]) as the set of all integers, and the value meaning
of t1 -> t2 as the set of all set-theoretic functions from V[t1]
to C[t2] = V[t2].
Next, we're going to define a semantic function E[] that maps
expressions and environments to values:
Val = U_t(V[t])       -- union of all value interpretations of types
Env = Id -> Val       -- partial fn from identifiers to values
E : Exp -> Env -> Val
E[i] r = i
E[x] r = r(x)
E[e1 e2] r = (E[e1] r) (E[e2] r)
E[\x:t.e] r = \v.(E[e] r[x -> v])
Of course, I'm using the same notation for the meta-level as the
object level, so that gets to be a bit confusing. Still, you
should get the idea and it's exactly the same way that you'd write
an interpreter in ML.
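To make that concrete, here's a minimal Python sketch of E (the class names and representation are our own conventions, not fixed by the notes; since C[t] = V[t] here, no extra computational structure is needed):

```python
# A sketch of the semantic function E. Environments are dicts from
# identifiers to values; a lambda denotes an actual Python function,
# mirroring E[\x:t.e] r = \v. E[e] r[x -> v].
from dataclasses import dataclass

@dataclass
class Int:   # integer literal i
    n: int

@dataclass
class Var:   # variable x
    name: str

@dataclass
class App:   # application e1 e2
    fn: object
    arg: object

@dataclass
class Lam:   # \x:t.e  (the type annotation plays no role in E)
    param: str
    body: object

def E(e, r):
    if isinstance(e, Int):            # E[i] r = i
        return e.n
    if isinstance(e, Var):            # E[x] r = r(x)
        return r[e.name]
    if isinstance(e, App):            # E[e1 e2] r = (E[e1] r) (E[e2] r)
        return E(e.fn, r)(E(e.arg, r))
    if isinstance(e, Lam):            # E[\x:t.e] r = \v. E[e] r[x -> v]
        return lambda v: E(e.body, {**r, e.param: v})
    raise TypeError("unknown expression form")
```

For example, E(App(Lam("x", Var("x")), Int(3)), {}) works out to 3, just as the equations predict.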
Defn: if G |- e : t, then r is suitable for G if for all x in
Dom(G), r(x) in V[G(x)].
Theorem: if G |- e : t, then for all r suitable for G,
r(e) => v iff E[e]r = E[v]r.
Corollary: If |- e : t and e => v, then E[e] = E[v].
Note that by "=" here, we mean the set-theoretic notion of equality.
So two functions are equal iff they are the same set of input/output
pairs. We'll leave the proof as an exercise...
We can use the denotational semantics to prove that certain equational
laws are valid. In particular, let us establish an equational theory
for lambda terms:
G |- e1 = e2 : t
with the goal that the above judgment should only be derivable when
E[e1] = E[e2]. Here are some proof rules:
G |- e : t
---------------- (refl)
G |- e = e : t

G |- e1 = e2 : t
---------------- (sym)
G |- e2 = e1 : t

G |- e1 = e2 : t    G |- e2 = e3 : t
------------------------------------ (trans)
G |- e1 = e3 : t

G |- e1 = e1' : t1 -> t2    G |- e2 = e2' : t1
---------------------------------------------- (app-cong)
G |- e1 e2 = e1' e2' : t2

G, x:t1 |- e1 = e1' : t2
------------------------------------ (lam-cong)
G |- \x:t1.e1 = \x:t1.e1' : t1 -> t2

G |- (\x:t'.e1) : t' -> t    G |- e2 : t'
----------------------------------------- (beta)
G |- (\x:t'.e1) e2 = e1[e2/x] : t

G |- e : t1 -> t2
----------------------------- (eta)  (x not in G)
G |- \x:t1.e x = e : t1 -> t2
The claim is that G |- e1 = e2 : t implies E[e1] = E[e2], which
is a pretty straightforward induction. In fact, for this simple
language, it's possible to show that E[e1] = E[e2] implies
that G |- e1 = e2 : t (for well-typed terms G |- e1 : t and
G |- e2 : t), demonstrating that this equational theory is both
sound and complete.
And in fact, the theory is completely decidable. In essence,
we simply apply the beta and eta rules to as many subterms
as possible (orienting the equations from left-to-right) and
we're guaranteed to get out terms that are equal (modulo
alpha-conversion) iff their denotations are equal.
That is, we can define a reduction relation e >> e' as follows:

(\x:t.e) e' >> e[e'/x]

\x:t.e x >> e   (x not in FV(e))

e1 >> e1'
---------------
e1 e2 >> e1' e2

e2 >> e2'
---------------
e1 e2 >> e1 e2'

e1 >> e1'
-------------------
\x:t.e1 >> \x:t.e1'
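Here is a Python sketch of >> packaged as a normalizer: it contracts beta-redexes everywhere, including under lambdas. The term representation (nested tuples) and function names are our own; eta is omitted for brevity, and type annotations are dropped since they play no role in reduction. Note that termination is only guaranteed on typeable terms, by strong normalization.

```python
# Normal-order normalization with capture-avoiding substitution.
# Terms: ("var", x) | ("lam", x, body) | ("app", e1, e2).

def fresh(x, avoid):
    # pick a variant of x not in avoid (used to rename binders)
    while x in avoid:
        x += "'"
    return x

def fv(e):
    # free variables of a term
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "lam":
        return fv(e[2]) - {e[1]}
    return fv(e[1]) | fv(e[2])      # app

def subst(e, x, v):
    # e[v/x], renaming bound variables to avoid capture
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag == "app":
        return ("app", subst(e[1], x, v), subst(e[2], x, v))
    y, body = e[1], e[2]            # lam
    if y == x:
        return e                    # x is shadowed; stop
    if y in fv(v):
        y2 = fresh(y, fv(v) | fv(body))
        body = subst(body, y, ("var", y2))
        y = y2
    return ("lam", y, subst(body, x, v))

def normalize(e):
    tag = e[0]
    if tag == "app":
        f = normalize(e[1])
        a = normalize(e[2])
        if f[0] == "lam":                         # (\x.e) e' >> e[e'/x]
            return normalize(subst(f[2], f[1], a))
        return ("app", f, a)
    if tag == "lam":                              # reduce *under* the lambda
        return ("lam", e[1], normalize(e[2]))
    return e
```

For instance, normalize(("lam", "z", ("app", ("lam", "x", ("var", "x")), ("var", "z")))) contracts the redex under the binder, yielding ("lam", "z", ("var", "z")).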
Note that the relation is nondeterministic and that it works *under*
lambdas. Also note that our small-step evaluation relation (->) is a
subset of the reduction relation (>>). This ensures that if e1 ->*
e2, then e1 >>* e2. It's also easy to see that we can map a
derivation of e1 >>* e2 to a derivation of |- e1 = e2 for well-typed
terms, since each of the above rules is an instance of the equational
rules, and we can use transitivity to string together the reductions.
So, a strategy for proving that two terms are equal is to use >> as
many times as possible until the terms no longer reduce and then
compare the two terms syntactically (modulo alpha-conversion). That
process will be sound and complete if we can establish two further
things: (a) it doesn't matter what reduction order we use, and (b)
that reduction will always terminate.
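The final comparison step, equality modulo alpha-conversion, can be sketched as follows (a hypothetical Python helper, not from the notes, that compares binder *structure* rather than binder *names*):

```python
# Alpha-equivalence: two terms are equal up to renaming of bound
# variables. We map each binder to the depth at which it was bound,
# so ("lam","x",("var","x")) and ("lam","y",("var","y")) both read as
# "a lambda whose body is the variable bound at depth 0".
# Terms: ("var", x) | ("lam", x, body) | ("app", e1, e2).

def alpha_eq(e1, e2, env1=None, env2=None, depth=0):
    env1 = env1 or {}
    env2 = env2 or {}
    if e1[0] != e2[0]:
        return False
    if e1[0] == "var":
        # bound vars compare by binding depth (an int);
        # free vars compare by name (a str)
        return env1.get(e1[1], e1[1]) == env2.get(e2[1], e2[1])
    if e1[0] == "app":
        return (alpha_eq(e1[1], e2[1], env1, env2, depth) and
                alpha_eq(e1[2], e2[2], env1, env2, depth))
    # lam: record the binder's depth on each side and recurse
    return alpha_eq(e1[2], e2[2],
                    {**env1, e1[1]: depth},
                    {**env2, e2[1]: depth},
                    depth + 1)
```

So \x.x and \y.y compare equal, while \x.x and \y.z do not, since the latter's body is a free variable.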
We can use a logical relations argument, similar to the one we
used for evaluation, to prove that >> is strongly normalizing.
The proof is not as easy since we're working under lambdas.
However, it's not that much harder.
So now we need to show *confluence*: if e >>* e1 and e >>* e2,
then there exists an e' such that e1 >>* e' and e2 >>* e'.
This tells us that
it doesn't matter how we reduce the term, because by the time
we finish reducing, we'll get the same result. Another way
to say this is that reduction will always lead to a *normal form*.
Assuming we've proved strong normalization, confluence can be
established by showing *local* confluence:
e >> e1 and e >> e2 implies there exists an e' such that
e1 >>* e' and e2 >>* e'.
Note that the untyped lambda calculus also has the confluence
property, but it isn't strongly normalizing, so local confluence
is not sufficient to establish its confluence.
To see that local confluence + strong normalization implies
confluence, we argue as follows: First, given an expression
e, we know from strong normalization that there is a bound b on
the total number of reduction steps we can take. We will
proceed by induction on b.
Base case: Suppose b = 0 (i.e., e is terminal) and we have:
e >>* e1 and e >>* e2
then it must be that e >>0 e1 and e >>0 e2 so e = e1 and e = e2.
Induction: suppose we have confluence for all expressions with
a reduction bound of b and that e is an expression with a reduction
bound of b+1. Suppose further that:
e >>* e1 and e >>* e2
Then we have:
e >>n1 e1 and e >>n2 e2
for some n1 and n2. Note that if n1 = 0 then e = e1 so
e1 >>n2 e2, and thus both e1 and e2 reduce to e2. Similarly,
if n2 is zero, both e1 and e2 reduce to e1.
So suppose n1 > 0 and n2 > 0. Then we have the following
diagram:
       >> e1' (n1-1)>> e1
      /
     /
    e
     \
      \
       >> e2' (n2-1)>> e2
Now by local confluence we can establish the existence
of an e'':

       >> e1' (n1-1)>> e1
      /        \
     /          \
    e            >>* e''
     \          /
      \        /
       >> e2' (n2-1)>> e2
Now e1' and e2' can have reduction bounds at most b since
otherwise, e would have a longer reduction than b+1. So
our induction hypothesis applies to both e1' and e2'.
That means that there exists e1'' and e2'' such that:
       >> e1' (n1-1)>> e1 >>* e1''
      /        \          /
     /          \        /
    e            >>* e''
     \          /        \
      \        /          \
       >> e2' (n2-1)>> e2 >>* e2''
Applying the induction hypothesis to e'' again (since
e'' must have a reduction bound less than b) we can
thus find an e' such that:
       >> e1' (n1-1)>> e1 >>* e1''
      /        \          /      \
     /          \        /        \
    e            >>* e''           >>* e'
     \          /        \        /
      \        /          \      /
       >> e2' (n2-1)>> e2 >>* e2''
So all that remains is to establish local confluence.
This can be done on a casebycase basis by looking at
all possible pairs of reductions that can be performed
on a given term. For instance, suppose
e1 >> e1' and e2 >> e2'
so we have:
          >> e1' e2
         /
   e1 e2
         \
          >> e1 e2'
It's pretty clear that we can apply the same reductions
in the opposite order to get to e1' e2':
          >> e1' e2
         /         \
   e1 e2            >> e1' e2'
         \         /
          >> e1 e2'
since the reductions don't overlap. In fact, a careful
case analysis reveals only a couple of situations where
we have *critical pairs* in our reduction. For example:
(\x.e1) e2 >> (\x.e1) e2'
(\x.e1) e2 >> e1[e2/x]
Putting these two together means we need to prove a lemma
about reduction commuting with substitution (i.e., that if e2 >> e2',
then e1[e2/x] >>* e1[e2'/x]). Similarly, we'll need to establish that
when e1 >> e1', then e1[e2/x] >>* e1'[e2/x].
In general, you'll need to do a tedious case analysis of
all the possible ways that the rules can interact to show
local confluence. For this little language, it's not that
bad, but it's clear that this analysis becomes difficult as
we add more kinds of (overlapping) reduction relations to the
language.
To sum up: it's possible to automatically prove whether two
STLC terms are equivalent by normalizing them and then comparing
for equality. The justification for this is given by an
equational theory that is both sound and complete, which is
easily shown using the denotational semantics, which in turn
was proven adequate w.r.t. the operational semantics.
You may be wondering why we went to all of this trouble given
that decidability is going to get tossed out the door when
we start adding in more elaborate expressions (e.g., recursion,
refs, etc.). The reason is that we'll be using variants of
this language at the *type* level later on, where we'll need
to establish that two types are equal as part of typechecking.
It's worth noting that this equational theory scales up in
some interesting ways. In particular, it's possible to add
support for products (pairs) with the key reduction rules
being:
G |- <e1, e2> : t1*t2
--------------------------
G |- #1 <e1, e2> = e1 : t1

G |- e : t1*t2
-----------------------------
G |- e = <#1 e, #2 e> : t1*t2
However, adding support for unit (or zero) and sums is somewhat
problematic. For instance, consider that we should have rules
for unit like this:
G |- e : unit
------------------
G |- e = () : unit
But it becomes clear that our normalizer must now take into account
types. That is, we can't just use straightforward rewriting on
the untyped terms to figure out when we should replace an
expression e with (). Note that this problem also arises for
ML in that when we see #1 x in ML, we can't tell locally whether
x is a pair or a triple or ...
It's worth thinking about what the normal forms for sums should
be and why that's so hard. Consider that to get completeness,
we'll need rules like:
(case e of inl(x) => inl(x) | inr(x) => inr(x)) == e
In addition, consider that when we have:
(case e1 of inl(x1) => (case e2 of inl(x2) => e1
                                 | inr(x2) => e2)
          | inr(x1) => (case e2 of inl(x2) => e3
                                 | inr(x2) => e3))
we could just as well do the case analysis on e2 before
we do the case analysis on e1. So any notion of normal
forms is going to have to be modulo "rotations" of case
expressions.
Thought: (t1 + t2) -> t =~= (t1 -> t) * (t2 -> t)
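As a quick sanity check on that thought, here's a Python sketch of the two directions of the isomorphism (the sum encoding and function names are illustrative, not from the notes):

```python
# A function out of a sum type is the same data as a pair of
# functions, one per summand. Sums are encoded as tagged pairs.

def inl(x): return ("inl", x)
def inr(x): return ("inr", x)

def to_pair(f):
    # (t1 + t2) -> t   ~>   (t1 -> t) * (t2 -> t)
    return (lambda x: f(inl(x)), lambda y: f(inr(y)))

def from_pair(p):
    # (t1 -> t) * (t2 -> t)   ~>   (t1 + t2) -> t
    g, h = p
    def f(s):
        tag, v = s
        return g(v) if tag == "inl" else h(v)
    return f
```

Round-tripping a function through to_pair and from_pair gives back a function with the same behavior, which is exactly what the isomorphism claims.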
More on this later...