Denotational Semantics and Equational Theory of the S.T.L.C.

Let's begin by giving a denotational semantics to the simply-typed, call-by-value lambda calculus, where we interpret types in a set-theoretic sense. In particular, let us define:

    V[int]    = Z
    V[t1->t2] = V[t1] -> C[t2]
    C[t]      = V[t]

Here, V[-] is a *value* interpretation of types, whereas C[-] is a *computational* interpretation of types. The distinction isn't important for the STLC (since every computation terminates with a value) but will become important when we start adding effects and so forth. The equations above define the value meaning of int (V[int]) as the set of all integers, and the value meaning of t1->t2 as the set of all set-theoretic functions from V[t1] to C[t2] = V[t2].

Next, we're going to define a semantic function E[-] that maps expressions and environments to values:

    Val = U_t(V[t])    -- union of all value interpretations of types
    Env = Id -> Val    -- partial function from identifiers to values

    E : Exp -> Env -> Val
    E[i] r      = i
    E[x] r      = r(x)
    E[e1 e2] r  = (E[e1] r) (E[e2] r)
    E[\x:t.e] r = \v.(E[e] r[x|->v])

Of course, I'm using the same notation for the meta-level as for the object level, which gets to be a bit confusing. Still, you should get the idea: it's exactly the way you'd write an interpreter in ML.

Defn: if G |- e : t, then r is suitable for G if for all x in Dom(G), r(x) in V[G(x)].

Theorem: if G |- e : t, then for all r suitable for G, r(e) => v iff E[e]r = E[v]r.

Cor: if |- e : t and e => v, then E[e] = E[v].

Note that by "=" here, we mean the set-theoretic notion of equality. So two functions are equal iff they are the same set of input/output pairs. We'll leave the proof as an exercise...

We can use the denotational semantics to prove that certain equational laws are valid. In particular, let us establish an equational theory for lambda terms:

    G |- e1 = e2 : t

with the goal that the above judgment should only be derivable when E[e1] = E[e2].
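The semantic function E[-] defined earlier in this section can be sketched as a small interpreter. The notes suggest ML; the sketch below uses Python instead, and the class names (Int, Var, Lam, App) and the dictionary representation of environments are assumptions made for illustration, not part of the notes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


# Abstract syntax (hypothetical names). The type annotation on a lambda is
# kept as a plain string because it plays no role in evaluation.
@dataclass(frozen=True)
class Int:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    ty: str
    body: "Exp"


@dataclass(frozen=True)
class App:
    fn: "Exp"
    arg: "Exp"


Exp = Union[Int, Var, Lam, App]
Value = Union[int, Callable]      # Val: integers and (Python) functions
Env = Dict[str, Value]            # Env: finite map from identifiers to values


def E(e: Exp, r: Env) -> Value:
    """Denotation of e in environment r, one clause per equation for E[-]."""
    if isinstance(e, Int):        # E[i] r = i
        return e.value
    if isinstance(e, Var):        # E[x] r = r(x)
        return r[e.name]
    if isinstance(e, App):        # E[e1 e2] r = (E[e1] r) (E[e2] r)
        return E(e.fn, r)(E(e.arg, r))
    if isinstance(e, Lam):        # E[\x:t.e] r = \v.(E[e] r[x|->v])
        return lambda v: E(e.body, {**r, e.param: v})
    raise TypeError(f"not an expression: {e!r}")


# Example: (\x:int.x) 3 denotes the integer 3.
assert E(App(Lam("x", "int", Var("x")), Int(3)), {}) == 3
```

One caveat: Python closures are compared by identity, not as sets of input/output pairs, so this sketch can only test the equation E[e1] = E[e2] directly at type int.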
Here are some proof rules for the judgment G |- e1 = e2 : t:

    G |- e : t
    --------------  (refl)
    G |- e = e : t

    G |- e1 = e2 : t
    ----------------  (sym)
    G |- e2 = e1 : t

    G |- e1 = e2 : t    G |- e2 = e3 : t
    ------------------------------------  (tran)
    G |- e1 = e3 : t

    G |- e1 = e1' : t1->t2    G |- e2 = e2' : t1
    --------------------------------------------  (app-cong)
    G |- e1 e2 = e1' e2' : t2

    G,x:t1 |- e1 = e1' : t2
    ----------------------------------  (lam-cong)
    G |- \x:t1.e1 = \x:t1.e1' : t1->t2

    G |- (\x:t'.e1) : t'->t    G |- e2 : t'
    ---------------------------------------  (beta)
    G |- (\x:t'.e1) e2 = e1[e2/x] : t

    G |- e : t1->t2
    -----------------------------  (eta)  (x not in G)
    G |- (\x:t1.e x) = e : t1->t2

The claim is that G |- e1 = e2 : t implies E[e1] = E[e2], which is a pretty straightforward induction on the derivation. In fact, for this simple language, it's possible to show that E[e1] = E[e2] implies G |- e1 = e2 : t (for well-typed terms G |- e1 : t and G |- e2 : t), demonstrating that this equational theory is both sound and complete.

And in fact, the theory is decidable. In essence, we simply apply the beta and eta rules to as many sub-terms as possible (orienting the equations from left to right) and we're guaranteed to get out terms that are equal (modulo alpha-conversion) iff their denotations are equal. That is, we can define a reduction relation e ->> e' as follows:

    (\x:t.e) e' ->> e[e'/x]

    (\x:t.e x) ->> e        (x not in FV(e))

    e1 ->> e1'
    ----------------
    e1 e2 ->> e1' e2

    e2 ->> e2'
    ----------------
    e1 e2 ->> e1 e2'

    e1 ->> e1'
    --------------------
    \x:t.e1 ->> \x:t.e1'

Note that the relation is non-deterministic and that it works *under* lambdas. Also note that our small-step evaluation relation (->) is a subset of the reduction relation (->>). This ensures that if e1 ->* e2, then e1 ->>* e2.
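The reduction relation ->> can be sketched as a function that returns every one-step reduct of a term, which makes the non-determinism explicit. This is a minimal sketch in Python, not the notes' own code: the AST classes and the helpers free_vars, fresh, subst, reducts, normalize and alpha_eq are hypothetical names introduced here, and normalize simply picks the first available redex, which is only guaranteed to terminate on well-typed terms.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Int:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    ty: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


def free_vars(e):
    if isinstance(e, Var):
        return frozenset({e.name})
    if isinstance(e, App):
        return free_vars(e.fn) | free_vars(e.arg)
    if isinstance(e, Lam):
        return free_vars(e.body) - {e.param}
    return frozenset()


def fresh(avoid):
    # First name of the form v0, v1, ... that is not in `avoid`.
    return next(f"v{i}" for i in count() if f"v{i}" not in avoid)


def subst(e, x, s):
    """Capture-avoiding substitution e[s/x]."""
    if isinstance(e, Var):
        return s if e.name == x else e
    if isinstance(e, App):
        return App(subst(e.fn, x, s), subst(e.arg, x, s))
    if isinstance(e, Lam):
        if e.param == x:                      # x is shadowed by the binder
            return e
        if e.param in free_vars(s):           # rename binder to avoid capture
            y = fresh(free_vars(s) | free_vars(e.body) | {x})
            return Lam(y, e.ty, subst(subst(e.body, e.param, Var(y)), x, s))
        return Lam(e.param, e.ty, subst(e.body, x, s))
    return e                                  # integer literal


def reducts(e):
    """All e' with e ->> e' (one step, anywhere in the term)."""
    out = []
    if isinstance(e, App):
        if isinstance(e.fn, Lam):             # beta
            out.append(subst(e.fn.body, e.fn.param, e.arg))
        out += [App(f, e.arg) for f in reducts(e.fn)]
        out += [App(e.fn, a) for a in reducts(e.arg)]
    elif isinstance(e, Lam):
        b = e.body
        if (isinstance(b, App) and b.arg == Var(e.param)
                and e.param not in free_vars(b.fn)):   # eta
            out.append(b.fn)
        out += [Lam(e.param, e.ty, b2) for b2 in reducts(b)]
    return out


def normalize(e):
    """Reduce until no rule applies (terminates for well-typed terms)."""
    while True:
        rs = reducts(e)
        if not rs:
            return e
        e = rs[0]


def alpha_eq(a, b, env=()):
    """Syntactic equality modulo renaming of bound variables."""
    if isinstance(a, Int) and isinstance(b, Int):
        return a.value == b.value
    if isinstance(a, Var) and isinstance(b, Var):
        for x, y in env:                      # innermost binder first
            if a.name == x or b.name == y:
                return a.name == x and b.name == y
        return a.name == b.name
    if isinstance(a, App) and isinstance(b, App):
        return alpha_eq(a.fn, b.fn, env) and alpha_eq(a.arg, b.arg, env)
    if isinstance(a, Lam) and isinstance(b, Lam):
        return a.ty == b.ty and alpha_eq(a.body, b.body, ((a.param, b.param),) + env)
    return False


# \a:int.(\x:int.x) a  and  \b:int.b  have alpha-equal normal forms.
lhs = Lam("a", "int", App(Lam("x", "int", Var("x")), Var("a")))
assert alpha_eq(normalize(lhs), normalize(Lam("b", "int", Var("b"))))
```

Returning a list of reducts rather than a single successor keeps the sketch faithful to the relation as stated: the choice of which redex to contract is left to the caller, and normalize is just one such strategy.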
It's also easy to see that, for well-typed terms, we can map a derivation of e1 ->>* e2 to a derivation of |- e1 = e2 : t, since each of the above rules is an instance of the equational rules, and we can use transitivity to string together the reductions.

So, a strategy for proving that two terms are equal is to apply ->> until the terms no longer reduce and then compare the two results syntactically (modulo alpha-conversion). That process will be sound and complete if we can establish two further things: (a) it doesn't matter what reduction order we use, and (b) reduction always terminates.

We can use a logical relations argument, similar to the one we used for evaluation, to prove (b), i.e., that ->> is strongly normalizing. The proof is not as easy since we're working under lambdas. However, it's not that much harder.

For (a), we need to show *confluence*: if e ->>* e1 and e ->>* e2, then there exists an e' such that e1 ->>* e' and e2 ->>* e'. This tells us that it doesn't matter how we reduce the term, because by the time we finish reducing, we'll get the same result. Another way to say this is that, given termination, reduction will always lead to a unique *normal form*.

Assuming we've proved strong normalization, confluence can be established by showing *local* confluence: e ->> e1 and e ->> e2 implies there exists an e' such that e1 ->>* e' and e2 ->>* e'. Note that the untyped lambda calculus also has the confluence property, but it isn't strongly normalizing, so local confluence is not sufficient to establish its confluence.

To see that local confluence + strong normalization implies confluence, we argue as follows. First, given an expression e, we know from strong normalization (and the fact that each term has only finitely many one-step reducts) that there is a bound b on the total number of reduction steps we can take from e. We will proceed by induction on b.

Base case: suppose b = 0 (i.e., e is terminal) and we have:

    e ->>* e1 and e ->>* e2

Then it must be that

    e ->>0 e1 and e ->>0 e2

so e = e1 and e = e2, and we can take e' = e.
Induction: suppose we have confluence for all expressions with a reduction bound of at most b, and that e is an expression with a reduction bound of b+1. Suppose further that:

    e ->>* e1 and e ->>* e2

Then we have:

    e ->>n1 e1 and e ->>n2 e2

for some n1 and n2. Note that if n1 = 0 then e = e1, so e1 ->>n2 e2, and thus both e1 and e2 reduce to e2. Similarly, if n2 is zero, both e1 and e2 reduce to e1. So suppose n1 > 0 and n2 > 0. Then we have the following diagram:

           --->> e1' ---(n1-1)-->> e1
          /
         |
    e --
         |
          \
           --->> e2' ---(n2-1)-->> e2

Now by local confluence (applied to e ->> e1' and e ->> e2') we can establish the existence of an e'':

           --->> e1' ---(n1-1)-->> e1
          /          \
         |            \
    e --               ----------->>* e''
         |            /
          \          /
           --->> e2' ---(n2-1)-->> e2

Now e1' and e2' can have reduction bounds of at most b, since otherwise e would have a reduction longer than b+1. So our induction hypothesis applies to both e1' and e2'. That means that there exist e1'' and e2'' such that:

           --->> e1' ---(n1-1)-->> e1 ------->>* e1''
          /          \                          /
         |            \                        /
    e --               ----------->>* e'' ----
         |            /                        \
          \          /                          \
           --->> e2' ---(n2-1)-->> e2 ------->>* e2''

Applying the induction hypothesis again, this time to e'' (which must also have a reduction bound of at most b, since e1' ->>* e''), we can thus find an e' such that:

           --->> e1' ---(n1-1)-->> e1 ------->>* e1''
          /          \                          /     \
         |            \                        /       \
    e --               ----------->>* e'' ----          -->>* e'
         |            /                        \       /
          \          /                          \     /
           --->> e2' ---(n2-1)-->> e2 ------->>* e2''

Hence e1 ->>* e' and e2 ->>* e', as required.

So all that remains is to establish local confluence. This can be done on a case-by-case basis by looking at all possible pairs of reductions that can be performed on a given term. For instance, suppose e1 ->> e1' and e2 ->> e2' so we have:

                --->> e1' e2
               /
    e1 e2 ---
               \
                --->> e1 e2'

It's pretty clear that we can apply the same reductions in the opposite order to get to e1' e2':

                --->> e1' e2 --
               /               \
    e1 e2 ---                   --->> e1' e2'
               \               /
                --->> e1 e2' --

since the reductions don't overlap. In fact, a careful case analysis reveals only a couple of situations where we have *critical pairs* in our reduction.
For example:

    (\x.e1) e2 ->> (\x.e1) e2'
    (\x.e1) e2 ->> e1[e2/x]

Putting these two together means we need to prove a lemma about reduction commuting with substitution (i.e., that if e2 ->> e2', then e1[e2/x] ->>* e1[e2'/x]). Similarly, we'll need to establish that when e1 ->> e1', then e1[e2/x] ->>* e1'[e2/x]. In general, you'll need to do a tedious case analysis of all the possible ways that the rules can interact to show local confluence. For this little language, it's not that bad, but it's clear that this analysis becomes difficult as we add more kinds of (overlapping) reduction rules to the language.

To sum up: it's possible to automatically decide whether two STLC terms are equivalent by normalizing them and then comparing the normal forms for equality. The justification for this is given by an equational theory that is both sound and complete, and is easily shown to be so by using the denotational semantics, which was proven adequate w.r.t. the operational semantics.

You may be wondering why we went to all of this trouble given that decidability is going to get tossed out the door when we start adding more elaborate features (e.g., recursion, refs, etc.). The reason is that we'll be using variants of this language at the *type* level later on, where we'll need to establish that two types are equal as part of type-checking.

It's worth noting that this equational theory scales up in some interesting ways. In particular, it's possible to add support for products (pairs), with the key rules being:

    G |- <e1,e2> : t1*t2
    -------------------------
    G |- #1 <e1,e2> = e1 : t1

    G |- e : t1*t2
    -----------------------------
    G |- <#1 e, #2 e> = e : t1*t2

However, adding support for unit (or zero) and sums is somewhat problematic. For instance, consider that we should have a rule for unit like this:

    G |- e : unit
    ------------------
    G |- e = () : unit

But it becomes clear that our normalizer must now take types into account.
That is, we can't just use straightforward rewriting on the untyped terms to figure out when we should replace an expression e with (). Note that this problem also arises for ML, in that when we see #1 x in ML, we can't tell locally whether x is a pair or a triple or ...

It's worth thinking about what the normal forms for sums should be and why that's so hard. Consider that to get completeness, we'll need rules like:

    (case e of inl(x) => inl(x) | inr(x) => inr(x)) == e

In addition, consider that when we have:

    (case e1 of
       inl(x1) => (case e2 of inl(x2) => e3 | inr(x2) => e4)
     | inr(x1) => (case e2 of inl(x2) => e5 | inr(x2) => e6))

(where e3 through e6 stand for arbitrary branch bodies) we could just as well do the case analysis on e2 before we do the case analysis on e1. So any notion of normal forms is going to have to be modulo "rotations" of case expressions.

Thought:

    (t1 + t2) -> t  =~=  (t1 -> t) * (t2 -> t)

More on this later...
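As a concrete reading of the closing thought, the two directions of the isomorphism can be written down directly. This is only an illustrative sketch in Python: the function names split and join, and the encoding of sum values as tagged pairs ("inl", v) / ("inr", v), are assumptions made here and do not come from the notes.

```python
# Sum values are encoded as tagged pairs ("inl", v) or ("inr", v);
# this encoding is an assumption made for the sketch.

def split(f):
    """(t1 + t2) -> t   to   (t1 -> t) * (t2 -> t)."""
    return (lambda a: f(("inl", a)), lambda b: f(("inr", b)))


def join(pair):
    """(t1 -> t) * (t2 -> t)   to   (t1 + t2) -> t."""
    g, h = pair

    def f(s):
        tag, v = s
        return g(v) if tag == "inl" else h(v)

    return f


# A function out of int + int, and its round trip through the pair form.
def f(s):
    return s[1] + 1 if s[0] == "inl" else s[1] * 2

g, h = split(f)
assert g(3) == f(("inl", 3)) and h(3) == f(("inr", 3))
assert join(split(f))(("inr", 5)) == f(("inr", 5))
```

The round trip join(split(f)) agrees with f on every tagged input, which is the pointwise content of the isomorphism; a function out of a sum is determined by what it does on each injection.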