CPS Conversion:

Just as it's possible to compile exceptions away, it's also possible to
compile away letcc.  The translation is known as CPS conversion, and it is a
very interesting translation with many good properties, from both a
theoretical and a practical perspective.

Let us begin by defining our type translation (restricting ourselves to a
suitable subset of the language):

  V[unit]    = unit
  V[t1*t2]   = V[t1]*V[t2]
  V[t1->t2]  = V[t1] -> C[t2]
  V[cont(t)] = V[t] -> Ans

  C[t] = (V[t] -> Ans) -> Ans

So a computation in our target language will be a function from continuations
(i.e., stacks) to answers.  The continuation will take in the value that
we're going to "return".  Here's one possible translation (due to Fischer):

  E[()]      = \k.k ()
  E[x]       = \k.k x
  E[(e1,e2)] = \k.E[e1](\v1.E[e2](\v2.k (v1,v2)))
  E[#i e]    = \k.E[e](\v.k (#i v))
  E[\x:t.e]  = \k.k (\x:V[t].\c.E[e] c)
  E[e1 e2]   = \k.E[e1](\v1.E[e2](\v2.v1 v2 k))
  E[letcc x in e] = \k.(E[\x.e] k k)
  E[throw e1 e2]  = \k.E[e1](\v1.E[e2](\v2.v1 v2))

We translate a closed program e by providing it with an initial empty stack:

  P[e] = E[e] (\v.v)

A few things to note about the translation:

First, when we perform an elimination operation (e.g., a function call or a
projection off a tuple), we're always manipulating *values* or variables.  In
effect, the translation automatically names all of the intermediate
computations, much like the lowering phase of a compiler from tree code to
some linear representation.

Second, all of the function calls are tail calls.  (Well, almost -- the
curried application in E[e1 e2] is technically compound, but in practice we
can avoid this by using a tuple.)  So, if you ran this on our stack-based
abstract machine, it would never allocate a stack frame!

Third, notice that in the letcc rule we duplicate the continuation that we
are given.  We pass it to \x.e once as the let-bound value, and again as the
current "stack".  This is where the need to support stack copying comes into
play.
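To make the Fischer translation concrete, here is a sketch of it applied by hand in Python, with continuations represented as ordinary one-argument closures; the combinator names (ret, pair, letcc, ...) are illustrative, not part of the notes.

```python
# Each combinator builds E[e]: a function from a continuation k
# (a one-argument closure, playing the role of the stack) to an answer.

def ret(v):            # E[()] and E[x]: hand the value straight to k
    return lambda k: k(v)

def pair(c1, c2):      # E[(e1,e2)]
    return lambda k: c1(lambda v1: c2(lambda v2: k((v1, v2))))

def proj(i, c):        # E[#i e] (0-indexed here)
    return lambda k: c(lambda v: k(v[i]))

def lam(f):            # E[\x.e]: the value takes x, then a continuation c
    return lambda k: k(lambda x: lambda c: f(x)(c))

def app(c1, c2):       # E[e1 e2]: every source-level call is a tail call
    return lambda k: c1(lambda v1: c2(lambda v2: v1(v2)(k)))

def letcc(f):          # E[letcc x in e]: k is duplicated -- it is both
    return lambda k: f(k)(k)   # the bound value and the current stack

def throw(c1, c2):     # E[throw e1 e2]: the current k is discarded
    return lambda k: c1(lambda v1: c2(lambda v2: v1(v2)))

def run(c):            # P[e] = E[e](\v.v)
    return c(lambda v: v)

# #0 (letcc k in throw k (42, 0)) -- the throw escapes back through k:
print(run(proj(0, letcc(lambda k: throw(ret(k), ret((42, 0)))))))  # 42
```

Because every call becomes a tail call on v1(v2)(k), a target language with proper tail calls would run this in constant stack space; Python does not eliminate tail calls, so this only illustrates the structure.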
If you omit the letcc rule, you'll notice that we *never* copy the
continuation -- it remains linear throughout the code.  It's this linearity
that makes it possible to update the stack in place in a language without
first-class continuations.

------------------------------------------------------------------
The Danvy-Filinski Translation:

Now, this isn't the most efficient translation, because it introduces a lot
of administrative reductions.  For instance, if we have:

  (\x.x) ()

then this will blow up into:

  \k1.(\k2.k2 (\x.\c.(\k3.k3 x) c)) ...

and there are lots of little beta-reductions that we could eliminate.  We
could do this in a post-pass, or we could try to arrange the translation so
that these extra reductions never show up in the target code.

The key is to make a distinction between the meta level and the target level.
In some sense, the translation itself is written in continuation-passing
style, where the initial \k is a meta-level continuation.  By making the
distinction explicit, we can avoid generating the administrative redexes.  So
here's an alternative (wrong) translation, where I will use capital letters
for meta-level continuation variables, fn v => e for meta-level
continuations, and K[e] for meta-level application:

  E[()] K      = K[()]
  E[x] K       = K[x]
  E[(e1,e2)] K = E[e1](fn v1 => E[e2](fn v2 => K[(v1,v2)]))
  E[#i e] K    = E[e](fn v => K[#i v])
  E[\x:t.e] K  = K[\x:V[t].\c.E[e] c]
  E[e1 e2] K   = E[e1](fn v1 => E[e2](fn v2 => v1 v2 K))

We immediately see that this doesn't type-check: in the E[\x:t.e] rule, we're
passing an object-level continuation c as an argument to E[e].  The trick is
to eta-expand:

  E[\x:t.e] K = K[\x:V[t].\c.E[e] (fn v => c v)]

Notice that we're now passing a *meta-level* lambda to E here.  Next, we see
a problem with E[e1 e2]: we're using a meta-level continuation K as the
argument to an object-level function in v1 v2 K.
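The blow-up can be seen mechanically.  Below is a sketch of the naive Fischer-style transform over a tiny AST in Python (the AST encoding and helper names are my own, chosen for illustration); every rule wraps its output in a fresh \k, so administrative redexes pile up even for (\x.x) ().

```python
import itertools
_ctr = itertools.count()

def gensym(base="k"):
    # Fresh names for the continuation, value, and closure variables.
    return f"{base}{next(_ctr)}"

def cps(e):
    """Naive CPS transform.  e is 'unit', a variable name (str),
    ('lam', x, body), or ('app', e1, e2); the output is a term string."""
    k = gensym()
    if e == "unit":
        return f"\\{k}.{k} ()"
    if isinstance(e, str):                       # variable
        return f"\\{k}.{k} {e}"
    if e[0] == "lam":
        _, x, body = e
        c = gensym("c")
        return f"\\{k}.{k} (\\{x}.\\{c}.({cps(body)}) {c})"
    _, e1, e2 = e                                # ('app', e1, e2)
    v1, v2 = gensym("v"), gensym("v")
    return f"\\{k}.({cps(e1)}) (\\{v1}.({cps(e2)}) (\\{v2}.{v1} {v2} {k}))"

out = cps(("app", ("lam", "x", "x"), "unit"))
print(out)   # a term littered with administrative \k-redexes
```

Even this three-node program produces four \k binders, only one of which (the outermost) is not an administrative redex.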
Again, eta does the trick, but this time at the object level:

  E[e1 e2] K = E[e1](fn v1 => E[e2](fn v2 => v1 v2 (\v.K[v])))

This final translation:

  E[()] K      = K[()]
  E[x] K       = K[x]
  E[(e1,e2)] K = E[e1](fn v1 => E[e2](fn v2 => K[(v1,v2)]))
  E[#i e] K    = E[e](fn v => K[#i v])
  E[\x:t.e] K  = K[\x:V[t].\c.E[e] (fn v => c v)]
  E[e1 e2] K   = E[e1](fn v1 => E[e2](fn v2 => v1 v2 (\v.K[v])))

is due to Danvy and Filinski, and it is a beautiful piece of work because it
eliminates all of those pesky administrative reductions in a single pass.
In compiler terms, it produces good code without needing a post-pass to
clean things up.

------------------------------------------------------------------
Correctness of the Translation:

We would like to prove that whenever |- e : int, then e => i iff
E[e](fn v => v) => i.  There are at least two ways that we can prove the
correctness of this translation.

One approach is to try to prove a syntactic diamond property:

  e1 -> e2 implies E[e1](fn v => v) ->* E[e2](fn v => v)

Let's try this by induction on e1:

case e1 = x: can't happen.

case e1 = i: e1 can't step.

case e1 = ea eb: there are three sub-cases to consider.

1. Suppose ea -> ea'.  We must show

     E[ea eb](fn v => v) ->* E[ea' eb](fn v => v).

   From the translation we know that:

     E[ea eb](fn v => v)  = E[ea](fn va => E[eb](fn vb => va vb (\v.v)))
     E[ea' eb](fn v => v) = E[ea'](fn va => E[eb](fn vb => va vb (\v.v)))

   Aha!  Now we see that our induction hypothesis is not strong enough,
   since we're not using the identity meta-continuation for E[ea].  So
   let's go back and strengthen the hypothesis:

     e1 -> e2 implies for any K, E[e1]K ->* E[e2]K

   Now we must show E[ea eb]K ->* E[ea' eb]K.  Again, from the
   translation we have:

     E[ea eb]K  = E[ea](fn va => E[eb](fn vb => va vb (\v.K[v])))
     E[ea' eb]K = E[ea'](fn va => E[eb](fn vb => va vb (\v.K[v])))

   By induction, since ea -> ea', we know that E[ea]K' ->* E[ea']K' for
   any K'.  Picking K' = (fn va => E[eb](fn vb => va vb (\v.K[v])))
   yields the result.

2. Suppose ea = va and eb -> eb'.
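The two-level idea can be sketched directly in Python: the meta-level continuation K is a host-language function from value terms to terms, while object-level continuations are just generated binders.  (The AST encoding and names below are my own, for illustration.)

```python
import itertools
_ctr = itertools.count()

def gensym(base):
    return f"{base}{next(_ctr)}"

def cps(e, K):
    """One-pass (Danvy-Filinski-style) transform.  K is a meta-level
    continuation -- a Python function from a value term to a term --
    so no administrative redexes are ever built."""
    if e == "unit":
        return K("()")
    if isinstance(e, str):                   # variable
        return K(e)
    if e[0] == "lam":                        # eta-expand the object-level c
        _, x, body = e
        c = gensym("c")
        return K(f"(\\{x}.\\{c}.{cps(body, lambda v: f'{c} {v}')})")
    _, e1, e2 = e                            # eta-expand K at the object level
    r = gensym("v")
    return cps(e1, lambda f:
           cps(e2, lambda a: f"{f} {a} (\\{r}.{K(r)})"))

out = cps(("app", ("lam", "x", "x"), "unit"), lambda v: v)
print(out)   # (\x.\c1.c1 x) () (\v0.v0)
```

Compare with the naive transform: the only beta-redex left in the output is the real application from the source program.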
   We must show E[va eb]K ->* E[va eb']K.  We have:

     E[va eb]K  = E[va](fn va => E[eb](fn vb => va vb (\v.K[v])))
                = E[eb](fn vb => E[va] vb (\v.K[v]))
     E[va eb']K = E[va](fn va => E[eb'](fn vb => va vb (\v.K[v])))
                = E[eb'](fn vb => E[va] vb (\v.K[v]))

   By induction, since eb -> eb', we know that E[eb]K' ->* E[eb']K' for
   any K'.  Picking K' = (fn vb => E[va] vb (\v.K[v])) does the trick.

   Note that here we're able to reduce the meta-level application because
   we know that E[va] always yields a value.  If we didn't have the
   optimized translation, we could only conclude that what we get out is
   beta-eta equal to what the translation yields.

3. Suppose ea = (\x.e) and eb = vb, so e2 = e{vb/x}.  Then we have:

     E[(\x.e) vb]K = E[\x.e](fn va => E[vb](fn vb => va vb (\v.K[v])))
                   = (\x.\c.E[e](fn v => c v)) E[vb] (\v.K[v])
                   ->* (E[e]K){E[vb]/x}

   So now all we have to do is argue that the translation commutes with
   substitution, and we're done.

One interesting thing about this argument is that it doesn't use types at
all -- so the result is valid for both the simply-typed and the untyped
lambda calculus.  However, it's also important to note that the proof only
went through because we cleverly optimized the translation.  If we tried to
use the original Fischer translation, then we'd have to argue that the
results are semantically equivalent using the equational theory at the
object level.  In turn, that would require proving that beta-value and
eta-value hold for the language -- not trivial for the untyped lambda
calculus.

An alternative approach to proving the correctness of even the unoptimized
translation is to use a logical-relations argument.  The advantage of
logical relations is that they are not so sensitive to the syntactic details
of the translation.  The disadvantage is that it can often be quite
difficult to (a) set up the relations to be strong enough to prove what we
want, or (b) show that the translation does indeed satisfy the relation.
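Case 3 can be spot-checked concretely.  With continuations modeled as Python closures (a minimal re-sketch; the combinator names are illustrative), the translated redex and its contractum produce the same answer under the same continuation:

```python
# Minimal closure-based combinators, as in the Fischer translation:
def ret(v):  return lambda k: k(v)                     # E[v]
def lam(f):  return lambda k: k(lambda x: lambda c: f(x)(c))
def pair(c1, c2):
    return lambda k: c1(lambda v1: c2(lambda v2: k((v1, v2))))
def app(c1, c2):
    return lambda k: c1(lambda v1: c2(lambda v2: v1(v2)(k)))

# (\x.(x,x)) 7 steps to (7,7); both sides of the beta step agree
# under the identity continuation.
body = lambda x: pair(ret(x), ret(x))    # E[(x,x)] with x free
redex = app(lam(body), ret(7))           # E[(\x.(x,x)) 7]
contractum = body(7)                     # E[(x,x)]{7/x}

ident = lambda v: v
assert redex(ident) == contractum(ident) == (7, 7)
```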
Nonetheless, this is a very powerful technique with which you should be
familiar.  In our LR proof, we're going to construct a simulation relation
between our stack-based machine and CPS terms.  In particular, let us define:

  EM = { (<S,e1>, e2) | <S,e1> => i iff e2 => i }

This says that a stack-machine configuration <S,e1> is EM-related to a CPS
term e2 iff they both evaluate to the same integer i.  Our ultimate goal is
to show that if |- e : int, then (<nil,e>, E[e](fn k => k)) is in EM.  To do
so, we'll define some useful, type-indexed auxiliary relations:

  EV[int]      = { (i,i) }
  EV[t1 -> t2] = { (f1,f2) | forall (a1,a2) in EV[t1],
                             forall (S,k) in EC[t2].
                               (<S, f1 a1>, f2 a2 k) in EM }
  EC[t] = { (S,k) | forall (v1,v2) in EV[t]. (<S,v1>, k v2) in EM }

So two values (v1,v2) are in EV[int] iff v1 and v2 are the same integer.
Two values (f1,f2) are in EV[t1->t2] iff, when given related arguments a1
and a2 and related contexts S and k, running them produces the same integer.

We lift the value relation to contexts by saying that two substitutions g1
and g2 mapping variables to values are related at G when:

  EV[G] = { (g1,g2) | forall x in Dom(G). (g1(x),g2(x)) in EV[G(x)] }

Finally, the theorem we want to show is:

Thm: if G |- e : t, then for all (g1,g2) in EV[G] and all (S,k) in EC[t],
(<S, g1(e)>, g2(E[e]) k) in EM.

If we can establish this, then as an immediate corollary we have:

Corollary: If |- e : int, then <nil,e> => i iff E[e](\v.v) => i.

Proof: Picking g1 = g2 to be the empty substitution, we have via the theorem
above that for all (S,k) in EC[int], (<S,e>, E[e] k) in EM, which means
<S,e> => i iff E[e] k => i.  So it suffices to show that (nil, \v.v) is in
EC[int].  Let (a1,a2) in EV[int].  Then a1 = a2 = j for some j.  Then
<nil,j> => j and E[j](\x.x) = (\k.k j)(\x.x) => j.  So (nil, \x.x) in
EC[int].

We leave the proof of the rest of the theorem as an exercise.
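The corollary can be sanity-checked by running both semantics on small programs.  In this sketch a direct recursive evaluator stands in for the stack machine, and the CPS evaluator plays the role of the translated term; the tiny AST and names are my own.

```python
def ev(e, env):
    """Direct-style evaluator for a tiny call-by-value language:
    ints, variables (str), ('lam', x, body), ('app', e1, e2)."""
    if isinstance(e, int):  return e
    if isinstance(e, str):  return env[e]
    if e[0] == "lam":
        _, x, b = e
        return lambda a: ev(b, {**env, x: a})
    _, e1, e2 = e                                  # application
    return ev(e1, env)(ev(e2, env))

def cps_ev(e, env, k):
    """The same language evaluated in CPS: k is the continuation,
    and translated functions take both an argument and a continuation."""
    if isinstance(e, int):  return k(e)
    if isinstance(e, str):  return k(env[e])
    if e[0] == "lam":
        _, x, b = e
        return k(lambda a, c: cps_ev(b, {**env, x: a}, c))
    _, e1, e2 = e
    return cps_ev(e1, env, lambda f:
           cps_ev(e2, env, lambda a: f(a, k)))

# (\f. f 42) (\x. x): both semantics must produce the same integer.
prog = ("app", ("lam", "f", ("app", "f", 42)), ("lam", "x", "x"))
assert ev(prog, {}) == cps_ev(prog, {}, lambda v: v) == 42
```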