CS252r: Advanced Functional Language Compilation

Fall 2012: Maxwell Dworkin 323, Mon-Wed-Fri 3-4pm

Greg Morrisett




Background Reading and Resources


Continuation Passing Style

Simple Reductions

Closure Conversion

Code Generation

Runtime Systems






This semester's version of CS252r will be a group-oriented project course aimed at building a compiler for a functional language. In particular, the class as a whole will be building an optimizing compiler for the Coq Proof Assistant.

As a development environment, Coq lets you write functional programs with very rich, dependent types, and write formal, machine-checked proofs about those programs. Furthermore, Coq lets you extract the functional bits into Haskell, Scheme, or OCaml code that can then be executed.

Unfortunately, the quality of the extracted code is not that good, and existing compilers do a poor job of getting rid of the inefficiencies. In part, this is because extracted Coq code has some idioms that do not arise in hand-written Haskell, Scheme, or OCaml code, and in part because the compilers for these languages lack deep knowledge about the semantics of the code. For instance, because all Coq functions are effect free and terminate, we can evaluate a function using call-by-value, call-by-name, or call-by-need and get the same result, whereas Haskell, Scheme, and OCaml must each pick only one of these evaluation strategies.
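To make the point about evaluation strategies concrete, here is a minimal OCaml sketch. It is only an illustration, not part of the course compiler: the names `expensive`, `double_cbv`, `double_cbn`, `double_need`, and `run` are hypothetical, and the counter exists solely to observe how often the argument is evaluated. Call-by-name is simulated with a thunk and call-by-need with OCaml's `Lazy` module.

```ocaml
(* Counter used only to observe how often the argument is evaluated. *)
let evals = ref 0

(* Stands in for a pure, terminating computation; the counter bump is
   instrumentation, not part of the "program" being evaluated. *)
let expensive () = incr evals; 21

(* Call-by-value: the argument is evaluated once, before the call. *)
let double_cbv (x : int) : int = x + x

(* Call-by-name: the argument is a thunk, re-evaluated at every use. *)
let double_cbn (x : unit -> int) : int = x () + x ()

(* Call-by-need: the argument is evaluated at most once, on first use. *)
let double_need (x : int Lazy.t) : int = Lazy.force x + Lazy.force x

(* Reset the counter, run f, and return (result, number of evaluations). *)
let run f =
  evals := 0;
  let r = f () in
  (r, !evals)

let cbv = run (fun () -> double_cbv (expensive ()))
let cbn = run (fun () -> double_cbn expensive)
let need = run (fun () -> double_need (lazy (expensive ())))
```

All three strategies produce the same result; they differ only in how many times the argument is evaluated. That freedom is what a compiler could exploit when it knows the code is effect free and terminating.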

Along the way, students will learn the basics of how to build a functional language compiler (e.g., CPS and closure conversion), as well as key topics in program analysis and optimization. We are going to develop the compiler as a group, so students will also gain practical team skills, such as how to perform code reviews.
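As a rough preview of the two transformations named above, the following OCaml sketch applies each one by hand to a tiny function. It is an assumption-laden illustration rather than the course's actual intermediate representation: the names `fact_cps`, `closure`, `Clo`, `apply`, `adder_code`, and `make_adder_cc` are all hypothetical.

```ocaml
(* --- Continuation-passing style (CPS) --- *)

(* Direct style: the recursive call is not a tail call. *)
let rec fact n = if n = 0 then 1 else n * fact (n - 1)

(* CPS: the function takes an extra continuation k that says what to do
   with the result, and every call becomes a tail call. *)
let rec fact_cps n k =
  if n = 0 then k 1
  else fact_cps (n - 1) (fun r -> k (n * r))

(* --- Closure conversion --- *)

(* Source: the inner function mentions the free variable x. *)
let make_adder x = fun y -> x + y

(* A closure pairs closed code with the environment it needs. The type of
   the environment is hidden (existential), so closures with different
   environments share one type. *)
type ('a, 'b) closure =
  | Clo : ('env -> 'a -> 'b) * 'env -> ('a, 'b) closure

(* Applying a closure passes its own environment to its code. *)
let apply (type a b) (c : (a, b) closure) (arg : a) : b =
  match c with
  | Clo (code, env) -> code env arg

(* Closed code: x now arrives explicitly as the environment. *)
let adder_code (x : int) (y : int) : int = x + y

(* Converted make_adder: build the closure record explicitly. *)
let make_adder_cc (x : int) : (int, int) closure = Clo (adder_code, x)
```

For example, `fact_cps 5 (fun r -> r)` computes the same value as `fact 5`, and `apply (make_adder_cc 3) 4` computes the same value as `make_adder 3 4`.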

We will also be reading a number of classic functional language compiler papers on topics ranging from intermediate representations to run-time systems.


We will develop the compiler in Coq, though we will not attempt to prove anything about Coq. Writing code in Coq is similar to writing in OCaml, so most students will have no trouble adapting, as long as they have experience with some ML dialect. Ideally, students should have taken CS153 (Compilers), but exceptions will be entertained at the discretion of the instructor.

Format and Assessment

Each programming task for the compiler will fall into one of two tiers. Tier 1 tasks must be implemented individually by each student, whereas Tier 2 tasks will be done by a group of students. Each student will work on at least two Tier 2 tasks.

Tier 1 tasks include the following:

This will give you a working compiler and run-time system, though hardly optimal.

Tier 2 tasks include at least the following:

though students are free to propose their own tasks.