Research
Recent developments in programming languages have broad implications
for the safety and reliability of computer software as well as the
productivity and competitiveness of the software industry.
Trends in handheld devices like PDAs and cell phones suggest that
future applications will demand sophisticated programming languages on
diverse hardware platforms.
In a world of rapidly changing languages and hardware, we cannot
afford to spend years crafting a custom implementation of each
language on each hardware platform.
Instead, we must create
implementation tools and techniques that can quickly and easily support
new languages and new platforms.
This is the goal of my research.
My current work focuses on compiler construction:
- The ideal compiler infrastructure should support not only a
variety of target machines but also a variety of programming
languages.
Past infrastructures have focused only on what happens at
compile time.
But to support such language features as garbage collection, exception
dispatch, and concurrency, run-time support is needed.
I have proposed providing such support not by upcalls but through
a run-time interface, which reveals to the run-time
system the important decisions made in the compiler's back end.
This new organization should make compiler infrastructure
significantly easier to reuse with multiple languages.
This idea is being used in the
C-- project.
- Compiler writers have long used the idea of ``machine
description'' to make a compiler easier to port to new machines.
But a typical machine description actually describes an
algorithm, not a machine, and does so in terms of a
particular compiler's intermediate code—so it can't be reused.
In the languages
SLED and
Lambda-RTL,
I have pioneered declarative machine
descriptions, which say only what the machine does
and do so in terms that are independent of particular intermediate
codes.
Such descriptions are much easier to reuse, but the price you pay is
that the description requires more thorough analysis to be useful in a
compiler.
My students and I are developing several analyses for this purpose.
In addition to compilers, I have also done work on
interpreters,
debuggers,
other machine-level tools,
language design,
and
formal methods.
The full story can be found among my papers.
My research output often includes tools you can download and use.
- Quick
C-- includes a compiler and interpreter for C--.
If you are interested in implementing a high-level language, try
translating it to C-- and let us give you native code for several platforms.
- Lua-ML
is an implementation of the Lua
scripting language in Objective
Caml.
It is useful for scripting and configuring Caml programs.
- The New Jersey Machine-Code Toolkit
implements SLED and is useful for building code that has to analyze or
emit machine instructions: assemblers, disassemblers, binary
translators, debuggers, and the like.
- Noweb is a simple, extensible
literate-programming tool.
It supports any programming language and is widely used.
I have also written the ldb debugger
and a translator for the Lambda-RTL
machine-description language, but these tools are research prototypes
and may not be useful right out of the box.
Finally, NbibTeX has no research content, but it
is a nice replacement for BibTeX.
I was a charter member of the Triforce
group, which included Greg Morrisett,
Mike Smith, and many others.
My research has been supported by
an Alfred P. Sloan Research Fellowship,
by grants from the National Science
Foundation, from DARPA, and from the Australian Research Council,
and by fellowships
from the Fannie and John Hertz Foundation
and from AT&T Bell Labs.
Selected papers
I have collected both highly significant and recent papers.
Links are to abstracts so you can check
out the topic without downloading a monster.
For a complete view, including older work, see my
publications list.
Five most significant papers
- Relocating Machine
Instructions by Currying.
Proceedings of the ACM
SIGPLAN '96 Conference on Programming Language Design and Implementation,
in SIGPLAN Notices, 31(5):226–236, May 1996.
This paper connects two worlds: the ``low cult'' of systems programming
and the pure, mathematical world of the lambda-calculus. The key insight is
that relocation, which is a low-level operation performed on binary code, is
an instance of currying, which is the expression of a multiple-argument
function in the lambda-calculus.
- A Single Intermediate Language
That Supports Multiple Implementations of Exceptions (with Simon L. Peyton
Jones).
Proceedings of the ACM SIGPLAN '00 Conference on Programming Language
Design and Implementation, in SIGPLAN Notices,
35(5):285–298, May 2000.
This paper is the most technical and rigorous of the C-- papers.
It exemplifies what I am trying to achieve in C--: clean, low-level
mechanisms that compiler writers can use to implement different
high-level–language features and to control cost tradeoffs.
- Stochastic Lambda Calculus and
Monads of Probability Distributions (with Avi Pfeffer).
Proceedings of the 29th ACM Symposium on the Principles of Programming
Languages, in SIGPLAN Notices, 37(1):154–165, January
2002.
This paper explores the design of probabilistic languages in a
foundational, principled way. Its special contribution is to analyze
important implementation techniques in a way that is completely formal and is
rigorously connected to the theory of probability.
- A Transformational Approach to
Binary Translation of Delayed Branches (with Cristina Cifuentes).
ACM Transactions on Programming Languages and Systems,
25(2):210–224, March 2003.
The paper solves a small but difficult problem in analysis of binary
codes. This problem is repeatedly a stumbling block for industry groups that
work with binary codes, and I believe our solution is definitive.
- An Expressive Language of
Signatures (with Kathleen Fisher and Paul Govereau).
In Proceedings of the Tenth
ACM SIGPLAN International Conference on Functional Programming
(ICFP'05), pages 27–40, September 2005.
In this paper, we identified an important class of programming problems
the solutions to which cannot be expressed in current languages, and we
showed that these problems can be solved by new mechanisms that cohere with
an existing language.
Five other significant papers
- Specifying Representations of
Machine Instructions (with Mary F.
Fernández).
ACM Transactions on Programming Languages and Systems,
19(3):492–524, May 1997.
This is the most technical and the definitive description of my early
work on declarative machine descriptions.
- An Algebraic Approach to File
Synchronization (with Elöd Csirmaz).
In Proceedings of the 8th European Software Engineering Conference (ESEC)
and 9th ACM SIGSOFT Symposium on the Foundations of Software Engineering
(FSE-9), pages 175–185, September 2001.
This paper discusses how to maintain consistency among multiple
replicas of files that may be modified concurrently. We proposed reasoning
about this problem by examining the algebraic structure of a sequence of
modifications.
- A Generalized Algorithm for
Graph-Coloring Register Allocation (with Michael D. Smith and
Glenn Holloway).
ACM SIGPLAN '04
Conference on Programming Language Design and Implementation, in
SIGPLAN Notices, 39(6):277–288, June 2004.
A new technique for dealing with irregularities in target machines,
which arise because not all machine registers can be used interchangeably. We
hope this will be the definitive paper on handling irregular register files
in a graph-coloring register allocator.
- Staged Allocation: A Compositional
Technique for Specifying and Implementing Procedure Calling Conventions
(with Reuben Olinsky and Christian Lindig).
In Proceedings of the
33rd ACM Symposium on the Principles of Programming Languages,
pages 409–421, January 2006.
A specification language for parameter passing, unique in having a
formal semantics. Together with an earlier paper on stack-frame layout, a
complete approach to calling conventions.
- Embedding an Interpreted Language
Using Higher-Order Functions and Types.
Journal of Functional
Programming, 2008.
To appear. (A previous version appeared
in ACM SIGPLAN 2003
Workshop on Interpreters, Virtual Machines and Emulators.).
Homage to Olivier Danvy: How to
embed an application-specific function into an interpreter simply by
describing its type. The paper, while not technically deep, shows the
capabilities of functional languages to elegant advantage, and it is
representative of my work on interpreters.
Other recent papers
These are some recent papers not listed above.
They are in chronological order,
so the most recent work is at the bottom.
- Widening Integer Arithmetic
(with Kevin Redwine).
In 13th
International Conference on Compiler Construction (CC 2004),
LNCS volume 2985, pages 232–249, April 2004.
A principled way to run 32-bit codes on a 64-bit machine. Shows which
integer operators require sign extension, zero extension, or no extension.
The most interesting contribution is not the algorithm itself, but a simple
type system which classifies all arithmetic operations according to
how they may be handled using a machine word that is too wide. The work
generalizes to any pair of precisions.
- Declarative Composition of Stack
Frames (with Christian
Lindig).
In 13th
International Conference on Compiler Construction (CC 2004),
LNCS volume 2985, pages 298–312, April 2004.
A simple, powerful approach to stack-frame layout.
- Source-level Debugging for
Multiple Languages With Modest Programming Effort (with Sukyoung Ryu).
In 14th International Conference on
Compiler Construction (CC 2005), pages 10–26, April
2005.
How to add debugging support to a compiler while enabling the compiler
to maintain natural representations of its source language, a natural
ordering of its phases, and natural lifetimes of its internal data
structures.
- An Applicative Control-Flow Graph
Based on Huet's Zipper (with João Dias).
In ACM SIGPLAN Workshop on
ML, pages 101–122, September 2005.
A control-flow graph for doing classical imperative-style optimization
in your functional compiler.
- ML Module Mania: A Type-Safe,
Separately Compiled, Extensible Interpreter.
In ACM SIGPLAN Workshop on
ML, pages 172–202, September 2005.
A functional pearl that explains Lua-ML's extension mechanism:
separately compiled libraries. For modules hackers everywhere.
- Converting Intermediate Code
to Assembly Code Using Declarative Machine Descriptions (with João Dias).
In 15th International Conference
on Compiler Construction (CC 2006), LNCS volume 3923,
pages 217–231, March 2006.
A significant step toward our agenda of generating a compiler from
declarative machine descriptions.
- Teach Technical Writing in Two Hours
per Week, October 2006.
A handbook for teaching writing to students in science or engineering.
Accompanied by a student's edition.
- Automatically Generating Back
Ends for a Portable Assembly Language Using Declarative Machine
Descriptions (with João Dias).
Technical report, Harvard University, January 2008.
This somewhat hasty technical report presents the most exciting results
from João Dias's
doctoral dissertation. Persons actually wishing to understand the results
should ask Dias for a copy of the dissertation.
For a complete view, including older work, see my
publications list.
The ACM requires this disclaimer:
The documents contained in these pages are included
to ensure timely
dissemination of scholarly and technical work on a
non-commercial basis. Copyright and all rights therein
are maintained by the authors or by other copyright
holders, notwithstanding that they have offered their
works here electronically. It is understood that all
persons copying this information will adhere to the terms
and constraints invoked by each author's copyright.
These works may not be reposted without the explicit
permission of the copyright holder.
Teaching
In 2007–2008 I am on unpaid leave.
In the past, I have taught
CS 152 (Programming Languages),
CS 251 (Advanced Systems Programming),
CS 252r (Advanced Functional Programming),
CS 253r (Compiler Support for Run-Time Services),
and
CS 257 (Programming with Concurrency).
Resources for Students
As I am on leave, I am not holding office hours or supervising new students.
I have gathered material of interest to research
students, including resources for
writers,
how to give a talk, and
how to get
admitted to the Harvard PhD program.
Undergraduate research students might also be interested.
If you want details about our graduate programs you can consult
both an overview
and detailed
requirements.
I was program chair for ICFP '07.
After attending the Edinburgh Links
meeting in 2005, I felt compelled to share my
thoughts on the future of functional
programming.
I no longer maintain a
hot list; this is more of a random list.
An interest in personal productivity and pointers from
Benjamin Pierce
and
Phil Wadler
got me to Inbox Zero
on Wed 21 Feb 2007 at 6:00 PM.
After serious lossage caused by various alarums and excursions,
I recovered Zero at 6:30 PM on Mon 31 Dec 2007.
It was a pleasure to start the New Year with an empty inbox!
I'm a Bellcore alumnus
and a
long-standing
member
of the
Luxuriant Flowing Hair Club for Scientists,
and I have an Erdös
Number of 5.
Despite these scientific credentials,
I'm not ashamed to subscribe to a magazine with a centerfold.
Although it surprises some people, for over thirty years I have
been a
football
fan.
When it's not football season, I've been known to play Guild Wars.
I also have a rare autographed copy of
Ad Verbum.
In my copious free time, I enjoy overworking.
I eat at the
Harkness
Cafeteria (where Anita used to make excellent vegetarian food) and at the
Oxford Spa (where everyone makes excellent sandwiches).
I try to avoid P. J. Brown's deadly sins.
I'm married to a licensed
psychologist.
We do not have a Department of Computer Science at Harvard.
My professional home is in the
Harvard School of Engineering and
Applied Sciences.