Greg Morrisett

Allen B. Cutting Professor of Computer Science
School of Engineering and Applied Sciences
Harvard University

33 Oxford Street
151 Maxwell Dworkin Hall
Cambridge, MA 02138

Assistant: Susan Welby
Maxwell Dworkin 239
Susan's phone: +1 (617) 496-7592

Office phone: +1 (617) 495-9526
Fax: +1 (617) 495-2498


C. Hritcu, M. Greenberg, B. Karel, B. C. Pierce, and G. Morrisett. All your IFCException are belong to us. In Proceedings of the 34th IEEE Symposium on Security and Privacy (Oakland), May 2013. To appear. [ bib | .pdf ]

A. P. Randles, D. Rand, C. Lee, G. Morrisett, J. Sircar, M. Nowak, and H. Pfister. Massively parallel model of extended memory use in evolutionary game dynamics. In 27th IEEE International Parallel and Distributed Processing Symposium, May 2013. To appear. [ bib ]

U. Dhawan, A. Kwon, E. Kadric, C. Hritcu, B. C. Pierce, J. M. Smith, G. Malecha, G. Morrisett, T. F. Knight, Jr., A. Sutherland, T. Hawkins, A. Zyxnfryx, D. Wittenberg, P. Trei, S. Ray, G. Sullivan, and A. DeHon. Hardware support for safety interlocks and introspection. In SASO Workshop on Adaptive Host and Network Security, Sept. 2012. [ bib | .pdf ]

G. Morrisett, G. Tan, J. Tassarotti, J.-B. Tristan, and E. Gan. RockSalt: Better, faster, stronger SFI for the x86. In Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation, PLDI '12, pages 395-404, New York, NY, USA, 2012. ACM. [ bib | DOI | http ]

Software-based fault isolation (SFI), as used in Google's Native Client (NaCl), relies upon a conceptually simple machine-code analysis to enforce a security policy. But for complicated architectures such as the x86, it is all too easy to get the details of the analysis wrong. We have built a new checker that is smaller, faster, and has a much reduced trusted computing base when compared to Google's original analysis. The key to our approach is automatically generating the bulk of the analysis from a declarative description which we relate to a formal model of a subset of the x86 instruction set architecture. The x86 model, developed in Coq, is of independent interest and should be usable for a wide range of machine-level verification tasks.

Keywords: domain-specific languages, software fault isolation
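The address-masking idiom that SFI checkers like RockSalt verify can be sketched in a few lines of Haskell. The sandbox base and size below are illustrative stand-ins, not NaCl's actual memory layout:

```haskell
import Data.Bits ((.&.), (.|.))
import Data.Word (Word32)

-- Hypothetical parameters: a 64 MiB sandbox based at 0x20000000.
sandboxBase, sandboxMask :: Word32
sandboxBase = 0x20000000
sandboxMask = 0x03FFFFFF  -- low 26 bits: offsets within a 64 MiB region

-- Force any address into the sandbox before a memory access.
sandbox :: Word32 -> Word32
sandbox addr = sandboxBase .|. (addr .&. sandboxMask)
```

A checker's job is then reduced to verifying that every store in the binary is preceded by exactly this kind of masking sequence, which is why getting the instruction-decoding details right matters so much.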

A. DeHon, B. Karel, B. Montagu, B. Pierce, J. Smith, T. Knight, S. Ray, G. Sullivan, G. Malecha, G. Morrisett, R. Pollack, R. Morisset, and O. Shivers. Preliminary design of the SAFE platform. In Proceedings of the 6th Workshop on Programming Languages and Operating Systems (PLOS 2011). ACM, Oct. 2011. [ bib ]

SAFE is a clean-slate design for a secure host architecture, coupling advances in programming languages, operating systems, and hardware, and incorporating formal methods at every step. The project is still at an early stage, but we have identified a set of fundamental architectural choices that we believe will work together to yield a high-assurance system. We sketch the current state of the design and discuss several of these choices.

B. Zeng, G. Tan, and G. Morrisett. Combining control-flow integrity and static analysis for efficient and validated data sandboxing. In 18th ACM Conference on Computer and Communications Security. ACM, Oct. 2011. [ bib ]

In many software attacks, inducing an illegal control-flow transfer in the target system is one common step. Control-Flow Integrity (CFI) protects a software system by enforcing a pre-determined control-flow graph. In addition to providing strong security, CFI enables static analysis on low-level code. This paper evaluates whether CFI-enabled static analysis can help build efficient and validated data sandboxing. Previous systems generally sandbox memory writes for integrity, but avoid protecting confidentiality due to the high overhead of sandboxing memory reads. To reduce overhead, we have implemented a series of optimizations that remove sandboxing instructions if they are proven unnecessary by static analysis. On top of CFI, our system adds only 2.7% runtime overhead on SPECint2000 for sandboxing memory writes and adds modest 19% for sandboxing both reads and writes. We have also built a principled data-sandboxing verifier based on range analysis. The verifier checks the safety of the results of the optimizer, which removes the need to trust the rewriter and optimizer. Our results show that the combination of CFI and static analysis has the potential of bringing down the cost of general inlined reference monitors, while maintaining strong security.
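The check-elimination idea can be illustrated with a toy interval domain: if an abstract value is provably inside the sandbox, the guarding instruction is redundant. This is a sketch only; the paper's range analysis works on low-level code, and all names here are invented:

```haskell
-- A toy interval abstract domain.
data Interval = Interval Integer Integer deriving (Eq, Show)

-- Abstract addition: add the bounds pointwise.
iAdd :: Interval -> Interval -> Interval
iAdd (Interval a b) (Interval c d) = Interval (a + c) (b + d)

-- Is every concrete value of the interval within [base, base+size)?
-- If so, the sandboxing instruction guarding this access can be removed.
inSandbox :: Integer -> Integer -> Interval -> Bool
inSandbox base size (Interval a b) = a >= base && b < base + size
```

A base pointer known to lie in the sandbox plus a small bounded offset needs no masking; an offset with an unknown (large) bound still does.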

J.-B. Tristan, P. Govereau, and G. Morrisett. Evaluating value-graph translation validation for LLVM. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), New York, NY, USA, 2011. ACM. [ bib | .pdf ]

Translation validators are static analyzers that attempt to verify that program transformations preserve semantics. Normalizing translation validators do so by trying to match the value-graphs of an original function and its transformed counterpart. In this paper, we present the design of such a validator for LLVM's intra-procedural optimizations, a design that does not require any instrumentation of the optimizer, nor any rewriting of the source code to compile, and needs to run only once to validate a pipeline of optimizations. We present the results of our preliminary experiments on a set of benchmarks that include GCC, a Perl interpreter, sqlite3, and other C programs.
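A normalizing validator in miniature, assuming only constant folding and operand ordering for commutative operators as the normalization rules (the real validator handles far more):

```haskell
-- A tiny expression language standing in for a value graph.
data Expr = Var String | Lit Int | Add Expr Expr | Mul Expr Expr
  deriving (Eq, Ord, Show)

-- Normalize: fold constants and canonically order commutative operands.
normalize :: Expr -> Expr
normalize (Add a b) = norm2 Add (+) (normalize a) (normalize b)
normalize (Mul a b) = norm2 Mul (*) (normalize a) (normalize b)
normalize e = e

norm2 :: (Expr -> Expr -> Expr) -> (Int -> Int -> Int) -> Expr -> Expr -> Expr
norm2 _ op (Lit x) (Lit y) = Lit (op x y)
norm2 con _ a b = con (min a b) (max a b)

-- The validator accepts a transformation iff both sides normalize alike.
validate :: Expr -> Expr -> Bool
validate a b = normalize a == normalize b
```

So `x + 2` validates against `2 + x`, and `2 * 3` against `6`, without instrumenting whichever "optimizer" produced the second expression.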

J. Siefers, G. Tan, and G. Morrisett. Robusta: Taming the native beast of the JVM. In 17th ACM Conference on Computer and Communications Security (CCS), New York, NY, USA, Nov. 2010. ACM. [ bib | .pdf ]

Java applications often need to incorporate native-code components for efficiency and for reusing legacy code. However, it is well known that the use of native code defeats Java’s security model. We describe the design and implementation of Robusta, a complete framework that provides safety and security to native code in Java applications. Starting from software-based fault isolation (SFI), Robusta isolates native code into a sandbox where dynamic linking/loading of libraries is supported and unsafe system modification and confidentiality violations are prevented. It also mediates native system calls according to a security policy by connecting to Java’s security manager. Our prototype implementation of Robusta is based on Native Client and OpenJDK. Experiments in this prototype demonstrate Robusta is effective and efficient, with modest runtime overhead on a set of JNI benchmark programs. Robusta can be used to sandbox native libraries used in Java’s system classes to prevent attackers from exploiting bugs in the libraries. It can also enable trustworthy execution of mobile Java programs with native libraries. The design of Robusta should also be applicable when other type-safe languages (e.g., C#, Python) want to ensure safe interoperation with native libraries.

G. Malecha and G. Morrisett. Mechanized verification with sharing. In 7th International Colloquium on Theoretical Aspects of Computing, Sept. 2010. [ bib | .pdf ]

We consider software verification of imperative programs by theorem proving in higher-order separation logic. Of particular interest are the difficulties of encoding and reasoning about sharing and aliasing in pointer-based data structures. Both of these are difficulties for reasoning in separation logic because they rely, fundamentally, on non-separate heaps. We show how sharing can be achieved while preserving abstraction using mechanized reasoning about fractional permissions in Hoare type theory.
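Fractional permissions can be modeled as rationals; the sketch below shows only the accounting (split, join, and the read/write rules), not Ynot's actual mechanized interface, and all names are illustrative:

```haskell
-- A permission is a rational in (0, 1]; 1 is full ownership.
type Perm = Rational

full :: Perm
full = 1

-- Splitting a permission halves it; both halves still permit reads,
-- so two parties can share read access to the same heap cell.
split :: Perm -> (Perm, Perm)
split p = (p / 2, p / 2)

-- Joining the halves back restores the original permission.
join :: Perm -> Perm -> Perm
join = (+)

canWrite, canRead :: Perm -> Bool
canWrite p = p == 1
canRead p = p > 0
```

The point of the accounting is that writes require reassembling the full permission, which is exactly how sharing coexists with separation-logic-style ownership.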

G. Mainland and G. Morrisett. Nikola: Embedding compiled GPU functions in Haskell. In Proceedings of the 2010 ACM SIGPLAN Symposium on Haskell (Haskell'10), New York, NY, USA, Sept. 2010. ACM. [ bib | .pdf ]

We describe Nikola, a first-order language of array computations embedded in Haskell that compiles to GPUs via CUDA using a new set of type-directed techniques to support re-usable computations. Nikola automatically handles a range of low-level details for Haskell programmers, such as marshaling data to/from the GPU, size inference for buffers, memory management, and automatic loop parallelization. Additionally, Nikola supports both compile-time and run-time code generation, making it possible for programmers to choose when and where to specialize embedded programs.
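The essence of a first-order embedded array language can be shown in miniature: programs are data, so the library can inspect them and, in Nikola's case, emit CUDA. This sketch (with invented names) merely interprets them:

```haskell
-- A toy first-order expression language over one argument.
data Exp = Lit Float | Arg | Add Exp Exp | Mul Exp Exp

eval :: Float -> Exp -> Float
eval _ (Lit c)   = c
eval x Arg       = x
eval x (Add a b) = eval x a + eval x b
eval x (Mul a b) = eval x a * eval x b

-- The embedded analogue of 'map'; a real backend would compile the
-- body once and launch it over the whole array on the GPU.
mapE :: Exp -> [Float] -> [Float]
mapE body = map (\x -> eval x body)

-- \x -> 2*x + 1, written as an embedded term.
saxpyish :: [Float] -> [Float]
saxpyish = mapE (Add (Mul (Lit 2) Arg) (Lit 1))
```

Because `Exp` is first-order data rather than a Haskell function, the "when and where to specialize" choice the abstract mentions becomes a decision about when to traverse this term.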

G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Toward a verified relational database management system. In POPL '10: Proceedings of the 37th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 237-248, New York, NY, USA, 2010. ACM. [ bib | DOI | .pdf ]

We report on our experience implementing a lightweight, fully verified relational database management system (RDBMS). The functional specification of RDBMS behavior, RDBMS implementation, and proof that the implementation meets the specification are all written and verified in Coq. Our contributions include: (1) a complete specification of the relational algebra in Coq; (2) an efficient realization of that model (B+ trees) implemented with the Ynot extension to Coq; and (3) a set of simple query optimizations proven to respect both semantics and run-time cost. In addition to describing the design and implementation of these artifacts, we highlight the challenges we encountered formalizing them, including the choice of representation for finite relations of typed tuples and the challenges of reasoning about data structures with complex sharing. Our experience shows that though many challenges remain, building fully-verified systems software in Coq is within reach.
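For intuition, the relational-algebra core formalized in the paper can be read off against a naive list-based model. This is a sketch, not the Coq development; rows here are association lists from column name to value:

```haskell
import Data.List (nub)

type Row = [(String, String)]
type Rel = [Row]

-- Selection keeps the rows satisfying a predicate.
select :: (Row -> Bool) -> Rel -> Rel
select = filter

-- Projection keeps only the named columns (dropping duplicate rows).
project :: [String] -> Rel -> Rel
project cols = nub . map (\r -> [kv | kv@(k, _) <- r, k `elem` cols])

-- Natural join: combine rows that agree on all shared columns.
join :: Rel -> Rel -> Rel
join r s = [nub (a ++ b) | a <- r, b <- s, agrees a b]
  where agrees a b = and [v == v' | (k, v) <- a, (k', v') <- b, k == k']
```

The verified system replaces this quadratic model with B+ trees while proving the efficient implementation denotes the same relations.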

A. Chlipala, G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Effective interactive proofs for higher-order imperative programs. In ICFP '09: Proceedings of the 14th ACM SIGPLAN International Conference on Functional Programming, September 2009. [ bib | .pdf ]

We present a new approach for constructing and verifying higher-order, imperative programs using the Coq proof assistant. We build on the past work on the Ynot system, which is based on Hoare Type Theory. That original system was a proof of concept, where every program verification was accomplished via laborious manual proofs, with much code devoted to uninteresting low-level details. In this paper, we present a re-implementation of Ynot which makes it possible to implement fully-verified, higher-order imperative programs with reasonable proof burden. At the same time, our new system is implemented entirely in Coq source files, showcasing the versatility of that proof assistant as a platform for research on language design and verification. Both versions of the system have been evaluated with case studies in the verification of imperative data structures, such as hash tables with higher-order iterators. The verification burden in our new system is reduced by at least an order of magnitude compared to the old system, by replacing manual proof with automation. The core of the automation is a simplification procedure for implications in higher-order separation logic, with hooks that allow programmers to add domain-specific simplification rules. We argue for the effectiveness of our infrastructure by verifying a number of data structures and a packrat parser, and we compare to similar efforts within other projects. Compared to competing approaches to data structure verification, our system includes much less code that must be trusted; namely, about a hundred lines of Coq code defining a program logic. All of our theorems and decision procedures have or build machine-checkable correctness proofs from first principles, removing opportunities for tool bugs to create faulty verifications.

R. Wisnesky, G. Malecha, and G. Morrisett. Certified web services in Ynot. In 5th International Workshop on Automated Specification and Verification of Web Systems, July 2009. [ bib | .pdf ]

In this paper we demonstrate that it is possible to implement certified web systems in a way not much different from writing Standard ML or Haskell code, including use of imperative features like pointers, files, and socket I/O. We present a web-based course gradebook application developed with Ynot, a Coq library for certified imperative programming. We add a dialog-based I/O system to Ynot, and we extend Ynot’s underlying Hoare logic with event traces to reason about I/O behavior. Expressive abstractions allow the modular certification of both high level specifications like privacy guarantees and low level properties like memory safety and correct parsing.

G. Morrisett. Technical perspective: A compiler's story. Commun. ACM, 52(7):106-106, 2009. [ bib | DOI ]

A. Nanevski, P. Govereau, and G. Morrisett. Towards type-theoretic semantics for transactional concurrency. In ACM SIGPLAN Workshop on Types in Language Design and Implementation, January 2009. [ bib | .pdf ]

We propose a dependent type theory that integrates programming, specifications, and reasoning about higher-order concurrent programs with shared transactional memory. The design builds upon our previous work on Hoare Type Theory (HTT), which we extend with types that correspond to Hoare-style specifications for transactions. The types track shared and local state of the process separately, and enforce that shared state always satisfies a given invariant, except at specific critical sections which appear to execute atomically. Atomic sections may violate the invariant, but must restore it upon exit. HTT follows Separation Logic in providing tight specifications of space requirements. As a logic, we argue that HTT is sound and compositional. As a programming language, we define its operational semantics and show adequacy with respect to specifications.
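GHC's STM makes a convenient, if loose, analogy for these atomic sections: inside `atomically`, an invariant (here, conservation of the total balance) may be broken and restored without other threads observing the intermediate state. This is an illustration of the concept only, not HTT itself:

```haskell
import Control.Concurrent.STM

-- Shared state: two accounts whose invariant is a conserved total.
-- Between the two updates the invariant is broken, but 'atomically'
-- guarantees no other thread sees that intermediate state.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to n = do
  modifyTVar' from (subtract n)  -- invariant temporarily violated...
  modifyTVar' to (+ n)           -- ...and restored before commit

demo :: IO (Int, Int)
demo = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)
  (,) <$> readTVarIO a <*> readTVarIO b
```

HTT goes further than this runtime mechanism: the invariant is written into the type of the transaction, so "restores the invariant on exit" is checked statically.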

G. Mainland, G. Morrisett, and M. Welsh. Flask: Staged functional programming for sensor networks. In ICFP '08: Proceedings of the 13th ACM SIGPLAN International Conference on Functional Programming, September 2008. [ bib | .pdf ]

Severely resource-constrained devices present a confounding challenge to the functional programmer: we are used to having powerful abstraction facilities at our fingertips, but how can we make use of these tools on a device with an 8- or 16-bit CPU and at most tens of kilobytes of RAM? Motivated by this challenge, we have developed Flask, a domain-specific language embedded in Haskell that brings the power of functional programming to sensor networks, collections of highly resource-constrained devices. Flask consists of a staging mechanism that cleanly separates node-level code from the meta-language used to generate node-level code fragments; syntactic support for embedding standard sensor network code; a restricted subset of Haskell that runs on sensor networks and constrains program space and time consumption; a higher-level "data stream" combinator library for quickly constructing sensor network programs; and an extensible runtime that provides commonly-used services. We demonstrate Flask through several small code examples as well as a compiler that generates node-level code to execute a network-wide query specified in a SQL-like language. We show how using Flask ensures constraints on space and time behavior. Through microbenchmarks and measurements on physical hardware, we demonstrate that Flask produces programs that are efficient in terms of CPU and memory usage and that can run effectively on existing sensor network hardware.
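The flavor of such a "data stream" combinator layer can be suggested with streams-as-lists. This is a cartoon only: Flask stages combinators like these into node-level code rather than running them directly, and the names below are invented:

```haskell
-- Streams modeled as (finite) lists of samples.
type Stream a = [a]

smap :: (a -> b) -> Stream a -> Stream b
smap = map

sfilter :: (a -> Bool) -> Stream a -> Stream a
sfilter = filter

-- Sliding-window average, a typical sensor-network operator.
swindow :: Int -> Stream Double -> Stream Double
swindow n xs
  | length xs < n = []
  | otherwise     = avg (take n xs) : swindow n (tail xs)
  where avg ys = sum ys / fromIntegral (length ys)
```

A staged compiler's task is to turn a pipeline built from such combinators into a single event-driven loop that fits in a few kilobytes of RAM.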

A. Nanevski, G. Morrisett, A. Shinnar, P. Govereau, and L. Birkedal. Ynot: Dependent types for imperative programs. In ICFP '08: Proceedings of the 13th ACM SIGPLAN International Conference on Functional Programming, September 2008. [ bib | .pdf ]

We describe an axiomatic extension to the Coq proof assistant, that supports writing, reasoning about, and extracting higher-order, dependently-typed programs with side-effects. Coq already includes a powerful functional language that supports dependent types, but that language is limited to pure, total functions. The key contribution of our extension, which we call Ynot, is the added support for computations that may have effects such as non-termination, accessing a mutable store, and throwing/catching exceptions.

The axioms of Ynot form a small trusted computing base which has been formally justified in our previous work on Hoare Type Theory (HTT). We show how these axioms can be combined with the powerful type and abstraction mechanisms of Coq to build higher-level reasoning mechanisms which in turn can be used to build realistic, verified software components. To substantiate this claim, we describe here a representative series of modules that implement imperative finite maps, including support for a higher-order (effectful) iterator. The implementations range from simple (e.g., association lists) to complex (e.g., hash tables) but share a common interface which abstracts the implementation details and ensures that the modules properly implement the finite map abstraction.
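The shared finite-map interface the abstract describes can be transcribed loosely as a Haskell type class; Ynot's version also carries the Hoare-style specifications in the types, and the names below are illustrative:

```haskell
-- A common interface that both simple and complex implementations share.
class FiniteMap f where
  empty    :: f k v
  insert   :: Eq k => k -> v -> f k v -> f k v
  lookupFM :: Eq k => k -> f k v -> Maybe v

-- The simplest implementation from the paper's range: association lists.
newtype AssocList k v = AssocList [(k, v)]

instance FiniteMap AssocList where
  empty = AssocList []
  insert k v (AssocList kvs) = AssocList ((k, v) : filter ((/= k) . fst) kvs)
  lookupFM k (AssocList kvs) = lookup k kvs
```

A hash-table implementation would provide the same three operations behind the same interface; in Ynot, the interface additionally forces each implementation to prove it realizes the finite-map abstraction.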

R. L. Petersen, L. Birkedal, A. Nanevski, and G. Morrisett. A realizability model for impredicative Hoare type theory. In 17th European Symposium on Programming, ESOP 2008, volume 4960 of Lecture Notes in Computer Science, pages 337-352. Springer, 2008. [ bib | .pdf ]

We present a denotational model of impredicative Hoare Type Theory, a very expressive dependent type theory in which one can specify and reason about mutable abstract data types. The model ensures soundness of the extension of Hoare Type Theory with impredicative polymorphism, makes the connections to separation logic clear, and provides a basis for investigating further sound extensions of the theory, in particular equations between computations and types.

R. R. Newton, L. D. Girod, M. B. Craig, S. R. Madden, and J. G. Morrisett. Design and evaluation of a compiler for embedded stream programs. In LCTES '08: Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems, pages 131-140, New York, NY, USA, 2008. ACM. [ bib | .pdf ]

Applications that combine live data streams with embedded, parallel, and distributed processing are becoming more commonplace. WaveScript is a domain-specific language that brings high-level, type-safe, garbage-collected programming to these domains. This is made possible by three primary implementation techniques, each of which leverages characteristics of the streaming domain. First, we employ a novel evaluation strategy that uses a combination of interpretation and reification to partially evaluate programs into stream dataflow graphs. Second, we use profile-driven compilation to enable many optimizations that are normally only available in the synchronous (rather than asynchronous) dataflow domain. Finally, we incorporate an extensible system for rewrite rules to capture algebraic properties in specific domains (such as signal processing).

We have used our language to build and deploy a sensor network for the acoustic localization of wild animals, in particular the yellow-bellied marmot. We evaluate WaveScript's performance on this application, showing that it yields good performance on both embedded and desktop-class machines, including distributed execution and substantial parallel speedups. Our language allowed us to implement the application rapidly, while outperforming a previous C implementation by over 35%, using fewer than half the lines of code. We evaluate the contribution of our optimizations to this success.

A. Nanevski, P. Govereau, and G. Morrisett. Type-theoretic semantics for transactional concurrency. Technical Report TR-09-07, Harvard University, Cambridge, MA, USA, July 2007. [ bib | .pdf ]

We propose a dependent type theory that combines programming, specifications, and reasoning about higher-order concurrent programs with shared higher-order transactional memory. We build on our previous work on Hoare Type Theory (HTT), which is extended here with types that correspond to Hoare-style specifications for transactions. The new types have the form CMD{I}{P}x:A{Q}, and classify concurrent programs that may execute in a shared state with invariant I, and local state precondition P. Upon termination, such programs return a result of type A, and local state changed according to the postcondition Q. Throughout the execution, shared state always satisfies the invariant I, except at specific critical sections which are executed atomically; that is, sequentially, without interference from other processes. Atomic sections may violate the invariant, but must restore it upon exit. We follow the idea of Separation Logic, and adopt "small footprint" specifications, whereby each process is associated with a precondition that tightly describes its state requirement. HTT is a program logic and a dependently typed programming language at the same time. To consider it as a logic, we prove that it is sound and compositional. To consider it as a programming language, we define its operational semantics, and show adequacy with respect to the specifications.

A. Nanevski, A. Ahmed, G. Morrisett, and L. Birkedal. Abstract predicates and mutable ADTs in Hoare type theory. In 16th European Symposium on Programming, ESOP 2007, volume 4421 of Lecture Notes in Computer Science, pages 189-204. Springer, 2007. [ bib | .pdf ]

Hoare Type Theory (HTT) combines a dependently typed, higher-order language with monadically-encapsulated, stateful computations. The type system incorporates pre- and post-conditions, in a fashion similar to Hoare and Separation Logic, so that programmers can modularly specify the requirements and effects of computations within types. This paper extends HTT with quantification over abstract predicates (i.e., higher-order logic), thus embedding into HTT the Extended Calculus of Constructions. When combined with the Hoare-like specifications, abstract predicates provide a powerful way to define and encapsulate the invariants of private state that may be shared by several functions, but is not accessible to their clients. We demonstrate this power by sketching a number of abstract data types that demand ownership of mutable memory, including an idealized custom memory manager.

G. Mainland, G. Morrisett, M. Welsh, and R. Newton. Sensor network programming with Flask. In SenSys '07: Proceedings of the 5th International Conference on Embedded networked sensor systems, pages 385-386, New York, NY, USA, 2007. ACM. [ bib ]

A great deal of recent work has investigated new programming abstractions and models for sensor networks. However, the complexity of such systems demands a great deal of effort to develop appropriate compilers and runtime platforms to achieve good performance. We will demonstrate Flask, a new programming platform for sensor networks that decouples the design of a high-level programming environment from the low-level details of generating per-node code and an efficient runtime system.

R. Newton, G. Morrisett, and M. Welsh. The Regiment macroprogramming system. In IPSN '07: Proceedings of the 6th International Conference on Information processing in sensor networks, pages 489-498, New York, NY, USA, 2007. ACM. [ bib | .pdf ]

The development of high-level programming environments is essential if wireless sensor networks are to be accessible to non-experts. In this paper, we present the Regiment system, which consists of a high-level language for spatiotemporal macroprogramming, along with a compiler that translates global programs into node-level code. In Regiment, the programmer views the network as a set of spatially-distributed data streams. The programmer can manipulate sets of these streams that may be defined by topological or geographic relationships between nodes. Regiment provides a rich set of primitives for processing data on individual streams, manipulating regions, performing aggregation over a region, and triggering new computation within the network.

In this paper, we describe the design and implementation of the Regiment language and compiler. We describe the deglobalization process that compiles a network-wide representation of the program into a node-level, event-driven program. Deglobalization maps region operations onto associated spanning trees that establish region membership and permit efficient in-network aggregation. We evaluate Regiment in the context of a complex distributed application involving rapid detection of spatially-distributed events, such as wildfires or chemical plumes. Our results show that Regiment makes it possible to develop complex sensor network applications at a global level.

G. Tan and G. Morrisett. Ilea: inter-language analysis across Java and C. In OOPSLA '07: Proceedings of the 22nd annual ACM SIGPLAN conference on Object oriented programming systems and applications, pages 39-56, New York, NY, USA, 2007. ACM. [ bib | .pdf ]

Java bug finders perform static analysis to find implementation mistakes that can lead to exploits and failures; Java compilers perform static analysis for optimization. If Java programs contain foreign function calls to C libraries, however, static analysis is forced to make either optimistic or pessimistic assumptions about the foreign function calls, since models of the C libraries are typically not available.

We propose ILEA (which stands for Inter-LanguagE Analysis), a framework that enables existing Java analyses to understand the behavior of C code. Our framework includes: (1) a novel specification language, which extends the Java Virtual Machine Language (JVML) with a few primitives that approximate the effects that the C code might have; (2) an automatic specification extractor, which builds models of the C code. Compared to other possible specification languages, ours is expressive yet facilitates the construction of automatic specification extractors. Furthermore, because the specification language is based on the JVML, existing Java analyses can easily be migrated to utilize specifications in the language. We also demonstrate the utility of the generated specifications by modifying an existing non-null analysis to identify null-related bugs in Java applications that contain C libraries. Our preliminary experiments identified dozens of null-related bugs.

A. Ahmed, M. Fluet, and G. Morrisett. L3: A linear language with locations. Fundam. Inf., 77(4):397-449, 2007. [ bib | .pdf ]

We present a simple, but expressive type system that supports strong updates - updating a memory cell to hold values of unrelated types at different points in time. Our formulation is based upon a standard linear lambda calculus and, as a result, enjoys a simple semantic interpretation for types that is closely related to models for spatial logics. The typing interpretation is strong enough that, in spite of the fact that our core programming language supports shared, mutable references and cyclic graphs, every well-typed program terminates.

We then consider extensions needed to model ML-style references, where the capability to access a reference cell is unrestricted, but strong updates are disallowed. Our extensions include a thaw primitive for re-gaining the capability to perform strong updates on unrestricted references. The thaw primitive is closely related to other mechanisms that support strong updates, such as CQUAL's restrict.

G. Mainland, M. Welsh, and G. Morrisett. Flask: A language for data-driven sensor network programs. Technical Report TR-13-06, Harvard University, May 2006. [ bib | .pdf ]

In this paper, we present Flask, a new programming language for sensor networks that is focused on providing an easy-to-use dataflow programming model. In Flask, programmers build applications by composing chains of operators into a dataflow graph that may reside on individual nodes or span multiple nodes in the network. To compose dataflow graphs across sensor nodes, Flask supports a lean, general-purpose communication abstraction, called Flows, that provides publish/subscribe semantics over efficient routing trees. At the heart of Flask is a programmatic wiring language, based on the functional language OCaml [13]. Flask’s wiring language allows dataflow graphs to be synthesized programmatically. The Flask wiring program is interpreted at compile time to generate a sensor node program in NesC, which is then compiled to a binary. Our design of Flask makes three main contributions. First, Flask allows the programmer to specify distributed dataflow applications in a high-level language while retaining the efficiency of compiled binaries and full access to TinyOS components. Second, Flask provides a unified framework for distributing dataflow applications across the network, allowing programmers to focus on application logic rather than details of routing code. Finally, Flask’s programmatic wiring language enables rich composition of dataflow operators, making it possible to develop higher-level programming models or languages directly in Flask. In this paper, we describe the design and implementation of the Flask language, its runtime system, the Flows communication interface, and a compiler that produces NesC code. We evaluate Flask through two motivating applications: a distributed detector of seismic activity (e.g., for studying earthquakes), and an implementation of the TinyDB query language built using Flask, showing that using Flask considerably reduces code complexity and memory size while achieving high runtime efficiency.

M. Fluet, G. Morrisett, and A. Ahmed. Linear regions are all you need. In ESOP'06: Proceedings of the Fifteenth European Symposium on Programming, pages 7-21. Springer-Verlag, March 2006. [ bib | .pdf ]

The type-and-effects system of the Tofte-Talpin region calculus makes it possible to safely reclaim objects without a garbage collector. However, it requires that regions have last-in-first-out (LIFO) lifetimes following the block structure of the language. We introduce rgnUL, a core calculus that is powerful enough to encode Tofte-Talpin-like languages, and that eliminates the LIFO restriction. The target language has an extremely simple, sub-structural type system. To prove the power of the language, we sketch how Tofte-Talpin-style regions, as well as the first-class dynamic regions and unique pointers of the Cyclone programming language can be encoded in rgnUL.

K. W. Hamlen, G. Morrisett, and F. B. Schneider. Certified in-lined reference monitoring on .NET. In PLAS '06: Proceedings of the 2006 workshop on Programming languages and analysis for security, pages 7-16, New York, NY, USA, 2006. ACM. [ bib | .pdf ]

MOBILE is an extension of the .NET Common Intermediate Language that supports certified In-Lined Reference Monitoring. Mobile programs have the useful property that if they are well-typed with respect to a declared security policy, then they are guaranteed not to violate that security policy when executed. Thus, when an In-Lined Reference Monitor (IRM) is expressed in Mobile, it can be certified by a simple type-checker to eliminate the need to trust the producer of the IRM. Security policies in Mobile are declarative, can involve unbounded collections of objects allocated at runtime, and can regard infinite-length histories of security events exhibited by those objects. The prototype Mobile implementation enforces properties expressed by finite-state security automata - one automaton for each security-relevant object - and can type-check Mobile programs in the presence of exceptions, finalizers, concurrency, and non-termination. Executing Mobile programs requires no change to existing .NET virtual machine implementations, since Mobile programs consist of normal managed CIL code with extra typing annotations stored in .NET attributes.
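A finite-state security automaton of the kind such IRMs enforce per object is easy to sketch. The policy below ("no send after reading a secret") is a textbook example, not one from the paper's prototype:

```haskell
data Event = ReadSecret | Send deriving (Eq, Show)
data State = Clean | Tainted | Violation deriving (Eq, Show)

-- The automaton's transition function.
step :: State -> Event -> State
step Clean ReadSecret = Tainted
step Tainted Send     = Violation
step Violation _      = Violation
step s _              = s

-- An inlined monitor runs the automaton alongside the program;
-- a trace is allowed iff it never reaches Violation.
allowed :: [Event] -> Bool
allowed = (/= Violation) . foldl step Clean
```

Certification then amounts to type-checking that the rewritten program steps this automaton before every security-relevant event, rather than trusting the rewriter that inserted those steps.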

N. Swamy, M. Hicks, G. Morrisett, D. Grossman, and T. Jim. Safe manual memory management in Cyclone. Sci. Comput. Program., 62(2):122-144, 2006. [ bib ]

The goal of the Cyclone project is to investigate how to make a low-level C-like language safe. Our most difficult challenge has been providing programmers with control over memory management while retaining safety. This paper describes our experience trying to integrate and use effectively two previously-proposed, safe memory-management mechanisms: statically-scoped regions and tracked pointers. We found that these typing mechanisms can be combined to build alternative memory-management abstractions, such as reference counted objects and arenas with dynamic lifetimes, and thus provide a flexible basis. Our experience - porting C programs and device drivers, and building new applications for resource-constrained systems - confirms that experts can use these features to improve memory footprint and sometimes to improve throughput when used instead of, or in combination with, conservative garbage collection.

M. Fluet and G. Morrisett. Monadic regions. J. Funct. Program., 16(4-5):485-545, 2006. [ bib | .pdf ]

Region-based type systems provide programmer control over memory management without sacrificing type-safety. However, the type systems for region-based languages, such as the ML-Kit or Cyclone, are relatively complicated, and proving their soundness is non-trivial. This paper shows that the complication is in principle unnecessary. In particular, we show that plain old parametric polymorphism, as found in Haskell, is all that is needed. We substantiate this claim by giving a type- and meaning-preserving translation from a variation of the region calculus of Tofte and Talpin to a monadic variant of System F with region primitives whose types and operations are inspired by (and generalize) the ST monad of Launchbury and Peyton Jones.

A. Nanevski, G. Morrisett, and L. Birkedal. Polymorphism and separation in Hoare type theory. In ICFP '06: Proceedings of the eleventh ACM SIGPLAN International Conference on Functional programming, pages 62-73, New York, NY, USA, 2006. ACM. [ bib | .pdf ]

In previous work, we proposed a Hoare Type Theory (HTT) which combines effectful higher-order functions, dependent types and Hoare Logic specifications into a unified framework. However, the framework did not support polymorphism, and ailed to provide a modular treatment of state in specifications. In this paper, we address these shortcomings by showing that the addition of polymorphism alone is sufficient for capturing modular state specifications in the style of Separation Logic. Furthermore, we argue that polymorphism is an essential ingredient of the extension, as the treatment of higher-order functions requires operations not encodable via the spatial connectives of Separation Logic.

S. McCamant and G. Morrisett. Evaluating sfi for a cisc architecture. In Proceedings of the 15th conference on USENIX Security Symposium - Volume 15, USENIX-SS'06, Berkeley, CA, USA, 2006. USENIX Association. [ bib | http | .pdf ]

Executing untrusted code while preserving security requires that the code be prevented from modifying memory or executing instructions except as explicitly allowed. Software-based fault isolation (SFI) or "sandboxing" enforces such a policy by rewriting the untrusted code at the instruction level. However, the original sandboxing technique of Wahbe et al. is applicable only to RISC architectures, and most other previous work is either insecure, or has been not described in enough detail to give confidence in its security properties. We present a new sandboxing technique that can be applied to a CISC architecture like the IA-32, and whose application can be checked at load-time to minimize the TCB. We describe an implementation which provides a robust security guarantee and has low runtime overheads (an average of 21% on the SPECint2000 benchmarks). We evaluate the utility of the technique by applying it to untrusted decompression modules in an archive tool, and its safety by constructing a machine-checked proof that any program approved by the verification algorithm will respect the desired safety property.

K. W. Hamlen, G. Morrisett, and F. B. Schneider. Computability classes for enforcement mechanisms. ACM Trans. Program. Lang. Syst., 28(1):175-205, 2006. [ bib | .pdf ]

A precise characterization of those security policies enforceable by program rewriting is given. This also exposes and rectifies problems in prior work, yielding a better characterization of those security policies enforceable by execution monitors as well as a taxonomy of enforceable security policies. Some but not all classes can be identified with known classes from computational complexity theory.

S. McCamant and G. Morrisett. Efficient, verifiable sandboxing for a CISC architecture. Technical Report MIT-LCS-TR-988, MIT Laboratory for Computer Science, Cambridge, MA, USA, May 2005. [ bib | .pdf ]

Executing untrusted code while preserving security requires enforcement of memory and control-flow safety policies: untrusted code must be prevented from modifying memory or executing code except as explicitly allowed. Software-based fault isolation (SFI) or “sandboxing” enforces those policies by rewriting the untrusted code at the level of individual instructions. However, the original sandboxing technique of Wahbe et al. is applicable only to RISC architectures, and other previous work is either insecure, or has been not described in enough detail to give confidence in its security properties. We present a novel technique that allows sandboxing to be easily applied to a CISC architecture like the IA-32. The technique can be verified to have been applied at load time, so that neither the rewriting tool nor the compiler needs to be trusted. We describe a prototype implementation which provides a robust security guarantee, is scalable to programs of any size, and has low runtime overheads. Further, we give a machine-checked proof that any program approved by the verification algorithm is guaranteed to respect the desired safety property.

A. Nanevski and G. Morrisett. Dependent type theory of stateful higher-order functions. Technical Report TR-24-05, Harvard University, Cambridge, MA, USA, January 2005. [ bib | .pdf ]

In this paper we investigate a logic for reasoning about programs with higher-order functions and effectful features like non-termination and state with aliasing. We propose a dependent type theory HTT (short for Hoare Type Theory), where types serve as program specifications. In case of effectful programs, the type of Hoare triples {P}x:A{Q} specifies the precondition P, the type of the return result A, and the postcondition Q. By Curry-Howard isomorphism, a dependent type theory may be viewed as a functional programming language. From this perspective, the type of Hoare triples is a monad, and HTT is a monadic language, whose pure fragment consists of higher-order functions, while the effectful fragment is a full Turingcomplete imperative language with conditionals, loops, recursion and commands for stateful operations like allocation, lookup and mutation of location content.

D. Grossman, M. Hicks, T. Jim, and G. Morrisett. Cyclone: A type-safe dialect of C. C/C++ User's Journal, 23(1), January 2005. [ bib | .pdf ]

A. Ahmed, M. Fluet, and G. Morrisett. A step-indexed model of substructural state. In ICFP '05: Proceedings of the tenth ACM SIGPLAN International Conference on Functional programming, pages 78-91, New York, NY, USA, 2005. ACM. [ bib | .pdf ]

The concept of a "unique" object arises in many emerging programming languages such as Clean, CQual, Cyclone, TAL, and Vault. In each of these systems, unique objects make it possible to perform operations that would otherwise be prohibited (e.g., deallocating an object) or to ensure that some obligation will be met (e.g., an opened file will be closed). However, different languages provide different interpretations of "uniqueness" and have different rules regarding how unique objects interact with the rest of the language.Our goal is to establish a common model that supports each of these languages, by allowing us to encode and study the interactions of the different forms of uniqueness. The model we provide is based on a substructural variant of the polymorphic λ-calculus, augmented with four kinds of mutable references: unrestricted, relevant, affine, and linear. The language has a natural operational semantics that supports deallocation of references, strong (type-varying) updates, and storage of unique objects in shared references. We establish the strong soundness of the type system by constructing a novel, semantic interpretation of the types.

M. Abadi, G. Morrisett, and A. Sabelfeld. “language-based security”. J. Funct. Program., 15(2):129-129, 2005. [ bib ]

Concepts and techniques from modern programming languages have much to offer to the security of computer systems. This special issue is devoted to research on those concepts and techniques. Over 60 active researchers working in this area were invited to contribute. In particular, a number of the participants of the Dagstuhl Seminar on Language-Based Security were encouraged to submit. Submitted articles were reviewed by 3–4 referees. As a result of the reviewing process, five articles were selected for inclusion in the special issue.

M. Hicks, G. Morrisett, D. Grossman, and T. Jim. Experience with safe manual memory-management in Cyclone. In ISMM '04: Proceedings of the 4th international symposium on Memory management, pages 73-84, New York, NY, USA, 2004. ACM. [ bib | .pdf ]

The goal of the Cyclone project is to investigate type safety for low-level languages such as C. Our most difficult challenge has been providing programmers control over memory management while retaining type safety. This paper reports on our experience trying to integrate and effectively use two previously proposed, type-safe memory management mechanisms: statically-scoped regions and unique pointers. We found that these typing mechanisms can be combined to build alternative memory-management abstractions, such as reference counted objects and arenas with dynamic lifetimes, and thus provide a flexible basis. Our experience-porting C programs and building new applications for resource-constrained systems-confirms that experts can use these features to improve memory footprint and sometimes to improve throughput when used instead of, or in combination with, conservative garbage collection.

M. Fluet and G. Morrisett. Monadic regions. In ICFP '04: Proceedings of the ninth ACM SIGPLAN International Conference on Functional programming, pages 103-114, New York, NY, USA, 2004. ACM. [ bib | .pdf ]

Region-based type systems provide programmer control over memory management without sacrificing type-safety. However, the type systems for region-based languages, such as the ML-Kit or Cyclone, are relatively complicated, so proving their soundness is non-trivial. This paper shows that the complication is in principle unnecessary. In particular, we show that plain old parametric polymorphism, as found in Haskell, is all that is needed. We substantiate this claim by giving a type- and meaning-preserving translation from a region-based language based on core Cyclone to a monadic variant of System F with region primitives whose types and operations are inspired by (and generalize) the ST monad.

G. Morrisett. Invited talk: what's the future for proof-carrying code? In PPDP '04: Proceedings of the 6th ACM SIGPLAN International Conference on Principles and practice of declarative programming, pages 5-5, New York, NY, USA, 2004. ACM. [ bib ]

Proof-carrying code (PCC) was introduced by George Necula and Peter Lee in 1996. The principle is simple: we can eliminate the need to trust code by forcing the producer to give us a formal, machine-checkable proof that the code won't exhibit some "bad behavior" when executed. Thus, instead of having to perform a complicated (and thus un-trustworthy) analysis to determine whether or not code is bad, we can instead use a simple (and thus trustworthy) proof checker.The attraction to systems people was that the PCC framework placed no inherent limits on good code. As long as you could manufacture a proof that the code wasn't bad, then the code would be accepted. So, at least in principle, you wouldn't have to pay a performance penalty for safety. Over the past eight years, many researchers have worked to make PCC a reality. But I would argue that we are still very far from reaping the benefits that the framework promises. Good progress has been made in some areas, but there are a number of hard problems that remain. The hardest conceptual questions are (a) "What policies should we enforce?" and (b) "How does the code producer generate a proof?

J. Cheney and G. Morrisett. A linearly typed assembly language. Technical report, Cornell University, Ithaca, NY, USA, February 2003. [ bib | .pdf ]

Today’s type-safe low-level languages rely on garbage collection to recycle heap-allocated objects safely. We present LTAL, a safe, low-level, yet simple language that “stands on its own”: it guarantees safe execution within a fixed memory space, without relying on external run-time support. We demonstrate the expressiveness of LTAL by giving a type-preserving compiler for the functional core of ML. But this independence comes at a steep price: LTAL’s type system imposes a draconian discipline of linearity that ensures that memory can be reused safely, but prohibits any useful kind of sharing. We present the results of experiments with a prototype LTAL system that show just how high the price of linearity can be.

F. Smith, D. Grossman, G. Morrisett, L. Hornof, and T. Jim. Compiling for template-based run-time code generation. J. Funct. Program., 13(3):677-708, 2003. [ bib | .pdf ]

Cyclone is a type-safe programming language that provides explicit run-time code generation. The Cyclone compiler uses a template-based strategy for run-time code generation in which pre-compiled code fragments are stitched together at run time. This strategy keeps the cost of code generation low, but it requires that optimizations, such as register allocation and code motion, are applied to templates at compile time. This paper describes a principled approach to implementing such optimizations. In particular, we generalize standard flow-graph intermediate representations to support templates, define a mapping from (a subset of) Cyclone to this representation, and describe a dataflow-analysis framework that supports standard optimizations across template boundaries.

K. Crary, S. Weirich, and G. Morrisett. Intensional polymorphism in type-erasure semantics. J. Funct. Program., 12(6):567-600, 2002. [ bib ]

Intensional polymorphism, the ability to dispatch to different routines based on types at run time, enables a variety of advanced implementation techniques for polymorphic languages, including tag-free garbage collection, unboxed function arguments, polymorphic marshalling and attened data structures. To date, languages that support intensional polymorphism have required a type-passing (as opposed to type-erasure) interpretation where types are constructed and passed to polymorphic functions at run time. Unfortunately, type-passing suffers from a number of drawbacks: it requires duplication of run-time constructs at the term and type levels, it prevents abstraction, and it severely complicates polymorphic closure conversion. We present a type-theoretic framework that supports intensional polymorphism, but avoids many of the disadvantages of type passing. In our approach, run-time type information is represented by ordinary terms. This avoids the duplication problem, allows us to recover abstraction, and avoids complications with closure conversion. In addition, our type system provides another improvement in expressiveness; it allows unknown types to be refined in place, thereby avoiding certain beta-expansions required by other frameworks.

G. Morrisett, K. Crary, N. Glew, and D. Walker. Stack-based typed assembly language. J. Funct. Program., 12(1):43-88, 2002. [ bib ]

This paper presents STAL, a variant of Typed Assembly Language with constructs and types to support a limited form of stack allocation. As with other statically-typed low-level languages, the type system of STAL ensures that a wide class of errors cannot occur at run time, and therefore the language can be adapted for use in certifying compilers where security is a concern. Like the Java Virtual Machine Language (JVML), STAL supports stack allocation of local variables and procedure activation records, but unlike the JVML, STAL does not pre-suppose fixed notions of procedures, exceptions, or calling conventions. Rather, compiler writers can choose encodings for these high-level constructs using the more primitive RISC-like mechanisms of STAL. Consequently, some important optimizations that are impossible to perform within the JVML, such as tail call elimination or callee-saves registers, can be easily expressed within STAL.

D. Grossman, G. Morrisett, T. Jim, M. Hicks, Y. Wang, and J. Cheney. Region-based memory management in Cyclone. In PLDI '02: Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation, pages 282-293, New York, NY, USA, 2002. ACM. [ bib | .pdf ]

Cyclone is a type-safe programming language derived from C. The primary design goal of Cyclone is to let programmers control data representation and memory management without sacrificing type-safety. In this paper, we focus on the region-based memory management of Cyclone and its static typing discipline. The design incorporates several advancements, including support for region subtyping and a coherent integration with stack allocation and a garbage collector. To support separate compilation, Cyclone requires programmers to write some explicit region annotations, but a combination of default annotations, local type inference, and a novel treatment of region effects reduces this burden. As a result, we integrate C idioms in a region-based framework. In our experience, porting legacy C to Cyclone has required altering about 8% of the code; of the changes, only 6% (of the 8%) were region annotations.

G. Morrisett. Analysis issues for cyclone. In PASTE '02: Proceedings of the 2002 ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering, pages 26-26, New York, NY, USA, 2002. ACM. [ bib ]

Cyclone is an experimental, type-safe programming language based upon the syntax, semantics, and spirit of C. The primary goal of the language is to provide a type-safe environment that is close enough to C in both appearance and functionality, that systems programmers will find it attractive and useful.The most challenging aspect of the design is capturing the spirit of C without compromising type-safety. In particular, systems programmers expect to have good control over data representations, memory management, and performance. Yet, these features are usually absent from high-level, type-safe languages (e.g., Java). Another challenge is validating a sufficiently wide set of idioms that are in fact type-safe, but which conventional type systems reject.To address these issues, we have used a novel combination of typing features in conjunction with some interesting inference and dataflow techniques. The most novel typing feature is the support for region-based memory management which was summarized in an earlier paper. However, this paper did not discuss the inference techniques we use to validate the regions and effects.In this talk, I will briefly summarize the Cyclone type system and then focus on the analysis issues that arise in its implementation, including (a) kind and type inference, (b) region and effect inference, and (c) dataflow analysis for validating initialization, array subscripts, and linear pointers.

J. G. Morrisett. Type checking systems code. In ESOP '02: Proceedings of the 11th European Symposium on Programming Languages and Systems, pages 1-5, London, UK, 2002. Springer-Verlag. [ bib | .pdf ]

Our critical computing systems are coded in low-level, typeunsafe languages such as C, and it is unlikely that they will be re-coded in a high-level, type-safe language such as Java. This invited talk discusses some approaches that show promise in achieving type safety for legacy C code.

T. Jim, J. G. Morrisett, D. Grossman, M. W. Hicks, J. Cheney, and Y. Wang. Cyclone: A safe dialect of C. In ATEC '02: Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference, pages 275-288, Berkeley, CA, USA, 2002. USENIX Association. [ bib | .pdf ]

Cyclone is a safe dialect of C. It has been designed from the ground up to prevent the buffer overflows, format string attacks, and memory management errors that are common in C programs, while retaining C’s syntax and semantics. This paper examines safety violations enabled by C’s design, and shows how Cyclone avoids them, without giving up C’s hallmark control over low-level details such as data representation and memory management.

D. Walker and J. G. Morrisett. Alias types for recursive data structures. In TIC '00: Selected papers from the Third International Workshop on Types in Compilation, pages 177-206, London, UK, 2001. Springer-Verlag. [ bib | .pdf ]

Linear type systems permit programmers to deallocate or explicitly recycle memory, but they are severly restricted by the fact that they admit no aliasing. This paper describes a pseudo-linear type system that allows a degree of alias- ing and memory reuse as well as the ability to de ne com- plex recursive data structures. Our type system can encode conventional linear data structures such as linear lists and trees as well as more sophisticated data structures including cyclic and doubly-linked lists and trees. In the latter cases, our type system is expressive enough to represent pointer aliasing and yet safely permit destructive operations such as object deallocation. We demonstrate the flexibility of our type system by encoding two common compiler optimiza- tions: destination-passing style and Deutsch-Schorr-Waite or "link-reversal" traversal algorithms.

F. B. Schneider, J. G. Morrisett, and R. Harper. A language-based approach to security. In Informatics - 10 Years Back. 10 Years Ahead., pages 86-101, London, UK, 2001. Springer-Verlag. [ bib | .ps ]

Language-based security leverages program analysis and pro- gram rewriting to enforce security policies. The approach promises effi- cient enforcement of fine-grained access control policies and depends on a trusted computing base of only modest size. This paper surveys progress and prospects for the area, giving overviews of in-lined reference moni- tors, certifying compilers, and advances in type theory.

D. Grossman and J. G. Morrisett. Scalable certification for typed assembly language. In TIC '00: Selected papers from the Third International Workshop on Types in Compilation, pages 117-146, London, UK, 2001. Springer-Verlag. [ bib | .pdf ]

Linear type systems permit programmers to deallocate or explicitly recycle memory, but they are severely restricted by the fact that they admit no aliasing. This paper describes a pseudo-linear type system that allows a degree of aliasing and memory reuse as well as the ability to define complex recursive data structures. Our type system can encode conventional linear data structures such as linear lists and trees as well as more sophisticated data structures including cyclic and doubly-linked lists and trees. In the latter cases, our type system is expressive enough to represent pointer aliasing and yet safely permit destructive operations such as object deallocation. We demonstrate the flexibility of our type system by encoding two common compiler optimizations: destination-passing style and Deutsch-Schorr-Waite or "link-reversal” traversal algorithms.

D. Grossman, G. Morrisett, T. Jim, M. Hicks, Y. Wang, and J. Cheney. Formal type soundness for Cyclone's region system. Technical report, Cornell University, Ithaca, NY, USA, 2001. [ bib ]

Cyclone is a polymorphic, type-safe programming language derived from C. The primary design goals of Cyclone are to let programmers control data representations and memory management without sacrificing type-safety. In this paper, we focus on the region-based memory management of Cyclone and its static typing discipline. The design incorporates several advancements, including support for region subtyping and a coherent integration with stack allocation and a garbage collector. To support separate compilation, Cyclone requires programmers to write some explicit region annotations, but uses a combination of default annotations, local type inference, and a novel treatment of region effects to reduce this burden. As a result, we integrate C idioms in a region-based framework. In our experience, porting legacy C to Cyclone has required altering about 8% of the code; of the changes, only 6% (of the 8%) were region annotations. This technical report is really two documents in one: The first part is a paper submitted for publication in November, 2001. The second part is the full formal language and type-safety proof mentioned briefly in the first part. If you have already read a version of, “Region-Based Memory Management in Cyclone”, then you should proceed directly to Section 9.

D. Grossman, G. Morrisett, T. Jim, M. Hicks, Y. Wang, and J. Cheney. Cyclone user's manual, version 0.1.3. Technical report, Cornell University, Ithaca, NY, USA, 2001. [ bib ]

D. Walker, K. Crary, and G. Morrisett. Typed memory management via static capabilities. ACM Trans. Program. Lang. Syst., 22(4):701-771, 2000. [ bib ]

Region-based memory management is an alternative to standard tracing garbage collection that makes operation such as memory deallocation explicit but verifiably safe. In this article, we present a new compiler intermediate language, called the Capability Language (CL), that supports region-based memory management and enjoys a provably safe type systems. Unlike previous region-based type system, region lifetimes need not be lexically scoped, and yet the language may be checked for safety without complex analyses. Therefore, our type system may be deployed in settings such as extensible operating systems where both the performance and safety of untrusted code is important. The central novelty of the language is the use of static capabilities to specify the permissibility of various operations, such as memory access and deallocation. In order to ensure capabilities are relinquished properly, the type system tracks aliasing information using a form of bounded quantification. Moreover, unlike previous work on region-based type systems, the proof of soundness of our type system is relatively simple, employing only standard syntactic techniques. In order to show how our language may be used in practice, we show how to translate a variant of Tofte and Talpin's high-level type-and-effects system for region-based memory management into our language. When combined with known region inference algorithms, this translation provides a way to compile source-level languages to CL.

F. Smith, D. Walker, and J. G. Morrisett. Alias types. In ESOP '00: Proceedings of the 9th European Symposium on Programming Languages and Systems, pages 366-381, London, UK, 2000. Springer-Verlag. [ bib | .pdf ]

Linear type systems allow destructive operations such as object deallocation and imperative updates of functional data structures. These operations and others, such as the ability to reuse memory at different types, are essential in low-level typed languages. However, traditional linear type systems are too restrictive for use in low-level code where it is necessary to exploit pointer aliasing. We present a new typed language that allows functions to specify the shape of the store that they expect and to track the flow of pointers through a computation. Our type system is expressive enough to represent pointer aliasing and yet safely permit destructive operations.

D. Grossman, G. Morrisett, and S. Zdancewic. Syntactic type abstraction. ACM Trans. Program. Lang. Syst., 22(6):1037-1080, 2000. [ bib | .pdf ]

Software developers often structure programs in such a way that different pieces of code constitute distinct principals. Types help define the protocol by which these principals interact. In particular, abstract types allow a principal to make strong assumptions about how well-typed clients use the facilities that it provides. We show how the notions of principals and type abstraction can be formalized within a language. Different principals can know the implementation of different abstract types. We use additional syntax to track the flow of values with abstract types during the evaluation of a program and demonstrate how this framework supports syntactic proofs (in the sytle of subject reduction) for type-abstraction properties. Such properties have traditionally required semantic arguments; using syntax aboids the need to build a model and recursive typesfor the language. We present various typed lambda calculi with principals, including versions that have mutable state and recursive types.

G. McGraw and G. Morrisett. Attacking malicious code: A report to the infosec research council. IEEE Softw., 17(5):33-41, 2000. [ bib ]

The accelerating trends of interconnectedness, complexity, and extensibility are aggravating the already-serious threat posed by malicious code. To combat malicious code, these authors argue for creating sound policy about software behavior and enforcing that policy through technological means.

D. Grossman and G. Morrisett. Scalable certification of native code: Experience from compiling to TALx86. Technical report, Cornell University, Ithaca, NY, USA, 2000. [ bib ]

Certifying compilation allows a compiler to produce annotations that prove that target code abides by a specified safety policy. An independent verifier can check the code without needing to trust the compiler. For such a system to be generally useful, the safety policy should be expressive enough to allow different compilers to effectively produce certifiable code. In this work, we use our experience in writing a certifying compiler to suggest general design principles that should allow concise yet expressive certificates. As an extended example, we present our compiler”s translation of the control flow of Popcorn, a high-level language with function pointers and exception handlers, to TALx86, a typed assembly language with registers, a stack, memory, and code blocks. This example motivates techniques for controlling certificate size and verification time. We quantify the effectiveness of techniques for reducing the overhead of certifying compilation by measuring the effects their use has on a real Popcorn application, the compiler itself. The selective use of these techniques, which include common-subexpression elimination of types, higher-order type abbreviations, and selective re-verification, can change certificate size and verification time by well over an order of magnitude. We consider this report to be the first quantitative study on the practicality of certifying a real program using a type system not specifically designed for the compiler or source language.

F. Smith, D. Grossman, G. Morrisett, L. Hornof, and T. Jim. Compiling for runtime code generation (extended version). Technical report, Cornell University, Ithaca, NY, USA, 2000. [ bib ]

Cyclone is a programming language that provides explicit support for dynamic specialization based on runtime code generation. To generate specialized code quickly, our Cyclone compiler uses a template based strategy in which pre-compiled code fragments are stitched together at runtime. To achieve good performance, the pre-compiled fragments must be optimized. This paper describes a principled approach to achieving such optimizations. In particular, we generalize standard flow-graph intermediate representations to support templates, define a formal mapping from (a subset of) Cyclone to this representation, and describe a data-flow analysis framework that supports standard optimizations. This extended version contains two mappings to the intermediate representation, a less formal one that emphasizes the novelties of our translation strategy and a purely functional one that is better suited to formal reasoning.

D. Walker and G. Morrisett. Alias types for recursive data structures (extended version). Technical report, Cornell University, Ithaca, NY, USA, 2000. [ bib ]

Linear type systems permit programmers to deallocate or explicitly recycle memory, but they are severly restricted by the fact that they admit no aliasing. This paper describes a pseudo-linear type system that allows a degree of aliasing and memory reuse as well as the ability to define complex recursive data structures. Our type system can encode conventional linear data structures such as linear lists and trees as well as more sophisticated data structures including cyclic and doubly-linked lists and trees. In the latter cases, our type system is expressive enough to represent pointer aliasing and yet safely permit destructive operations such as object deallocation. We demonstrate the flexibility of our type system by encoding two common compiler optimizations: destination-passing style and Deutsch-Schorr-Waite or “link-reversal”” traversal algorithms.

D. Walker, K. Crary, and G. Morrisett. Typed memory management in a calculus of capabilities. Technical report, Cornell University, Ithaca, NY, USA, 2000. [ bib ]

Region-based memory management is an alternative to standard tracing garbage collection that makes potentially dangerous operations such as memory deallocation explicit but verifiably safe. In this article, we present a new compiler intermediate language, called the Capability Calculus, that supports region-based memory management and enjoys a provably safe type system. Unlike previous region-based type systems, region lifetimes need not be lexically scoped and yet the language may be checked for safety without complex analyses. Therefore, our type system may be deployed in settings such as extensible operating systems where both the performance and safety of untrusted code is important. The central novelty of the language is the use of static capabilities to specify the permissibility of various operations, such as memory access and deallocation. In order to ensure capabilities are relinquished properly, the type system tracks aliasing information using a form of bounded quantification. Moreover, unlike previous work on region-based type systems, the proof of soundness of our type system is relatively simple, employing only standard syntactic techniques. In order to show our language may be used in practice, we show how to translate a variant of Tofte and Talpin”s high-level type-and-effects system for region-based memory management into our language. When combined with known region inference algorithms, this translation provides a way to compile source-level languages to the Capability Calculus.

F. Schneider, G. Morrisett, and R. Harper. A language-based approach to security. Technical report, Cornell University, Ithaca, NY, USA, 2000. [ bib ]

Language-based security leverages program analysis and program rewriting to enforce security policies. The approach promises efficient enforcement of fine-grained access control policies and depends on a trusted computing base of only modest size. This paper surveys progress and prospects for the area, giving overviews of in-lined reference monitors, certifying compilers, and advances in type theory.

G. Morrisett, K. Crary, N. Glew, D. Grossman, R. Samuels, F. Smith, D. Walker, S. Weirich, and S. Zdancewic. Talx86: A realistic typed assembly language. In In Second Workshop on Compiler Support for System Software, pages 25-35, 1999. [ bib | .pdf ]

The goal of typed assembly language (TAL) is to provide a low-level, statically typed target language that is better suited than Java bytecodes for supporting a wide variety of source languages and a number of important optimizations. In previous work, we formalized idealized versions of TAL and proved important safety properties about them. In this paper, we present our progress in defining and implementing a realistic typed assembly language called TALx86. The TALx86 instructions comprise a relatively complete fragment of the Intel IA32 (32-bit 80x86 flat model) assembly language and are thus executable on processors such as the Intel Pentium. The type system for the language incorporates a number of advanced features necessary for safely compiling large programs to good code.

To motivate the design of the type system, we demonstrate how various high-level language features are compiled to TALx86. For this purpose, we present a type-safe C-like language called Popcorn.

G. Morrisett, D. Walker, K. Crary, and N. Glew. From System F to typed assembly language. ACM Trans. Program. Lang. Syst., 21(3):527-568, 1999. [ bib | .pdf ]

We motivate the design of typed assembly language (TAL) and present a type-preserving ttranslation from Systemn F to TAL. The typed assembly language we pressent is based on a conventional RISC assembly language, but its static type sytem provides support for enforcing high-level language abstratctions, such as closures, tuples, and user-defined abstract data types. The type system ensures that well-typed programs cannot violatet these abstractionsl In addition, the typing constructs admit many low-level compiler optimiztaions. Our translation to TAL is specified as a sequence of type-preserving transformations, including CPS and closure conversion phases; type-correct source programs are mapped to type-correct assembly language. A key contribution is an approach to polymorphic closure conversion that is considerably simpler than previous work. The compiler and typed assembly lanugage provide a fully automatic way to produce certified code, suitable for use in systems where unstrusted and potentially malicious code must be checked for safety before execution.

K. Crary, D. Walker, and G. Morrisett. Typed memory management in a calculus of capabilities. In POPL '99: Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 262-275, New York, NY, USA, 1999. ACM. [ bib | .pdf ]

An increasing number of systems rely on programming language technology to ensure safety and security of low-level code. Unfortunately, these systems typically rely on a complex, trusted garbage collector. Region-based type systems provide an alternative to garbage collection by making memory management explicit but verifiably safe. However, it has not been clear how to use regions in low-level, type-safe code.

We present a compiler intermediate language, called the Capability Calculus, that supports region-based memory management, enjoys a provably safe type system, and is straightforward to compile to a typed assembly language. Source languages may be compiled to our language using known region inference algorithms. Furthermore, region lifetimes need not be lexically scoped in our language, yet the language may be checked for safety without complex analyses. Finally, our soundness proof is relatively simple, employing only standard techniques.

The central novelty is the use of static capabilities to specify the permissibility of various operations, such as memory access and deallocation. In order to ensure capabilities are relinquished properly, the type system tracks aliasing information using a form of bounded quantification.

K. Crary and J. G. Morrisett. Type structure for low-level programming languages. In ICAL '99: Proceedings of the 26th International Colloquium on Automata, Languages and Programming, pages 40-54, London, UK, 1999. Springer-Verlag. [ bib ]

N. Glew and G. Morrisett. Type-safe linking and modular assembly language. In POPL '99: Proceedings of the 26th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 250-261, New York, NY, USA, 1999. ACM. [ bib | .pdf ]

Linking is a low-level task that is usually vaguely specified, if at all, by language definitions. However, the security of web browsers and other extensible systems depends crucially upon a set of checks that must be performed at link time. Building upon the simple, but elegant ideas of Cardelli, and module constructs from high-level languages, we present a formal model of typed object files and a set of inference rules that are sufficient to guarantee that type safety is preserved by the linking process.

Whereas Cardelli's link calculus is built on top of the simply-typed lambda calculus, our object files are based upon typed assembly language so that we may model important low-level implementation issues. Furthermore, unlike Cardelli, we provide support for abstract types and higher-order type constructors - features critical for building extensible systems or modern programming languages such as ML.

S. Zdancewic, D. Grossman, and G. Morrisett. Principals in programming languages: a syntactic proof technique. In ICFP '99: Proceedings of the fourth ACM SIGPLAN International Conference on Functional programming, pages 197-207, New York, NY, USA, 1999. ACM. [ bib | .pdf ]

Programs are often structured around the idea that different pieces of code comprise distinct principals, each with a view of its environment. Typical examples include the modules of a large program, a host and its clients, or a collection of interactive agents.In this paper, we formalize this notion of principal in the programming language itself. The result is a language in which intuitive statements such as, "the client must call open to obtain a file handle," can be phrased and proven formally.We add principals to variants of the simply-typed λ-calculus and show how we can track the code corresponding to each principal throughout evaluation. This multiagent calculus yields syntactic proofs of some type abstraction properties that traditionally require semantic arguments.

F. Smith, D. Walker, and G. Morrisett. Alias types. Technical report, Cornell University, Ithaca, NY, USA, 1999. [ bib ]

G. Morrisett and R. Harper. Typed closure conversion for recursively-defined functions (extended abstract). In C. Talcott, editor, Higher-Order Techniques in Operational Semantics, volume 10 of Electronic Notes in Theoretical Computer Science. Elsevier, 1998. [ bib | .ps ]

K. Crary, S. Weirich, and G. Morrisett. Intensional polymorphism in type-erasure semantics. In ICFP '98: Proceedings of the third ACM SIGPLAN International Conference on Functional programming, pages 301-312, New York, NY, USA, 1998. ACM. [ bib | .pdf ]

Intensional polymorphism, the ability to dispatch to different routines based on types at run time, enables a variety of advanced implementation techniques for polymorphic languages, including tag-free garbage collection, unboxed function arguments, polymorphic marshalling, and flattened data structures. To date, languages that support intensional polymorphism have required a type-passing (as opposed to type-erasure) interpretation where types are constructed and passed to polymorphic functions at run time. Unfortunately, type-passing suffers from a number of drawbacks: it requires duplication of constructs at the term and type levels, it prevents abstraction, and it severely complicates polymorphic closure conversion.We present a type-theoretic framework that supports intensional polymorphism, but avoids many of the disadvantages of type passing. In our approach, run-time type information is represented by ordinary terms. This avoids the duplication problem, allows us to recover abstraction, and avoids complications with closure conversion. In addition, our type system provides another improvement in expressiveness; it allows unknown types to be refined in place thereby avoiding certain beta-expansions required by other frameworks.

G. Morrisett, D. Walker, K. Crary, and N. Glew. From System F to typed assembly language. In POPL '98: Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 85-97, New York, NY, USA, 1998. ACM. [ bib | .pdf ]

We motivate the design of a statically typed assembly language (TAL) and present a type-preserving translation from System F to TAL. The TAL we present is based on a conventional RISC assembly language, but its static type system provides support for enforcing high-level language abstractions, such as closures, tuples, and objects, as well as user-defined abstract data types. The type system ensures that well-typed programs cannot violate these abstractions. In addition, the typing constructs place almost no restrictions on low-level optimizations such as register allocation, instruction selection, or instruction scheduling. Our translation to TAL is specified as a sequence of type-preserving transformations, including CPS and closure conversion phases; type-correct source programs are mapped to type-correct assembly language. A key contribution is an approach to polymorphic closure conversion that is considerably simpler than previous work. The compiler and typed assembly language provide a fully automatic way to produce proof carrying code, suitable for use in systems where untrusted and potentially malicious code must be checked for safety before execution.

F. Smith and G. Morrisett. Comparing mostly-copying and mark-sweep conservative collection. In ISMM '98: Proceedings of the 1st international symposium on Memory management, pages 68-78, New York, NY, USA, 1998. ACM. [ bib | .pdf ]

Many high-level language compilers generate C code and then invoke a C compiler for code generation. To date, most, of these compilers link the resulting code against a conservative mark-sweep garbage collector in order to reclaim unused memory. We introduce a new collector, MCC, based on an extension of mostly-copying collection.We analyze the various design decisions made in MCC and provide a performance comparison to the most widely used conservative mark-sweep collector (the Boehm-Demers-Weiser collector). Our results show that a good mostly-copying collector can outperform a mature highly-optimized mark-sweep collector when physical memory is large relative to the live data. A surprising result of our analysis is that cache behavior can have a greater impact on overall performance than either collector time or allocation time.

J. G. Morrisett, K. Crary, N. Glew, and D. Walker. Stack-based typed assembly language. In TIC '98: Proceedings of the Second International Workshop on Types in Compilation, pages 28-52, London, UK, 1998. Springer-Verlag. [ bib | .pdf ]

In previous work, we presented a Typed Assembly Language (TAL). TAL is sufficiently expressive to serve as a target language for compilers of high-level languages such as ML. This work assumed such a compiler would perform a continuation-passing style transform and eliminate the control stack by heap-allocating activation records. However, most compilers are based on stack allocation. This paper presents STAL, an extension of TAL with stack constructs and stack types to support the stack allocation style. We show that STAL is sufficiently expressive to support languages such as Java, Pascal, and ML; constructs such as exceptions and displays; and optimizations such as tail call elimination and callee-saves registers. This paper also formalizes the typing connection between CPS-based compilation and stack-based compilation and illustrates how STAL can formally model calling conventions by specifying them as formal translations of source function types to STAL types.

K. Crary, S. Weirich, and G. Morrisett. Intensional polymorphism in type-erasure semantics. Technical report, Cornell University, Ithaca, NY, USA, 1998. [ bib ]

Intensional polymorphism, the ability to dispatch to different routines based on types at run time, enables a variety of advanced implementation techniques for polymorphic languages, including tag-free garbage collection, unboxed function arguments, polymorphic marshalling, and flattened data structures. To date, languages that support intensional polymorphism have required a type-passing (as opposed to type-erasure) interpretation where types are constructed and passed to polymorphic functions at run time. Unfortunately, type-passing suffers from a number of drawbacks: it requires duplication of constructs at the term and type levels, it prevents abstraction, and it severely complicates polymorphic closure conversion. We present a type-theoretic framework that supports intensional polymorphism, but avoids many of the disadvantages of type passing. In our approach, run-time type information is represented by ordinary terms. This avoids the duplication problem, allows us to recover abstraction, and avoids complications with closure conversion. In addition, our type system provides another improvement in expressiveness; it allows unknown types to be refined in place thereby avoiding certain beta-expansions required by other frameworks.

G. Morrisett and R. Harper. Semantics of memory management for polymorphic languages. In Higher-Order Operational Techniques in Semantics, Electronic Notes in Theoretical Computer Science, pages 175-226, New York, NY, USA, 1998. Cambridge University Press. [ bib | .ps ]

We present a static and dynamic semantics for an abstract machine that evaluates expressions of a polymorphic programming language. Unlike traditional semantics, our abstract machine exposes many important issues of memory management, such as value sharing and control representation. We prove the soundness of the static semantics with respect to the dynamic semantics using traditional techniques. We then show how these same techniques may be used to establish the soundness of various memory management strategies, including type-based, tag-free garbage collection; tail-call elimination; and environment strengthening.

A. Basu, M. Hayden, G. Morrisett, and T. Eicken. A language-based approach to protocol construction. In In Proc. of the ACM SIGPLAN Workshop on Domain Specific Languages (WDSL, pages 87-99, 1997. [ bib | .pdf ]

User-level network architectures that provide applications with direct access to network hardware have become popularwith the emergence of high-speed networking technologies such as ATM and Fast Ethernet. However, experience with user-level network architectures such as U-Net [vEBBV95] has shown that building correct and efficient protocols on such architectures is a challenge. To address this problem, Promela++, an extension of the Promela protocol validation language has been developed. Promela++ allows automatic verification of protocol correctness against programmer specified safety requirements. Furthermore, Promela++ can be compiled to efficient C code. Thus far, the results are encouraging: the C code produced by the Promela++ compiler shows performance comparable to hand-coded versions of a relatively simple protocol.

F. Smith and G. Morrisett. Mostly-copying collection: A viable alternative to conservative mark-sweep. Technical report, Cornell University, Ithaca, NY, USA, 1997. [ bib ]

Many high-level language compilers generate C code and then invoke a C compiler to do code generation, register allocation, stack management, and low-level optimization. To date, most of these compilers link the resulting code against a conservative mark-sweep garbage collector in order to reclaim unused memory. We introduce a new collector, MCC, based on mostly-copying collection, and characterize the conditions that favor such a collector over a mark-sweep collector. In particular we demonstrate that mostly-copying collection outperforms conservative mark-sweep under the same conditions that accurate copying collection outperforms accurate mark-sweep: Specifically, MCC meets or exceeds the performance of a mature mark-sweep collector when allocation rates are high, and physical memory is large relative to the live data.

G. Morrisett, D. Walker, K. Crary, and N. Glew. From System F to typed assembly language (extended version). Technical report, Cornell University, Ithaca, NY, USA, 1997. [ bib ]

We motivate the design of a statically typed assembly language (TAL) and present a type-preserving translation from System F to TAL. The TAL we present is based on a conventional RISC assembly language, but its static type system provides support for enforcing high-level language abstractions, such as closures, tuples, and objects, as well as user-defined abstract data types. The type system ensures that well-typed programs cannot violate these abstractions. In addition, the typing constructs place almost no restrictions on low-level optimizations such as register allocation, instruction selection, or instruction scheduling. Our translation to TAL is specified as a sequence of type-preserving transformations, including CPS and closure conversion phases; type-correct source programs are mapped to type-correct assembly language. A key contribution is an approach to polymorphic closure conversion that is considerably simpler than previous work. The compiler and typed assembly language provide a fully automatic way to produce proof carrying code, suitable for use in systems where untrusted and potentially malicious code must be checked for safety before execution.

G. Morrisett, D. Tarditi, P. Cheng, C. Stone, R. Harper, and P. Lee. The til/ml compiler: Performance and safety through types. In In Workshop on Compiler Support for Systems Software, Tucscon, AZ, USA, 1996. [ bib | .ps ]

Systems code requires both high performance and reliability. Usually, these two goals are at odds with each other. For example, to prevent kernel data structures from being over-written or read, either accidentally or maliciously, conventional systems use hardware-enforced protection or software fault isolation (SFI) [18]. Unfortunately, both of these techniques exact a cost at run time: Hardware protection requires expensive context switches and data copying to communicate with the kernel or other processes, whereas SFI requires run-time checks for loads, stores and jumps. Furthermore, these protection mechanisms only guarantee "read/write " safety and are often too weak to guarantee the integrity of a system. For instance, systems code must still verify at run time that a capability (i.e., a file descriptor, port, message queue, etc.) is valid when using SFI or hardware protection. In contrast, systems designed around strong, statically-typed languages such as Modula-3 and Standard ML (SML), provide abstraction mechanisms, including objects, abstract datatypes, polymorphism, and automatic memory management, that enforce read/write safety as well as capabilities at compile-time. For instance, we can define an abstract type message queue with appropriate operations using the abstype mechanism of SML. The type system of SML prevents programmers

Y. Minamide, G. Morrisett, and R. Harper. Typed closure conversion. In POPL '96: Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 271-283, New York, NY, USA, 1996. ACM. [ bib | .ps ]

Closure conversion is a program transformation used by compilers to separate code from data. Previous accounts of closure conversion use only untyped target languages. Recent studies show that translating to typed target languages is a useful methodology for building compilers, because a compiler can use the types to implement efficient data representations, calling conventions, and tag-free garbage collection. Furthermore, type-based translations facilitate security and debugging through automatic type checking, as well as correctness arguments through the method of logical relations.

We present closure conversion as a type-directed, and type-preserving translation for both the simply-typed and the polymorphic lambda-calculus. Our translations are based on a simple "closures as objects" principle: higher-order functions are viewed as objects consisting of a single method (the code) and a single instance variable (the environment). In the simply-typed case, the Pierce-Turner model of object typing where objects are packages of existential type suffices. In the polymorphic case, more careful tracking of type sharing is required. We exploit a variant of the Harper-Lillibridge "translucent type" formalism to characterize the types of polymorphic closures.

D. Tarditi, G. Morrisett, P. Cheng, C. Stone, R. Harper, and P. Lee. Til: a type-directed optimizing compiler for ml. In PLDI '96: Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation, pages 181-192, New York, NY, USA, 1996. ACM. [ bib | .ps ]

G. Morrisett. Compiling with Types. PhD thesis, Carnegie Mellon University, December 1995. [ bib | http ]

Compilers for monomorphic languages, such as C and Pascal, take advantage of types to determine data representations, alignment, calling conventions, and register selection. However, these languages lack important features including polymorphism, abstract datatypes, and garbage collection. In contrast, modern programming languages such as Standard ML (SML), provide all of these features, but existing implementations fail to take full advantage of types. The result is that performance of SML code is quite bad when compared to C. In this thesis, I provide a general framework, called type-directed compilation, that allows compiler writers to take advantage of types at all stages in compilation. In the framework, types are used not only to determine efficient representations and calling conventions, but also to prove the correctness of the compiler. A key property of type- directed compilation is that all but the lowest levels of the compiler use typed intermediate languages. An advantage of this approach is that it provides a means for automatically checking the integrity of the resulting code. An important contribution of this work is the development of a new, statically- typed intermediate language, called lambda-MLi . This language supports dynamic type dispatch, providing a means to select operations based on types at run time. I show how to use dynamic type dispatch to support polymorphism, ad-hoc operators, and garbage collection without having to box or tag values. This allows compilers for SML to take advantage of techniques used in C compilers, without sacrificing language features or separate compilation. To demonstrate the applicability of my approach, I, along with others, have constructed a new compiler for SML called TIL that eliminates most restrictions on the representations of values. The code produced by TIL is roughly twice as fast as code produced by the SML/NJ compiler. 
This is due at least partially to the use of natural representations, but primarily to the conventional optimizer which manipulates typed, lambda-MLi code. TIL demonstrates that combining type-directed compilation with dynamic type dispatch yields a superior architecture for compilers of modern languages.

G. Morrisett, M. Felleisen, and R. Harper. Abstract models of memory management. In FPCA '95: Proceedings of the seventh International Conference on Functional programming languages and computer architecture, pages 66-77, New York, NY, USA, 1995. ACM. [ bib | .ps ]

R. Harper and G. Morrisett. Compiling polymorphism using intensional type analysis. In POPL '95: Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 130-141, New York, NY, USA, 1995. ACM. [ bib | .ps ]

Traditional techniques for implementing polymorphism use a universal representation for objects of unknown type. Often, this forces a compiler to use universal representations even if the types of objects are known. We examine an alternative approach for compiling polymorphism where types are passed as arguments to polymorphic routines in order to determine the representation of an object. This approach allows monomorphic code to use natural, efficient representations, supports separate compilation of polymorphic definitions and, unlike coercion-based implementations of polymorphism, natural representations can be used for mutable objects such as refs and arrays.We are particularly interested in the typing properties of an intermediate language that allows run-time type analysis to be coded within the language. This allows us to compile many representation transformations and many language features without adding new primitive operations to the language. In this paper, we provide a core target language where type-analysis operators can be coded within the language and the types of such operators can be accurately tracked. The target language is powerful enough to code a variety of useful features, yet type checking remains decidable. We show how to translate an ML-like language into the target language so that primitive operators can analyze types to produce efficient representations. We demonstrate the power of the “user-level” operators by coding flattened tuples, marshalling, type classes, and a form of type dynamic within the language.

N. Haines, D. Kindred, J. G. Morrisett, S. M. Nettles, and J. M. Wing. Composing first-class transactions. ACM Trans. Program. Lang. Syst., 16(6):1719-1736, 1994. [ bib ]

R. Harper and G. Morrisett. Compiling with non-parametric polymorphism (preliminary report). Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, 1994. [ bib ]

There is a middle ground between parametric and ad-hoc polymorphism in which a computation can depend upon a type parameter but is restricted to being defined at all types in an inductive fashion. We call such polymorphism non-parametric. We show how non-parametric polymorphism can be used to implement a variety of useful language mechanisms including overloading, unboxed data representations in the presence of ML-style polymorphism, and canonical representations of equivalent types. Moreover, we show that, by using a second-order, explicitly typed language extended with non-parametric operations, these mechanisms can be implemented without having to tag data with type information at runtime. Finally, we demonstrate that this approach retains a "phase distinction" permitting type checking and separate compilation. Our aim is to provide a unifying language, translation, and proof framework in which a variety of non-parametric mechanisms can be expressed and verified.

R. Harper and G. Morrisett. Compiling polymorphism using intensional type analysis. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, 1994. [ bib ]

Types have been used to describe the size and shape of data structures at compile time. In polymorphic languages or languages with abstract types, this is not possible since the types of some objects are not known at compile time. Consequently, most implementations of polymorphic languages box data (i.e., represent an object as a pointer), leading to inefficiencies. We introduce a new compilation method for polymorphic languages that avoids the problems associated with boxing data. The fundamental idea is to relax the requirement that code selection for primitive, polymorphic operations, such as pairing and projection, must be performed at compile time. Instead, we allow such operations to defer code selection until link- or even run-time when the types of the values are known. We formalize our approach as a translation into an explicitly-typed, predicative polymorphic lambda-calculus with intensional polymorphism. By "intensional polymorphism", we mean that constructors and terms can be constructed via structural recursion on types. The form of intensional analysis that we provide is sufficiently strong to perform non-trivial type-based code selection, but it is sufficiently weak that termination of operations that analyze types is assured. This means that a compiler may always "open code" intensionally polymorphic operations as soon as the type argument is known @y(M) the properties of the target language ensure that the specialization will always terminate. We illustrate the use of intensional polymorphism by considering a "flattening" translation for tuples and a "marshalling" operation for distributed computing. We briefly consider other applications including type classes, Dynamic types, and "views".

G. Morrisett and M. Herlihy. Optimistic parallelization. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, October 1993. [ bib | .ps ]

J. G. Morrisett. Refining first-class stores. In In Proceedings of the ACM SIGPLAN Workshop on State in Programming Languages, pages 73-87, 1993. [ bib | .ps ]

J. G. Morrisett and A. Tolmach. Procs and locks: a portable multiprocessing platform for standard ml of new jersey. In PPOPP '93: Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 198-207, New York, NY, USA, 1993. ACM. [ bib | .ps ]

We have built a portable platform for running Standard ML of New Jersey programs on multiprocessors. It can be used to implement user-level thread packages for multiprocessors within the ML language with first-class continuations. The platform supports experimentation with different thread scheduling policies and synchronization constructs. it has been used to construct a Modula-3 style thread package and a version of Concurrent ML, and has been ported to three different multiprocessors running variants of Unix. This paper describes the platform's design, implementation, and performance.

J. M. Wing, M. Faehndrich, N. Haines, K. Kietzke, D. Kindred, J. G. Morrisett, and S. Nettles. Venari/ML interfaces and examples. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, 1993. [ bib ]

N. Haines, D. Kindred, J. G. Morrisett, S. M. Nettles, and J. M. Wing. Tinkertoy transactions. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, 1993. [ bib ]

We describe the design of a transaction facility for a language that supports higher-order functions. We factor transactions into four separable features: persistence, undoability, locking, and threads. Then, relying on function composition, we show how we can put them together again. Our "Tinkertoy" approach towards building transactions enables us to construct a model of concurrent, nested, multi-threaded transactions, as well as other non-traditional models where not all features of transactions are present. Key to our approach is the use of higher-order functions to make transactions first-class. Not only do we get clean composability of transactional features, but also we avoid the need to introduce special control and block-structured constructs as done in more traditional transactional systems. We implemented our design in Standard ML of New Jersey.

J. M. Wing, M. Faehndrich, J. G. Morrisett, and S. Nettles. Extensions to Standard ML to support transactions. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, 1992. [ bib ]

J. G. Morrisett and A. Tolmach. A portable multiprocessor interface for standard ml of new jersey. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, 1992. [ bib ]

E. Cooper and J. G. Morrisett. Adding threads to Standard ML. Technical Report CMU-CS-90-186, Carnegie Mellon University, Pittsburgh, PA, USA, December 1990. [ bib | .ps ]


This file was generated by bibtex2html 1.96.