Extending SUIF for Machine-specific Optimizations

Michael D. Smith
smith@eecs.harvard.edu
Division of Engineering and Applied Sciences
Harvard University
Compatible with SUIF Release 1.1.2
Revised July 28, 1997


Introduction

The SUIF compiler provides an excellent set of flexible libraries for parallel and machine-independent optimizations. This document and its associated documents describe a set of modifications and extensions to the base SUIF library that provide the abstractions necessary for machine-specific optimizations, such as global instruction scheduling. We have designed these modifications and extensions so that existing SUIF code compiles without change. Furthermore, we have designed this code base so that it is easy to add support for new instructions or instruction set architectures, to include new analysis or machine-specific optimization passes, and to experiment with new hardware structures or organizations. We hope that you find this system useful. Enjoy!

As with base SUIF, Machine SUIF is built around a core library, the machine library, which consists of the files in the machine subdirectory of the machsuif distribution package. Currently, the machine library relies on the suif and useful libraries distributed with basesuif. On top of the machine library, we have constructed several other libraries and many machine-specific optimization passes.

This document serves as an overview of the Machine SUIF system. It presents the philosophy behind the design of Machine SUIF, directions for using and extending the system, and pointers to the other documentation available. Throughout our documents, it is assumed that the reader is familiar with the SUIF system and has read the SUIF overview document [cite bibsuif].

We use the noweb system [cite bibnoweb] by Norman Ramsey for the majority of our documentation. This literate programming tool lets you combine documentation and code in the same source file. It is our convention to use noweb to document only the most important header files in the Machine SUIF libraries, i.e. the files that describe a library's interface. We use man pages and simple C comments to document the interesting portions of Machine SUIF passes and library implementation files. The machsuif/doc directory contains several other documents that a user of Machine SUIF might find helpful. A ``roadmap'' to these documents can be found in Section [->].

The following section discusses the inter-operability and research goals that drove the design of our machine library. Section [->] describes how you need to set up your computing environment in order to compile and use Machine SUIF. Section [->] introduces the flow of passes in Machine SUIF. Further information on the ordering of passes and the contracts governing individual passes can be found in the Machine-SUIF Interfaces document. Section [->] briefly describes how one would go about extending our system. Among other things, this section explains how you would extend the machine library to support a new architecture. Section [->] will get you started with the rest of our documentation. Finally, Section [->] lists those places that are less than perfect in the current system, while Section [->] summarizes the status, availability, and future plans for Machine SUIF.

Goals for a Machine-specific Library

When we began this project in the spring of 1994, we had two distinct goals in mind. The first goal was to maintain inter-operability with the existing SUIF infrastructure. Our system should dovetail seamlessly into SUIF, and the key philosophical tenets of SUIF should be maintained. The second goal was to develop an infrastructure that would meet the needs of a research program in computer architecture and machine-specific code optimization. Sections [->] and [->] discuss the design decisions that resulted from these goals, and they briefly describe the implications for users of our system.

Inter-operability goals

There are both practical and philosophical reasons behind the desire to maintain inter-operability with the SUIF system. To be precise, we define inter-operability to mean that the machine library extends the SUIF system without the need to duplicate functionality unnecessarily. The current SUIF infrastructure provides many tools that are useful for machine-specific compiler research, and we want to exploit this existing infrastructure as much as possible.

The machine library re-uses the basesuif infrastructure whenever possible. Other than a few extensions to the existing SUIF instruction and operand classes, our library simply derives a set of new instruction classes, the machine instruction classes, from the existing SUIF in_gen instruction class (as described in detail in the machine library documentation). By carefully crafting our machine instruction classes and implementing key helper routines, we are able to re-use many existing utilities. For example, we use a straightforward wrapper around the SUIF methods that read and write procedures. We are thus independent of the implementation details of the input/output stream routines. In addition, we carefully maintain the symbol table and other high-level information passed down from the front end. Architecture-specific assembly-language data statements are generated only when outputting ASCII assembly-language text, i.e. when leaving the SUIF system. As a testament to the clean interface between the base SUIF system and our machine library, it typically takes us less than one day to port our machine library to the latest distribution of SUIF. [This is also a testament to the stability of SUIF.]

We also strove to maintain the philosophical underpinnings of SUIF. Whenever possible, we chose design decisions that allow for easy modification and expansion of our base system. At the same time, we attempted to create abstractions that were general. For example, our single instruction class abstraction is powerful enough to support both RISC and CISC instruction set semantics, and yet it follows the structure of the base SUIF instruction class. Mapping functions for the base SUIF instruction class work without change on the machine instruction classes. Finally, our machine extensions support the development of machine-specific passes implemented as separate programs that link with the core SUIF libraries. As with the original SUIF system, this is inefficient in terms of compile time, but extremely flexible and an ideal platform for evaluating new architectural ideas.

Research goals and related work

Research on machine-specific compile-time optimizations is closely tied to research in computer architecture. To support this area of research, a compilation system must possess an intermediate form (IF) that is both extensible and expressive. We require extensibility so that we can experiment with new compiler-visible instructions, architectural features, and hardware mechanisms. Since SUIF was built with extensibility as a primary design goal, it was simple for us to meet this requirement. We desire expressiveness in our IF so that every single machine instruction is representable by a single IF instruction. A one-to-one correspondence between IF instructions and actual machine instructions is required for optimizations such as global instruction scheduling.

The development of an extensible and expressive IF could be a lifelong project in itself. Since we were primarily interested in the compiler as a tool, we focused on the development of an IF that met our needs without greatly disrupting the structure of the IF already present in the SUIF system. We considered two basic approaches to satisfy the goal of a one-to-one correspondence between IF instructions and machine instructions. The approaches differ in where they perform the actual mapping from an IF opcode to a machine opcode.

The first approach postpones this mapping until the last possible moment in the compilation process (see Figure 1a). To ensure that there is a one-to-one correspondence between IF and machine instructions, an earlier pass in the compilation process restricts the IF so that each IF instruction can be mapped directly to a single machine instruction. The machine-specific optimization passes maintain this one-to-one correspondence. The IMPACT compiler [cite bibimpact] employs this type of approach.


.c --> front-end --> IF-restriction --> optimizations --> code-gen --> .s
(a)
.c --> front-end --> code-gen --> optimizations --> .s
(b)
Figure 1: Two approaches that satisfy the goal of a one-to-one correspondence between IF and machine instructions. The first splits the code generation into two pieces. The second relies on either retargetable compiler technology to facilitate the quick creation of machine-specific optimization passes or abstraction to eliminate the need to re-code each optimization for each architecture.

The second approach performs the mapping from IF instructions to machine instructions early in the compilation process (see Figure 1b). Machine SUIF employs this approach. To keep from having to re-implement each machine-specific optimization for each target instruction set architecture, the specifics of an instruction are hidden by abstraction techniques when those specifics are not needed for optimization. As a result, we use the same register allocation pass, for example, for several different machine architectures.

Alternatively, one could use retargetable compiler technology, as done in the vpcc/vpo compiler [cite bibvpo], to achieve the same result. The designers of a compiler pass write their passes without reference to any specific machine instruction set. They then use a compiler compiler to generate the actual compiler for a specific instruction set.

Given complete freedom over the design process, it is unclear to us that one approach is significantly better than the others. Of course, we did not have complete freedom in the design process. We chose our approach for several practical reasons. First, the IMPACT approach requires us to be able to extend the SUIF instruction class arbitrarily. The existing SUIF system has defined a specific way to extend its instruction class, i.e. through io_gen's, and hence it is simpler for us to ``restrict'' the IF by performing the mapping all at once. Furthermore, existing SUIF passes handle io_gen's generically, and thus this approach gives us some semblance of inter-operability with the existing SUIF code base. Since SUIF is implemented in C++, the existing class structure provides us with the ability to abstract away details as required by the second approach. Thus, we do not need to build a compiler compiler to obtain the benefits of abstraction.

Before leaving this section on goals, we would like to mention that the ability to produce runnable code has been and always will be a foremost goal of our compiler. Not only will our system produce code for experimental architectures, but it will be capable of producing good code for a range of existing instruction set architectures. This ability will help our compiler system to evolve as technology advances.

Setting Up Your Environment

The following discussion assumes that you are using a UNIX-based operating system as your development platform. It also assumes that you know how to unpack the distribution and set up the SUIF source directory. If not, please first review the README files contained in the distributions.

In addition to the environment variables required by base SUIF (i.e. MACHINE, SUIFHOME, COMPILER_NAME and possibly NEED_RANLIB, USER_CFLAGS, and USER_CXXFLAGS), we have created several other environment variables used by Machine SUIF. Some of these environment variables are used when you compile the Machine SUIF compiler. We discuss these variables in Section [->]. Section [->] gives an overview of other environment variables that you may find useful during SUIF compilation of a target application. Before we get into these details, however, Section [->] introduces the four different types of source modules in the Machine SUIF world.

Kinds of passes/libraries

There are two distinguishers that together separate every piece of Machine SUIF source code into one of four categories. The first distinguisher differentiates code that is organized as a compiler pass from code that acts as a common library of routines. For example, the code in src/machsuif/machine is the source for the machine library while code in src/machsuif/agen uses this machine library to perform Digital Alpha code generation. The second distinguisher differentiates source code that requires target-machine-dependent information during its compilation from code that is target-machine-independent. Please remember that all of our optimizations use machine-specific information during compilation. The issue here is whether the compiler code is specific to a particular target machine or whether it can be used generically for any (properly abstracted) target machine. For example, the code in src/machsuif/agen is target-machine-dependent code because it produces only Alpha intermediate-form instructions. The code in src/machsuif/raga, on the other hand, is target-machine-independent because it performs register allocation for any of our target machine architectures.

To summarize, we have four kinds of source directories in the machsuif distribution:

  1. Machine-independent passes, e.g. raga.

  2. Machine-dependent passes, e.g. agen.

  3. Machine-independent libraries, e.g. the cfg library.

  4. Machine-dependent libraries, e.g. the machine library.

Compiling Machine SUIF

The compilation of the source files in the machsuif distribution depends upon a set of environment variables of the form MACHSUIF_TARGET_*, where * is replaced by a machine architecture name (e.g., ALPHA). If an environment variable like MACHSUIF_TARGET_ALPHA is defined, this tells the make process that you would like your SUIF compiler to be able to produce code for Digital Alpha targets. By setting more than one of these variables, you create a SUIF compiler capable of producing code for multiple targets.

Each of these environment variables has a corresponding makefile variable of the form M_*. Again, we replace the * with a machine architecture name. So, M_ALPHA is the makefile variable associated with MACHSUIF_TARGET_ALPHA. The M_* makefile variables are passed as preprocessor define variables during a make process. These variables are checked in the source code to include/exclude machine-specific portions of code.

Referring to the four kinds of source directories described in the previous subsection, we can now explain how the makefile structure of Machine SUIF works. The makefile for a machine-independent pass (e.g. raga) will not contain any references to M_* define variables. (Why? Because it is machine-independent, of course.) Furthermore, the M_* define variables do not appear anywhere in the source code of a machine-independent pass.

The makefile for a machine-dependent pass (e.g. agen), on the other hand, will contain uses of a single M_* variable (M_ALPHA for this example pass). In particular, the EXTRA_CFLAGS and EXTRA_CXXFLAGS lines of this pass's Makefile will contain -DM_*. Notice that you know exactly what define flag is required since a machine-dependent pass is specific to only one architecture.

The makefile for a machine-independent library (e.g. cfg), like a machine-independent pass's makefile, does not contain any references to M_* define variables. If a library of this kind fails to compile without any M_* definitions, then it is not truly machine-independent. The same type of error checking applies for machine-independent passes.

The makefile for a library containing machine-dependent code (e.g. machine) is slightly more complex than anything mentioned earlier. See the Makefile in src/machsuif/machine as an example. Among other things, it includes the file src/machsuif/Makefile.defs. This file is the central location for checking the environment variables MACHSUIF_TARGET_* and then setting the appropriate M_* variables. The library itself should use M_* to set off machine-specific code from machine-independent code. Please try to place all of the machine-specific code in a single module. If you create source files that are specific to only one architecture, then you can use the *_{HDRS,SRCS,OBJS} makefile variables to enable/disable their compilation based on the MACHSUIF_TARGET_* environment variables. Please see the machine library as an example.
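
As a rough illustration of the mapping from MACHSUIF_TARGET_* environment variables to M_* defines described above, the following shell sketch derives the -DM_* flags from the environment. It is not the actual logic in Makefile.defs: the function name machsuif_defines is hypothetical, the architecture names are passed in explicitly, and MACHSUIF_TARGET_MIPS is an assumed variable name used only for illustration. The check tests only whether each variable exists, not what value it holds.

```shell
# Sketch only: mimic the environment-to-define mapping in shell.
# machsuif_defines is a hypothetical helper, not part of Machine SUIF.
# Usage: machsuif_defines ARCH...   (e.g. machsuif_defines ALPHA MIPS)
machsuif_defines() {
    flags=""
    for arch in "$@"; do
        # ${VAR+x} expands to "x" whenever VAR is set, even if it is empty,
        # so only the existence of MACHSUIF_TARGET_<ARCH> matters.
        if eval "[ -n \"\${MACHSUIF_TARGET_${arch}+x}\" ]"; then
            flags="${flags:+$flags }-DM_${arch}"
        fi
    done
    printf '%s\n' "$flags"
}
```

With MACHSUIF_TARGET_ALPHA set to any value and the assumed MACHSUIF_TARGET_MIPS unset, `machsuif_defines ALPHA MIPS` prints `-DM_ALPHA`.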

Compiling with Machine SUIF

To perform machine-specific optimizations, we must have access to information about the organization and microarchitecture of the target machine. Except for the definition of the machine instructions and assembler directives [This information is maintained in the src/machsuif/machine directory.], we place the rest of the target-machine-specific organizational and microarchitectural information in data files in the directory src/machsuif/impl. When you compile Machine SUIF, these data files are installed in $(SUIFHOME)/include/impl for use during target machine-code generation. If you want something other than $(SUIFHOME)/include/impl as the directory where machine-specific information lives, you can override this default location for the impl directory with the MACHSUIF_IMPL_DIR environment variable.

In the impl directory, data files have names of the form

<family>-<version>-<implementation>.<extension>

where each of <family>, <version>, <implementation>, and <extension> is a string value. <family> is the architectural family name for the target machine; <version> is the revision number of the architectural specification; <implementation> indicates a particular hardware realization of that version; and a file's <extension> indicates what kind of information it contains.

As we will see in a moment, the <family>, <version>, and <implementation> values provide us with a rational namespace to describe machine-specific features. First however, we will briefly mention how these values are recorded and used. For more information on this topic, we encourage you to read the document describing our machine library. As mentioned later in Section [->] on the basic ordering of our compiler passes, it is the task of a *gen pass to translate a low-SUIF representation of a code module into a machine-SUIF representation of this same code module. During this translation, the *gen pass will mark each file_set_entry with four target configuration identifiers: <family>, <version>, <implementation>, and <vendor-os>. We have already mentioned three of these four identifiers. The last, <vendor-os>, records the vendor and target operating system information.

We obtain the values used for <family>, <version>, <implementation>, and <vendor-os> in several ways. The value for <family> is determined solely by the code generator invoked. For example, if you set the -Target flag for scc to be alpha-dec-osf [Typically, the value of this string is retrieved from your MACHINE environment variable. You use a different value if you are cross-compiling.], the SUIF compiler would invoke a Digital Alpha code generation pass that would define the architectural family of the target to be alpha. We specify the values for the <version>, <implementation>, and <vendor-os> strings in one of two ways. One way to specify this information is to use the -ver, -impl, and -os command line options of the *gen passes. An example of this method is shown in Section [->]. Alternatively, if you define the environment variables MACHSUIF_TARGET_VERSION, MACHSUIF_TARGET_IMPL, and MACHSUIF_TARGET_OS, our *gen passes will use them. The command line arguments override the environment variables, and if neither a command line argument nor an environment variable is defined for a value, the *gen pass will fail with an assertion.
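
The precedence just described can be sketched in shell as follows. This is an illustration, not code from a *gen pass: the function name resolve_target_value is hypothetical, and treating an empty string the same as an undefined value is an assumption of this sketch.

```shell
# Sketch only: resolve_target_value is a hypothetical helper that mirrors
# the precedence described above; it is not code from a *gen pass.
# Usage: resolve_target_value NAME CLI_VALUE ENV_VALUE
# Example: resolve_target_value version "" "${MACHSUIF_TARGET_VERSION:-}"
resolve_target_value() {
    name=$1 cli_value=$2 env_value=$3
    if [ -n "$cli_value" ]; then
        printf '%s\n' "$cli_value"      # command line option wins
    elif [ -n "$env_value" ]; then
        printf '%s\n' "$env_value"      # fall back to the environment
    else
        echo "no value given for target $name" >&2
        return 1                        # stands in for the failed assertion
    fi
}
```

Passing the command line value and the environment value as separate arguments keeps the precedence rule visible in one place and makes the missing-value case explicit.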

The machine library uses the values of <family>, <version>, and <implementation> to search the impl directory for machine-specific information. The library performs this search every time you read a Machine-SUIF intermediate form file from disk. In addition, the library performs this search from MOST-specific to LEAST-specific name. So, for example, alpha-1-21064A-4M.reg would override any settings in alpha-1.reg or alpha.reg. This allows us to collect common descriptions in a single file that is appropriate for any hardware organization of an architectural version or any implementation of a particular architecture family (i.e., alpha.reg is equivalent to alpha-*-*.reg).
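
To make the search order concrete, here is a minimal shell sketch that lists the candidate file names from most to least specific and reports which of them exist. The function names impl_candidates and impl_existing are hypothetical, and the sketch only lists files; whether the library stops at the first match or merges settings from all matching files is not specified here. The directory default follows the MACHSUIF_IMPL_DIR override described earlier.

```shell
# Sketch only: hypothetical helpers, not the machine library's search code.

# Print the candidate impl file names, most specific first.
# Usage: impl_candidates FAMILY VERSION IMPLEMENTATION EXTENSION
impl_candidates() {
    family=$1 version=$2 impl=$3 ext=$4
    printf '%s\n' \
        "${family}-${version}-${impl}.${ext}" \
        "${family}-${version}.${ext}" \
        "${family}.${ext}"
}

# Print the candidates that exist in the impl directory, in the same order.
# MACHSUIF_IMPL_DIR overrides the default $SUIFHOME/include/impl location.
impl_existing() {
    dir=${MACHSUIF_IMPL_DIR:-${SUIFHOME}/include/impl}
    impl_candidates "$@" | while IFS= read -r name; do
        if [ -f "${dir}/${name}" ]; then
            printf '%s\n' "${dir}/${name}"
        fi
    done
}
```

For the example in the text, `impl_candidates alpha 1 21064A-4M reg` lists alpha-1-21064A-4M.reg, then alpha-1.reg, then alpha.reg.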

The <extension> on the impl directory files indicates the kind of machine-specific information contained in that file. For example, the .reg extension indicates that this file contains a description of register file organizations and software conventions. The README.* files in the impl source directory discuss the kinds of machine-specific information files supported and the syntax of these files. The partitioning corresponds to the interfaces that we have built in the machine and associated libraries for accessing machine-specific information. We do not claim that this partitioning is ideal in any sense; the current one just makes intuitive sense to us. Eventually, the crufty nature of these files will be hidden by the use of the University of Virginia's Computer Systems Description Language (CSDL).

In theory, we could extend the naming structure in the impl directory to include the <vendor-os> string since, for instance, different OSs could use the same processor but different register conventions. We do not use the <vendor-os> string in the impl directory names because different operating systems may change the ways in which the hardware is used by the software, but they do not change the hardware itself. To support different software conventions and interfaces, we allow specific impl files, e.g. the *.reg files, to include multiple convention definitions, each marked by the associated <vendor-os> strings. This is a relatively minor choice in the grand scheme of things, but it keeps our impl directory namespace manageable.

In summary, we define and use the following environment variables in Machine SUIF:

  1. MACHSUIF_TARGET_*. This set of variables is used during the compilation of the Machine SUIF system to indicate what target architectures should be supported by the resulting SUIF compiler. (The binding of each such variable is ignored. Its existence in the environment is what matters.)

  2. MACHSUIF_IMPL_DIR. This variable is used during compilation with the SUIF compiler to override the default location for the target-machine-specific implementation files.

  3. MACHSUIF_TARGET_VERSION. This variable defines the version string for the target architecture.

  4. MACHSUIF_TARGET_IMPL. This variable defines the implementation string for the target machine.

  5. MACHSUIF_TARGET_OS. This variable defines the vendor and operating system string name for the target machine. Typically, this matches the last two items in the MACHINE environment variable.

Ordering of Machine SUIF passes

Like the rest of SUIF, we attempt to minimize the requirements on the ordering of Machine SUIF passes. There are, however, certain assumptions that must be met. This section talks about the general structure of a back-end implemented in Machine SUIF. The specific requirements of each Machine SUIF pass are fully documented in its man page. Further information on the ordering of passes and the contracts between individual passes can be found in the Machine-SUIF Interfaces document.

Interface between SUIF and Machine SUIF

The back-end starts with a code generator pass that converts the ``low-suif'' representation of a program into a Machine SUIF representation of that program for a particular machine architecture. The simplest way to describe the interface between Machine SUIF and the rest of the SUIF system is to say that a Machine-SUIF code generator expects an input file containing only low-suif constructs---all high-suif constructs have been decomposed into low-suif constructs.

So, what is low-suif? Good question. There is no formal definition of low-suif, and there probably should not be, given the one-to-many mapping between the SUIF intermediate form and target machine instruction sets. Since this is a research compiler infrastructure, it is conceivable that someone would create a new ``high-suif'' construct that was passed directly to a particular Machine-SUIF code generator (and decomposed for other Machine-SUIF code generators).

Today, we run (at least) the following SUIF passes to translate a C program into a low-suif representation acceptable to our Digital Alpha code generator. The machine target specified in the scc command should be changed to match your target machine environment. Note that the options to swighnflew may be different for each target architecture. You would change the initial scc command appropriately to compile a FORTRAN program.

<Generate machsuif-palatable low-suif code>=
scc -Target alpha-dec-osf -.spd $f.c
porky -Darrays -Dfors -Dloops -Difs -Dblocks -no-call-expr \
  -Ddivfloors -Ddivceils -Dmins -Dmaxs -Dmods \
  $f.spd $f.spx
swighnflew -no-struct-return -mark-struct-alignment \
  -mark-varargs __builtin_va_start $f.spx $f.sfl

To this point, we have described information flowing in only one direction: from the front-end to the back-end. It is conceivable and desirable to have information flow in the opposite direction too. There does not currently exist a formal convention for representing this flow of information.

A minimal back-end

The purpose of the back-end is to translate a low-suif representation of a program into an equivalent assembly-language representation of that program. This translation typically requires the following two basic steps: translate low-suif instructions into a specific machine's assembly instructions (code generation); and translate SUIF symbol and instruction-pointer operands into machine registers and memory locations (register allocation). A minimal Machine-SUIF back-end consists of four steps related to these two basic steps: perform most of the code generation; perform register allocation; finish machine-specific translation; and print the ASCII representation of the Machine-SUIF binary file. The following commands perform these steps to generate a Digital Alpha assembly file:

<A minimal Alpha back-end>=
agen -ver 1 -impl 21064A-4M -os dec-osf3.2 $f.sfl $f.acg
raga $f.acg $f.ara
afin $f.ara $f.af
printmachine $f.af $f.s

The agen pass performs most of the machine-specific code generation. It does not perform any register allocation (beyond what is necessary, for example, because a certain instruction uses a specific, implicit register). It does however translate instruction pointer operands representing compiler virtual registers into virtual register operands.

As mentioned earlier in Section [<-], *gen passes like agen define the machine-specific target information associated with an intermediate file. By default, when you invoke agen (as opposed to say mgen), you implicitly pick an Alpha target running Digital UNIX. If you wanted another target operating system, you would select a different version of agen. (Currently, we ship a version of agen that supports Digital UNIX, a.k.a. OSF1, versions 3.2 and 4.0.) Additionally, we have specified via command line options that the target machine implements version 1 of the Alpha instruction set architecture, its hardware organization contains an Alpha 21064A microprocessor with a 4MB second-level cache, and the target operating system is Digital UNIX version 3.2. The operating system string is used by agen; agen simply embeds the values of the other two parameters into the output file for use by later machine-specific optimization passes.

The raga pass performs register allocation, translating all register-allocated symbols and virtual registers into hard registers. This register allocator is based on the technique described by George and Appel [cite bibraga].

The afin pass completes the translation process by laying out the stack frame for each procedure, replacing stack-allocated symbols by their stack-based effective-address calculation, and inserting the procedure entry and exit code sequences. These entry/exit sequences are not created before this point so that other Machine SUIF passes can easily reallocate registers in a procedure, for example, and not have to change the register save/restore code at that procedure's entry/exit points.

The printmachine pass translates the Machine SUIF representation for a program into an architecture-specific ASCII assembly-language file (a .s file). This pass also creates the data pseudo-ops by translating the file's symbol table. Before this point, it is expected that all Machine SUIF passes will maintain the symbol table (and not create machine-specific data pseudo-ops in the instruction lists). The resulting .s file can then be assembled by the target machine's assembler to create a .o file.

Of these essential passes, only agen and afin are written in a machine-specific manner. The raga and printmachine passes are written in a machine-independent manner, even though they perform machine-specific actions and optimizations. Thus, if we wanted to create a MIPS assembly language file, we would replace agen and afin by mgen and mfin respectively. You will gain a better understanding of how we create machine-independent passes that perform machine-specific optimizations after you read the machine library documentation.
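
The substitution of machine-specific passes can be sketched with a small shell helper that prints, without running, the four commands of a minimal back-end for a given pass prefix. The function name minimal_backend is hypothetical, and the intermediate file extensions it derives from the prefix are arbitrary choices made for this sketch. No options are assumed for mgen, since its options are not described here.

```shell
# Sketch only: minimal_backend is a hypothetical helper that prints, but
# does not run, the four commands of a minimal back-end.
# Usage: minimal_backend PREFIX BASENAME [CODE_GENERATOR_OPTIONS...]
minimal_backend() {
    p=$1 f=$2
    shift 2
    opts=$*
    if [ -n "$opts" ]; then
        opts="$opts "
    fi
    # Only the *gen and *fin passes change with the target prefix;
    # raga and printmachine are shared by all targets.
    printf '%s\n' \
        "${p}gen ${opts}$f.sfl $f.${p}cg" \
        "raga $f.${p}cg $f.${p}ra" \
        "${p}fin $f.${p}ra $f.${p}f" \
        "printmachine $f.${p}f $f.s"
}
```

Calling `minimal_backend a foo -ver 1 -impl 21064A-4M -os dec-osf3.2` reproduces the shape of the Alpha command sequence shown earlier, while `minimal_backend m foo` swaps in mgen and mfin.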

A fancier back-end

An equally important purpose of the back-end is to perform machine-specific optimizations. We place machine-specific optimization passes between the code generator and printmachine. For example, if we had a global instruction scheduling pass called twine and a target architecture like the Alpha 21064A that required instruction scheduling to achieve maximum performance, we might run the following passes:

<A scheduling back-end for the Alpha 21064A>=
agen -ver 1 -impl 21064A-4M -os dec-osf3.2 $f.sfl $f.acg
aexp -use_vregs $f.acg $f.ae1
twine -preschedule $f.ae1 $f.aps
raga $f.aps $f.ara
aexp $f.ara $f.ae2
twine -postschedule $f.ae2 $f.ags
afin $f.ags $f.af
aexp $f.af $f.ae3
twine -bbschedule $f.ae3 $f.afs
fix-alpha-gp $f.afs $f.afx
printmachine $f.afx $f.s

The filename extensions are unimportant. This back-end includes two other machine-specific passes, aexp and fix-alpha-gp. The aexp pass expands all pseudo-ops into real machine instructions so that we can schedule all of the machine instructions. When run after code generation, this pass uses compiler virtual registers during pseudo-op expansion; it is not limited to the use of the Alpha assembler temporary register. Since the register-allocation and the finishing passes may also use pseudo-ops, aexp is run after each; these later runs, however, use assembler-temporary hard registers. The fix-alpha-gp pass cleans up the world with respect to the calculation of the global data pointer. These calculations are sensitive to their location in the code and thus must be finalized after all scheduling is complete.

Scheduling takes place three times in this example: once before register allocation (pre-scheduling), once after register allocation (post-scheduling), and once after the finishing pass. We schedule the code each time for an Alpha 21064A machine model. This directive came from the -impl command line parameter specified during agen. The use of pre- and post-schedulers is a typical approach when the compiler separates register allocation and instruction scheduling. The final scheduling pass simply schedules the procedure entry and exit code sequences added during the finishing pass. The assumption here is that afin marks the code sequences that it inserts as ones that need to be scheduled. The basic block scheduler does not touch any of the previously scheduled sequences.

As discussed earlier, aexp and fix-alpha-gp are machine-specific passes like agen and afin. All of these passes could be generated automatically by a compiler-compiler from very simple machine description files. The instruction scheduling (twine) and register allocation (raga) passes are machine-independent passes that perform machine-specific optimizations. These passes are also constructed so that you may run them multiple times. Code sequences are marked to indicate whether the optimization in question is enabled or disabled for them.
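To illustrate why marking makes a pass safe to rerun, here is a toy model. It is only a sketch under stated assumptions: the Sequence class, its needs_scheduling field, and schedule_marked are hypothetical, sorting stands in for a real scheduler, and how Machine SUIF actually records these marks is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Sequence:
    """A toy code sequence; the boolean mark is an assumption of this sketch."""
    name: str
    instrs: list
    needs_scheduling: bool = True


def schedule_marked(sequences, reorder=sorted):
    """Reorder only the marked sequences, clear their marks, and
    return the names of the sequences that were touched."""
    touched = []
    for seq in sequences:
        if not seq.needs_scheduling:
            continue  # scheduled by an earlier run; leave it alone
        seq.instrs = list(reorder(seq.instrs))  # stand-in for real scheduling
        seq.needs_scheduling = False
        touched.append(seq.name)
    return touched


# A procedure body is scheduled first.
code = [Sequence("body", ["ldq", "addq", "stq"])]
first_run = schedule_marked(code)

# A finishing pass then adds marked entry and exit sequences.
code.insert(0, Sequence("entry", ["lda", "stq"]))
code.append(Sequence("exit", ["ldq", "ret"]))
second_run = schedule_marked(code)
```

In this model the second run touches only the entry and exit sequences, and a third run would touch nothing, which mirrors the behavior described for the final basic-block scheduling pass.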

Adding Support for New Architectures

This section assumes that you are familiar with Machine SUIF in general, and Section [<-] and the machine library in particular. The information in this section should help you get started toward the goal of extending Machine SUIF to support a new machine architecture.

Begin by picking a name for the new architecture, analogous to ALPHA in MACHSUIF_TARGET_ALPHA. For clarity of discussion, let us assume that your new target architecture is called MYMACH. Then modify the makefiles in any machine-specific library and in the top-level directory src/machsuif to include your architecture's information during the build of each machine-dependent library and each MYMACH-specific pass. Obviously, the makefile in a MYMACH-specific pass should be configured as described in Section [<-].

Next, extend each machine-specific library to know about your new architecture. For the rest of this section, I will assume that the machine-specific library in question is machsuif/machine. A similar set of steps would be needed for any other machine-specific library.

In the directory src/machsuif/machine,

Finally, create the essential machine-specific passes. In particular, you must at least create an initial code generator pass *gen and a final code generation pass *fin. See Section [<-] for a brief description of the functionality of these passes. You can find more detailed information concerning the interface between key Machine SUIF passes in the Machine-SUIF Interfaces document.

You may also create an expander pass *exp that expands all of the assembly pseudo/macro ops into real machine instructions. This pass is necessary only if you wish to use instruction scheduling passes in Machine SUIF.

As long as your architecture is similar to those already supported under Machine SUIF, you should not need to change any other Machine SUIF pass.
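The pass requirements above can be summarized in a small sketch. The function passes_for_target is hypothetical, and the one-letter prefix is an assumption modeled on the Alpha pass names agen, afin, and aexp; choose whatever prefix suits MYMACH.

```python
def passes_for_target(prefix, use_scheduling=False):
    """Return the names of the machine-specific passes a new target needs.

    The code generator (*gen) and the finishing pass (*fin) are always
    required; the expander (*exp) is needed only with instruction scheduling.
    """
    passes = [prefix + "gen", prefix + "fin"]
    if use_scheduling:
        passes.append(prefix + "exp")
    return passes


# Hypothetical prefix "m" for MYMACH.
minimal = passes_for_target("m")
with_scheduling = passes_for_target("m", use_scheduling=True)
```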

The Rest of the Documentation

In addition to this document, we provide several other PostScript documents for your reading pleasure. As mentioned earlier, our convention is to use noweb to document the header files of our Machine SUIF libraries and man pages to document the purpose of, and restrictions on, Machine SUIF passes. Below, we summarize the list of documents available describing the Machine-SUIF compilation system:

Known Shortcomings with Machine SUIF

It's pretty clear to us that Machine SUIF is far from perfect. The following is a list of action items that we're working on:

  1. We are well on our way to completing the transition of our register model. The final step that is still incomplete is to provide a mapping from value types to register banks. Right now, the register allocator just ``knows'' that, for example, an integer belongs in the gpr bank while a single-precision floating-point number goes in the fpr bank. We'd like to remove this hard-coded information from the register allocator pass.

  2. There are a few outstanding issues involved in the handling of two-operand instruction sets like x86 that need to be stated explicitly in the documentation.

  3. Further out, the ARPA/NSF infrastructure project will eventually provide a nice general machine description language on top of this code. Tools will be provided to generate the machine-specific information required by the machine library and the Machine SUIF passes automatically.

Summary and Availability

We have developed a machine-specific library that adheres to the philosophy of the existing SUIF infrastructure and yet also supports research in computer architecture and machine-specific code optimization. At Harvard, we are using or have used this machine library to support the construction of several code generators (including RISC-based and CISC-based architectures), register allocators, global instruction schedulers, and several code transformations to support our work in branch prediction and code layout. Though we continue to identify new annotations and helper routines, the core of the machine library appears stable.

A version of the machine library is available by anonymous ftp from ftp.eecs.harvard.edu in pub/hube (the same code is available from our web site, http://www.eecs.harvard.edu/~hube/). The current release works with version 1.1.2 of the base SUIF system, and it is organized as a ``super-package'' (machsuif) that requires only basesuif. SUIF is available from http://suif.stanford.edu/. Questions, comments, and bug reports for this package should be e-mailed to machsuif-bugs@eecs.harvard.edu. As the technology matures, this package will migrate into the SUIF distribution, and we will track the updates to the base SUIF system. As other machine-specific optimizations become available, we will add them to our distribution site.

Acknowledgments

I gratefully acknowledge the support of the HUBE research group at Harvard and the SUIF research group at Stanford. In particular, I would like to acknowledge the help of Cliff Young and Glenn Holloway. Cliff helped me to implement our extensions to the SUIF system, revamped the control-flow graph library, and authored the HALT package for instrumentation and profiling under SUIF. Glenn wrote our first real register allocator and helped to implement our dataflow analysis library.

This work is supported in part by a DARPA/NSF infrastructure grant (NDA904-97-C-0225) and an NSF Young Investigator award (CCR-9457779). We also gratefully acknowledge the generous support of this research by Advanced Micro Devices, Digital Equipment, Hewlett-Packard, International Business Machines, Intel, and Microsoft.

References

[1] M. Benitez and J. Davidson. ``Target-specific Global Code Improvement,'' Dept. of Computer Science Technical Report CS-94-42, University of Virginia, November 1994.

[2] L. George and A. Appel. ``Iterated Register Coalescing,'' Transactions on Programming Languages and Systems, 18(3):300--324, May 1996.

[3] IMPACT Research Group. See http://www.crhc.uiuc.edu/Impact/.

[4] N. Ramsey. ``Literate Programming Simplified,'' IEEE Software, 11(5):97--105, September 1994.

[5] Stanford Compiler Group. The SUIF Library. The SUIF compiler documentation set, Stanford University, 1994.