Changes from original release
-----------------------------

Things keep changing, though hopefully for the better.  The impact of
the changes is getting smaller.  As usual, we try to highlight the most
important of the changes here.  We've put the most recent changes at
the top of this list (the list eventually becomes unordered, since
changes from many releases ago were never ordered in this list in the
first place).

o Release of machsuif-1.3.0 represents the integration of machsuif-1
  with basesuif-1.  The machsuif patches have been incorporated in
  basesuif-1.3.0.x.  Thus the machsuif/doc/basesuif-changes document
  is now of historical interest only.

o There are now passes for dead code elimination, copy propagation,
  and local common subexpression elimination.

o The x86 code generator is now included on an ``as is'' basis.  It
  still needs a lot of testing.

o The system now builds on Windows NT using the Cygnus Win32 tools.
  (Not tested yet for machsuif-1.3.0, but quite likely to work fine.)

o Lots of bug fixes that mean we can now compile a larger set of
  benchmarks on our Alpha machines.  Please see the list in
  ReleaseNotes.

o We have added yet another new environment variable to our machsuif
  world.  Again, please see the ``overview.ps'' document for the full
  details.  In brief, you now need to set ``MACHSUIF_TARGET_OS'' to
  compile with machsuif.  The format of impl directory files changed
  slightly too.  These changes were forced by the next listed item.

o The pass aexp now expands all instructions possible into their
  machine-level equivalent sequences.  On Digital UNIX 4.0, that means
  that we create an assembly language file with relocation annotations
  used by the Alpha assembler.  Since you cannot put relocations into
  an assembly file on DU 3.2, machsuif needs to know the OS of the
  target machine.  Please see the man page of aexp for more details.
  By the way, aexp can now be run multiple times within a single
  compilation.

o We have added another set of helper routines to the machine library.
  These routines organize the complexity of effective-address
  calculations into one module.  Please use these routines when
  writing new passes.  We've updated all of the Alpha passes to use
  them.

o We deleted the annotation ``k_end_params_spill'' since it is no
  longer needed if you use anything but a really dumb register
  allocator.  I.e., it is not needed by raga.

o We have added a new dataflow analysis to our dfa library.

o We have re-organized the documentation.  There is now a ``doc''
  directory in the machsuif source tree.  Please read the
  ``overview.ps'' document contained there.

o There are several new environment variables in our machsuif world.
  Please see the ``overview.ps'' document.  Please note that all of
  our environment variables for machsuif now start with ``MACHSUIF''.
  This means that the ``SUIFTARGET_*'' variables have become
  ``MACHSUIF_TARGET_*''.

o We have also created a ``machsuif/impl'' directory to hold all of
  the architecture and machine-specific data files, such as the
  register def files.  Please see the README documentation in that
  directory.

o We have separated the ``reginfo'' class from the ``archinfo'' class.
  Now, we have two global variables, ``target_arch'' and
  ``target_regs'', both of which must be defined early in your passes.
  Please see the machine library documentation for more information.
  In short, any references to ``target_arch->reg'' need to be replaced
  with a reference to ``target_regs''.

o We have re-organized the code base slightly so that all of the
  machine-specific information is kept in a small number of places.
  This change should facilitate the construction of new code
  generators.  Please see the machsuif overview documentation
  (Section 5 on ``Adding Support for New Architectures'').

o We have generated documentation for the machine library using noweb.
  The files are in the machsuif/machine directory.
  You do NOT need noweb to read the documentation or to create the
  source header files.  We have also included documentation for the
  cfg and dfa libraries.  We hope to soon include a document
  describing the interface between the code generators, the register
  allocator, and the code generator finishing passes.

o We are releasing a working and (mostly) tested code generator for
  Digital Alpha machines.  (The distribution also contains an
  increasingly out-of-date MIPS-I code generator that used to produce
  code for some of the SPECint92 benchmarks.  Others are taking
  responsibility for this code generator.)

o We have separated register allocation from code generation.  Please
  see the machsuif documentation for more information on the interface
  between these passes.  It should be fairly straightforward to insert
  different register allocation routines, and a single register
  allocation routine can be made to work for many different
  instruction set architectures (e.g. the same raga code handles both
  MIPS and Alpha ISAs).

o We have made changes to our changes for the src/basesuif/suif
  operand class.  In our original distribution, we added two new
  operand kinds: registers and immediates.  We have removed the
  immediate operand kind.  Below, we list several easy steps to take
  to convert your current machsuif passes to use the new release of
  our library (see HELPFUL HINTS #1).  We also changed the name of the
  ``is_node_reg()'' method to ``is_virtual_reg()'' since the
  definition of a compiler virtual register changed.  See the machine
  library documentation.

o Virtual registers now have their own number manager.  Ideally, this
  manager would exist in the basesuif/suif code.  Since we didn't want
  to modify basesuif that much, we've added this manager by extending
  the proc_symtab class locally, and by adding functionality to our
  {Read,Write}_machine_proc() helper functions.
  Below, we list several easy steps to take to convert your current
  machsuif passes to use the new virtual register manager (see HELPFUL
  HINTS #4).

o We have made changes to the handling of store operations.  In our
  original distribution, we reserved srcs[0] to hold only store
  effective address calculations.  We have removed this hack and
  instead added an annotation k_is_store_ea to distinguish load
  effective address calculations from store effective address
  calculations.  The mi_rr and mi_bj classes have three new methods to
  automate the handling of this annotation.  You can now use srcs[0]
  as any other source operand.  We also removed the *src1*(),
  *src2*(), and *dst_addr_op() methods to remove any confusion.
  Below, we list several easy steps to take to convert your current
  machsuif passes to the new release of our library (see HELPFUL
  HINTS #2).

o We have made some modifications to the way in which machine
  description information is handled in machsuif.  In the current
  system, we tag each file with architectural information.
  Specifically, each file_set_entry in the fileset contains a
  structured annotation of the following form:

      ["target_arch":    # string
                         # integer
                         # string
                         # string ]

  The "architectural family name" is a string and is the same as the
  string returned by the architecture() method of each instruction in
  a procedure tree_node_list.  It quickly distinguishes between MIPS
  and Alpha instructions, for instance, but it does not distinguish
  between a MIPS-I and a MIPS-IV architecture.  Distinguishing
  revisions within an architectural family is handled by the
  "architectural revision number" value.  For example, this value
  might be "1" for MIPS-I and "4" for MIPS-IV.  The third value in
  this annotation contains the vendor and operating system information
  for the target machine.  This item completes the -- string kept in
  the SUIF $MACHINE environment variable.  The final value in this
  annotation is the name of a machine description file.
  This value can be overridden by the SUIFMDFILE environment variable,
  in case you want to experiment with different architectural
  features.  Since this architectural information may be queried a
  number of times during a machsuif pass, the machine library provides
  a helper routine and data structure to extract the information from
  the k_target_arch annotation.  See the classes in archInfo.{h,cc}.

o We have made some changes to the handling of machine registers.  In
  the original distribution, we hardcoded the abstract register
  numbers and built architecture-specific mapping functions between
  the abstract register identifiers and the actual hard register
  names.  We've now generalized this functionality and made it part of
  an architecture information file.  You can now make changes to the
  types of registers in your architecture without recompiling the
  machine library.  For more details, please see the archInfo.h
  section of the machine library documentation.  Below, we list
  several easy steps to take to convert your current machsuif passes
  to the new release of our library (see HELPFUL HINTS #3).

o In addition to the new k_target_arch annotation, we have added a
  bunch of other useful annotations.  Please see annoteHelper.{h,cc}
  in src/machsuif/machine and the machine library documentation for
  more specifics.  Several of these annotations are required by
  machsuif passes during the compilation process.

o Given the new k_target_arch annotation, printmachine no longer
  requires a target architecture on the command line.

o We added a lcc-like, table-driven macro method of handling the
  opcode information.  See the *Ops.{h,cc} and *.data files in
  src/machsuif/machine.

o At the request of outside collaborators, we have made milist into a
  doubly-linked list.

o And of course, we fixed lots of bugs.


*** HELPFUL HINTS #1:

The following is a brief explanation of how to update your existing
machsuif passes to use the new operand immediate methods:

1. is_immed() operand_dataonly method -- check and possibly update code

   operand.h still has this method, but its meaning has changed
   slightly.  If TRUE, it now means that the operand is a pointer to a
   ldc instruction.  The proper sequence of checks to perform to
   differentiate a simple immediate operand from an effective address
   calculation in machSUIF is as follows:  If the operand is of
   OPER_INSTR kind, use Is_ea_operand() to differentiate operands
   representing effective address calculations from simple immediate
   operands.  You may not need extra checks (i.e. the is_immed()
   operand method is sufficient) in some contexts (e.g. the 2nd
   operand of an io_add effective address calculation cannot be
   another EA operand in a base-plus-offset addressing mode).

2. immediate() operand_dataonly method -- check and possibly update code

   operand.h still has this method, but like is_immed(), its effect
   has changed slightly.  This method returns the immed value from the
   ldc instruction of an is_immed() operand.  Again, you should check
   the context to ensure proper use of this method.

3. set_immed() operand method -- eliminated

   This method has been eliminated.  It was never used in the original
   machSUIF code.  If you used it, replace that code with code that
   creates a new operand instead.

4. operand(immed &im, type *t) operand constructor -- eliminated

   To replace this constructor, I've created a helper function in
   machsuif/machineInstr.h called 'Immed_operand()'.  It takes the
   same parameters as the original immediate operand constructor, but
   creates an instruction pointer to a ldc of the specified immediate
   value.  Wherever you had the old constructor, just change the
   'operand' to 'Immed_operand'.

5. sanity checks -- check and possibly update code

   o Check places where is_instr() or is_expr() operand_dataonly
     methods are used.  These could identify a simple immediate
     operand now.  If you just want operands representing effective
     address calculations, use the Is_ea_operand() helper routine.
   o Check for uses of OPER_IMMED.  This kind no longer exists.
     Replace with a check for OPER_INSTR, and then differentiate the
     kind using the Is_ea_operand() helper routine.

   o Check to verify that the result_type of all effective address
     calculations is a pointer type.  If not, make it so.


*** HELPFUL HINTS #2:

The following is a brief explanation of how to update your existing
machsuif passes to eliminate the old restriction on srcs[0]:

1. Search for any uses of the following old methods of mi_rr and
   mi_bj, and change them to the new methods:

       eliminated method         replacement method
       -----------------         ------------------
       src1_op()                 src_op(0)
       src2_op()                 src_op(1)
       set_src1(opnd)            set_src_op(0,opnd)
       set_src2(opnd)            set_src_op(1,opnd)
       dst_addr_op()             store_addr_op(0)
       set_dst_addr_op(opnd)     set_store_addr_op(0,opnd)

   Note that ``src1_op()'', ``src2_op()'', and ``dst_addr_op()'' and
   their ``set'' equivalents are still valid methods for the SUIF
   in_rrr class.  We have just eliminated them from machsuif since we
   prefer to use ``src_op(int)''.  Class-specific methods make it hard
   to write generic passes.

2. The conversion in item 1 shifts the operand array down from
   srcs[1]-and-up to srcs[0]-and-up.  Obviously, this is only a good
   thing to do if you are *not* using srcs[0] for the effective
   address of the store in the instruction in question.

3. If you already use the ``src_op(int)'' and
   ``set_src_op(int,operand)'' methods, make sure that you shift these
   references down too.  Sorry, this is a messy change.  You simply
   have to search for all ``src'' references in your code.  You must
   check these references, because the mi_rr and mi_bj constructors
   now, for all instructions, start placing operands in srcs[0] by
   default (instead of in srcs[1]).  The nice side effect of this
   change is that the effective address calculations for loads and
   stores in RISC instruction sets are now both in srcs[0].

4. The ``store_addr_op(int)'' and ``set_store_addr_op(int,operand)''
   methods take care of the annotation bookkeeping for you.  Notice
   that any source operand can now be a store effective address.
   Please see the machine library documentation if you really care how
   we implemented things.  Also, notice that helper routines like
   ``Writes_memory(instruction *)'' work without change to their
   interface syntax.  Also, if you previously used
   ``set_src_op(0, opnd)'' to create a store effective address
   operand, you *must* now use ``set_store_addr_op(0,opnd)''.  If you
   don't, the operand will be interpreted as a load effective address
   calculation.

5. The last change concerns passes that change existing instructions
   into new instructions.  Since the information that marks an operand
   as a store effective address operand is kept separately from the
   contents of the operand in question, it is *not* sufficient simply
   to overwrite the operand to (say) change a store instruction into a
   register-to-register add instruction.  If you want to change an
   operand that is a store effective address calculation into an
   operand that is not a store effective address calculation, you must
   use the ``remove_store_addr_op(int)'' method to clean up the store
   effective address bookkeeping.  You do not need to ``remove'' if
   you just change the effective address calculation in a store.


*** HELPFUL HINTS #3:

The following is a brief explanation of how to update your existing
machsuif passes to use the new register model:

1. All machsuif passes must have access to the target_arch
   information.  Previously, the *gen passes created this information
   (by creating the k_target_arch annotation) but never actually had a
   variable called target_arch.  With the ``great register reorg,''
   the register information is attached to the target_arch information
   and thus we need the target_arch from the start.
   You only need to add the following code after the place where you
   created the k_target_arch information:

       /* get architectural information for this file_set_entry */
       target_arch = new archinfo(fse);

   Make sure that you set the last immed in the k_target_arch
   annotation to be the name of your *Arch.data file.

   If your pass already read the k_target_arch annotation in your loop
   over the file_set_entry's, then you only need to make sure that you
   don't generate new storage for the target_arch variable (the
   storage is defined now in the machine library).  You should delete
   the target_arch variable at the end of your loop over the
   file_set_entry's.

2. Add ``-ll'' to your Makefile LIBS line after ``-lmachine'' since
   the code in the machine library now needs the lex library.  You
   need to do this for every pass that uses the machine library.

3. Search for all uses of ``REG_'' in your code and fix the syntax to
   match the new syntax.  E.g., ``REG_tmp2'' becomes
   ``REG(GPR,TMP,2)''.  Notice that ``REG_const0'', ``REG_sp'', etc.
   don't change.  Search for any use of the other REG_BASE_* constants
   and eliminate them (e.g. use target_arch->reg->start_of(b,c)
   instead).

4. Search for any uses of the *_no_of_* variables that used to
   identify the number of a particular kind of hard register, and
   replace this code with the appropriate call to
   target_arch->reg->num_of(...).

5. Search for any uses of the *_reg_no(int) and
   *_to_machine_reg_no(int) helper functions.  These are removed from
   the library (and should be removed from your extensions to the
   library) since their functionality is covered by the reginfo class.

6. If you added a new architecture, you need to create a machine
   description file like machsuif/machine/alphaArch.data.


*** HELPFUL HINTS #4:

The following is a brief explanation of how to update your existing
machsuif passes to use the new virtual register manager.  Please note
that you need to perform these steps only if your pass needs to insert
virtual registers.
If your pass doesn't create virtual registers, you don't need to
change anything!

0. We removed all references (at least we hope all references) to
   ``node registers'' and their semantics from our code.  Instead, we
   use virtual registers and their more lenient semantics.  This
   change simplified just a few places in our code.  You may want to
   do the same.

1. By convention, we place the current procedure symbol pointer into
   the global variable ``proc_sym *cur_psym''.  You should now declare
   another global variable called ``mproc_symtab *cur_psymtab''.

2. If you want to use the virtual register manager in your machsuif
   pass, you should do the following in your equivalent of
   Process_file().  (Otherwise, you don't have to change
   anything--skip the remaining steps, and you shouldn't have done
   step 1.)

   o have Read_machine_proc() set the variable cur_psymtab;

   o just before Write_machine_proc(), renumber the virtual registers
     by calling ``cur_psymtab->renumber_vregs();''

   o you may delete cur_psymtab after Write_machine_proc().

3. Change all uses of

       -(int)cur_psym->block()->proc_syms()->next_instr_num()

   into

       NEW_VREG

4. If this is a *gen pass, then add the following #define to the
   topmost header file for your pass:

       #define NEW_VREG (-(int)cur_psym->block()->proc_syms()->next_instr_num())

   Otherwise, use the following definition instead:

       #define NEW_VREG cur_psymtab->next_vreg_num()

   See the section on I/O helpers in the machine library documentation
   for an explanation of why things are this way.
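
As a rough illustration of how steps 1 through 4 of HELPFUL HINTS #4
fit together in a pass that is not a *gen pass, here is a small mock-up.
It does NOT use the real machsuif headers.  The ``mproc_symtab'' struct
below is a hypothetical stand-in that borrows only the two method names
mentioned above (``next_vreg_num()'' and ``renumber_vregs()''); the
method bodies, the ``renumbered'' flag, the ``process_one_proc()''
function, and the numbering scheme (negative numbers counting down from
-1) are all assumptions made for the sake of the example.  Please see
the machine library documentation for the real behavior.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the machine library's mproc_symtab.  Only the
// two method names come from the hints above; the bodies are assumptions.
struct mproc_symtab {
    int next = -1;            // assumed: vreg numbers are negative
    bool renumbered = false;  // records that renumber_vregs() was called

    int next_vreg_num() { return next--; }
    void renumber_vregs() { renumbered = true; }  // placeholder body
};

// Step 1: the extra global that Read_machine_proc() is expected to set.
mproc_symtab *cur_psymtab = nullptr;

// Step 4 (passes other than *gen): get new vregs from the manager.
#define NEW_VREG cur_psymtab->next_vreg_num()

// Mock of the per-procedure body of a Process_file() equivalent.
// Returns the virtual register numbers this pretend pass allocated.
std::vector<int> process_one_proc(mproc_symtab &symtab) {
    cur_psymtab = &symtab;          // stands in for Read_machine_proc()
    assert(cur_psymtab != nullptr);

    std::vector<int> vregs;
    vregs.push_back(NEW_VREG);      // step 3: NEW_VREG replaces the old
    vregs.push_back(NEW_VREG);      //         next_instr_num() expression

    cur_psymtab->renumber_vregs();  // step 2: before Write_machine_proc()
    cur_psymtab = nullptr;          // a real pass may delete cur_psymtab here
    return vregs;
}
```

The point of the sketch is only the ordering: cur_psymtab is set before
any NEW_VREG use, renumber_vregs() is called once after the pass has
finished creating virtual registers, and cur_psymtab is released last.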