The SUIF Control Flow Graph Library

Cliff Young
cyoung@eecs.harvard.edu
Division of Engineering and Applied Sciences
Harvard University
Compatible with SUIF Release 1.1.2
and Machine SUIF Release 1.1.2
Revised April 17, 1998


Introduction

The SUIF compiler provides an excellent set of flexible libraries for parallel and machine-independent optimizations. With the machine SUIF extensions, it also supports machine-dependent optimizations such as global instruction scheduling. This document discusses the SUIF control flow graph (CFG) library. It discusses design goals, the interface to the library, implementation decisions, current status, and planned updates.

The SUIF CFG library presents an abstraction of control flow graphs built on the basic structures of the SUIF system. The CFG library allows creating CFGs, analyzing them, transforming them by rearranging and reconnecting blocks, and fine-grained control over individual program instructions within the CFG.

The CFG library currently works with machine SUIF, and extensions to support low SUIF are implemented but not tested. The library consists of files in the cfg subdirectory of the machsuif distribution package. The Makefile in cfg creates the CFG library. This paper focuses on the design and use of that library.

This paper serves as reference documentation for those researchers who wish to use the CFG library. It is assumed that the reader is familiar with the SUIF system and has read both the SUIF overview document [cite bibsuif] and the machine SUIF overview document [cite bibmach].

The following section discusses the development and research goals that drove the design of our CFG library. Section [->] describes our view of how the library will be used. Sections  [->] through  [->] describe the programming abstraction provided by the CFG library. Section [->] describes library implementation decisions. Section [->] summarizes the status and availability of the code.

Goals for the CFG Library

[*]

This section lists our goals for the CFG library.

The next section gives an overview of how we expect the library to be used.

Philosophy

[*]

The CFG library supports three major styles of operation. These are graph-oriented, layout-oriented, and instruction-oriented. Users specify the mode they plan to use at cfg construction time. However, a graph can be transformed from graph to layout style and back (or it can be left somewhere in between) by a sequence of layout commands.

In graph-oriented mode, the library presents the high-level abstraction of a graph of edges and nodes. Each node is a maximal basic block of instructions. The library hides the details of the underlying linear code representation, allowing a programmer to think in terms of graph traversal and connectivity without having to explicitly set branch targets or insert new control instructions. Despite the graph abstraction, control over the program remains precise: the successors of a block correspond one-to-one to the successors of the underlying control instructions.

Under the covers, graph-oriented mode does a small amount of work for the user. To represent graph edges explicitly, the library makes all branches explicit in the underlying SUIF program. It does this by inserting shadow gotos at the end of all fall-through instructions in the program. These shadow gotos are only removed when layout commands are issued.

Layout-oriented mode gives the programmer direct control over the order of basic blocks and instructions in the machine code representation. Doing so comes at a cost in complexity, however. In layout mode, the programmer must be aware of the positions of shadow gotos, and may have to promote them into their own basic blocks before a layout command becomes legal. We feel that this is not a huge restriction, because most layout or code motion optimizations require instruction-level control anyway. This design restriction also keeps library intelligence to a minimum, preventing the library from performing ``magical'' work behind the back of a low-level pass.

Layout-oriented mode also allows instruction schedulers to control placement of instructions inside of basic blocks. In particular, it is possible to support scheduling for delayed-branch architectures by placing instructions after the control instruction of a block. However, in keeping with our design goals, it is up to such schedulers to manage such non-standard instruction placement themselves. The library will need to be told where the control instruction in a block lives; the library also assumes that there is just one control instruction per block. The library will also not be able to read code that has been scheduled for delayed-branch architectures; however, sub-classes could be built that perform the necessary construction-time adjustments.

Instruction-oriented mode is the third and final method of working with the library. In this mode, the library presents each instruction as its own node in the CFG. This is useful for some optimizations, which prefer single instructions per CFG node over maximal basic blocks as CFG nodes. No layout support is currently given in instruction-oriented mode. But since the programmer is already working at the instruction level, he already has to manage updating explicit targets and control instructions.

The Sections  [->] through  [->] describe the interface that attempts to support the models presented above.

Major Classes in the Library

[*]

This section describes the user interface supported by the CFG library. It begins with the basic data structures, cfgs and cfg_nodes. Then we discuss routines that support the goals listed in the previous section, including analysis, CFG transformations, layout, and scheduling support. After discussing these features, we return to examine the CFG constructor.

The cfg Class

cfg and cfg_node are the two main classes used in the CFG library. A cfg represents an entire control flow graph for a procedure. A cfg is constructed around a particular SUIF procedure; each procedure can have only one associated cfg at a time. Modifications made to the cfg and its cfg_nodes are propagated to the underlying SUIF procedure. The cfg class is defined in the graph.h header file.

<graph.h>=
/*  Control Flow Graph */

<SUIF Copyright Notice>

#ifndef CFG_GRAPH_H
#define CFG_GRAPH_H

#pragma interface

class cfg_node;
class cfg_block;
class cfg_node_list;

/*
 *  The control flow graph defined here is intended to be general enough
 *  for almost everyone to use (at least as a base -- other information
 *  can be added on top of it).  It supports nodes containing either
 *  individual tree_instrs or basic blocks of contiguous tree_instrs.
 *  High-level tree nodes are represented by multiple flow graph nodes
 *  corresponding to the code that will be created when they are expanded.
 *  Moreover, each high-level tree node is delimited by a pair of BEGIN
 *  and END nodes to make it easier to match up the flow graph with the
 *  SUIF code.
 */

DECLARE_X_ARRAY(cfg_node_array, cfg_node, 10);

class cfg {
    friend class cfg_node;

<cfg private members>
<cfg protected members>
<cfg public members>
};

<cfg edge iterator>
#endif /* CFG_GRAPH_H */

cfg private and protected members are defined and described in Section [->], on implementation details.

Here's the visible interface to the cfg class. The print() routine lists the nodes by number, together with successor and predecessor lists. It includes layout information if its boolean parameter is true. The print_with_instrs() method is similar, but also prints the instructions of nodes that contain them, using low SUIF notation.

<cfg public members>= (<-U)
  public:
<cfg SUIF access>
<cfg node access>
<cfg edge counts>
<cfg transformation>
<dominator methods>
<natural loop methods>
<reverse postorder listing>
<cfg cleanup>
    void print(FILE *fp, boolean show_layout=FALSE);
    void print_with_instrs(FILE *fp, boolean show_layout=FALSE);
<cfg constructors and destructors>

Each cfg numbers its component nodes. The num_nodes() method tells how many nodes are in the cfg; they can be accessed by the cfg::node(unsigned) method or by using the overloaded array index operator. Each cfg also has a special entry node and a special exit node. These can be found using the eponymous methods. The ability to set the entry and exit nodes is a historical artifact; changing these nodes is not recommended.

<cfg node access>= (<-U)
    unsigned num_nodes();
    cfg_node *node(unsigned i)          { return (*nds)[i]; }
    cfg_node *&operator[](unsigned i)   { return (*nds)[i]; }

    cfg_node *entry_node()              { return en; }
    cfg_node *exit_node()               { return ex; }
    void set_entry_node(cfg_node *n);
    void set_exit_node(cfg_node *n);

In designing the library, we chose not to represent edges in the cfg explicitly. Edges are represented by pairs of nodes or by a pair of a node and an index into the node's successor list. The first representation treats the cfg as a graph, while the second treats the cfg as a multigraph (the distinction is important if some of the numbered targets of a conditional or multiway branch are repeated). Two methods return the number of edges or multiedges currently in the cfg: num_edges() and num_multi_edges(). Note that the implementation of these two methods is currently slow; an incremental calculation of the number of edges or multiedges is possible, but has not yet been implemented.

<cfg edge counts>= (<-U)
    unsigned num_edges(); 
    unsigned num_multi_edges(); 

The library allows edges to be enumerated using a special iterator, the cfg_edge_iter class. Unlike most SUIF iterator methods, the cfg_edge_iter returns void when its step() method is called. To find the head node, and the tail node or successor number that define an edge, use the curr_head(), curr_tail(), and curr_snum() methods. Note that curr_snum() works only if the iterator was initialized with multigraph set to TRUE. Also, correct iteration is not guaranteed if the cfg changes while iterating. Lastly, note that the step() method must be called before any of the curr_*() methods may be called.

<cfg edge iterator>= (<-U)
class cfg_edge_iter {
    boolean multigraph; 
    int curr_node; 
    int curr_index; 
    cfg *base; 
public:
    cfg_edge_iter (cfg *base, boolean multigraph = FALSE); 
    void reset(void); 
    boolean is_empty() const; 
    void step(); 

    cfg_node *curr_head(void) const; 
    cfg_node *curr_tail(void) const; 
    unsigned curr_succ_num(void) const; 
}; 

The cfg transformation and analysis public methods are described in more detail in Section [->] and Section [->], respectively. We defer discussion cleanup routines until after Section [->] on layout, and we defer discussion constructors, destructors, and SUIF access to Section [->].

The cfg_node class

cfg_nodes represent nodes in the CFG. cfg_node is a virtual parent class. More detail on the cfg_node subclasses appears at the end of this section.

We chose not to supply an edge data type. Data or annotations associated with edges can be placed in vectors that parallel the successors vector in each node. We felt that interposing an edge structure between nodes would just complicate traversing the edges. Users who wish to add instructions along an edge can use the empty block creation methods described in the Transformations section.

cfg_nodes get their own header file, node.h. Here it is:

<node.h>=
/*  Control Flow Graph Nodes */

<SUIF Copyright Notice>

#ifndef CFG_NODE_H
#define CFG_NODE_H

#pragma interface


<cfg_node_kind definition>
/*
 *  Each flow graph node contains only an ID number and information
 *  related to the flow graph.  Other information can be associated with
 *  a node by storing it in a separate array in the cfg structure where
 *  the node ID number is used to index into the array.
 *
 *  This is a virtual class that cannot be instantiated.
 */

class cfg_node;
<cfg_node_list_iter definition>

class cfg_node {
    friend class cfg;

<cfg_node private and protected members>
  public:
    cfg_node();
    virtual ~cfg_node() {}

<cfg_node subclass identification>
<cfg_node numbering and ownership>
<cfg_node relations to other nodes>
<cfg_node layout relations>
<cfg_node transformation methods>
<cfg_node layout methods>
<cfg_node empty block methods>
    virtual void print(FILE *fp=stdout) = 0;
};

<definitions of cfg_node subtypes>
#endif /* CFG_NODE_H */

Calling the number() method gives the number of cfg_node in its parent cfg. In case a pointer to the owning cfg is not handy, one can be recovered using the parent() method.

<cfg_node numbering and ownership>= (<-U)
    unsigned number()                   { return nnum; }
    cfg *parent()                       { return par; }

Each cfg_node keeps a list of predecessors and a list of successors. The list of predecessors is an unordered set, while the list of successors is a vector corresponding to the target(s) of the node's terminating instruction (or just the instruction in instruction mode). The predecessor list is unordered because we could not imagine a useful ordering to implement; it is a set (and not a multiset) because having a set seems a simpler abstraction. The successors list is numbered; positions 0 and 1 in the list have special meaning depending on the kind of control instruction that terminates the block. The same target can appear multiple times in the successors list, so that different cases of a multiway branch can jump to the same label. The number of successors of a cfg_node depends on the control instruction that ends the basic block:

The cfg_node::succs(void) and cfg_node::preds(void) methods return the lists of node successors and predecessors. Individual nodes can then be accessed using the standard SUIF list access methods.

<cfg_node relations to other nodes>= (<-U)
    cfg_node_list *preds()              { return &prs; }
    cfg_node_list *succs()              { return &scs; }
    cfg_node *fall_succ()               { return (*succs())[0]; }
    cfg_node *take_succ()               { return (*succs())[1]; }

We defer discussing transformation, layout, and empty block methods to the corresponding later sections. Empty block methods are discussed under the transformation section.

Specific subclasses of cfg_nodes include labels, instructions or basic blocks, the special begin and end nodes, and test nodes from C for loops. To distinguish the different kinds of cfg_nodes, we have a simple enumeration:

<cfg_node_kind definition>= (<-U)
/*
 *  Control flow graph nodes are divided into several subclasses
 *  depending on the sort of SUIF object(s) to which they correspond.
 */

enum cfg_node_kind {
    CFG_BEGIN,                          /* beginning marker for AST node */
    CFG_END,                            /* ending marker for AST node */
    CFG_INSTR,                          /* instruction or expression tree */
    CFG_BLOCK,                          /* basic block of instructions */
    CFG_TEST,                           /* "test" node of a FOR loop */
    CFG_LABEL                           /* implicit AST node label */
};

Each cfg_node implements identifier routines that allow the subclass to be determined. These work in the usual SUIF way:

<cfg_node subclass identification>= (<-U)
    virtual cfg_node_kind kind() = 0;
    boolean is_begin()                  { return (kind() == CFG_BEGIN); }
    boolean is_end()                    { return (kind() == CFG_END); }
    boolean is_instr()                  { return (kind() == CFG_INSTR); }
    boolean is_block()                  { return (kind() == CFG_BLOCK); }
    boolean is_test()                   { return (kind() == CFG_TEST); }
    boolean is_label()                  { return (kind() == CFG_LABEL); }

<definitions of cfg_node subtypes>= (<-U)
<cfg_marker definition>
<cfg_begin definition>
<cfg_end definition>
<cfg_label definition>
<cfg_instr definition>
<cfg_block definition>
<cfg_test definition>

Most of the interfaces to the cfg_node subclasses are pretty boring; they just indicate to the compiler that some methods need to be overridden. The exception is cfg_blocks, which have additional access and modification methods. Here is an English description of the purpose of each kind of cfg_node, along with the class definition for completeness:

Main Header File

Like every SUIF library, the CFG library has an umbrella include file that includes the relevant subsidiary include files:

<cfg.h>=
/*  Top-level Control Flow Graph Include File */

<SUIF Copyright Notice>

#ifndef CFG_H
#define CFG_H

<CFGINCLFILE macro>

<EXPORTED_BY_CFG macro>

#include CFGINCLFILE(graph.h)
#include CFGINCLFILE(node.h)
#include CFGINCLFILE(util.h)

#endif /* CFG_H */

The SUIF Copyright notice appears at the top of everything; here's the standard form:

<SUIF Copyright Notice>= (<-U <-U <-U U->)
/*  Copyright (c) 1994 Stanford University

    All rights reserved.

    Copyright (c) 1995, 1996, 1997 The President and Fellows of Harvard 
    University

    All rights reserved.

    This software is provided under the terms described in
    the "suif_copyright.h" include file. */

#include <suif_copyright.h>

This macro is triggered by the CFG library Makefile, so that we can use the header files both in the cfg source directory ($SUIFHOME/src/machsuif/cfg) and in regular SUIF programs.

<CFGINCLFILE macro>= (<-U)
/*
 *  Use a macro to include files so that they can be treated differently
 *  when compiling the library than when compiling an application.
 */

#ifdef CFGLIB
#define CFGINCLFILE(F) #F
#else
#define CFGINCLFILE(F) <cfg/ ## F ## >
#endif

The following macro selects the appropriate syntax for external variables. Special handling is required for compilation under Windows NT.

<EXPORTED_BY_CFG macro>= (<-U)
#if defined(_WIN32) && !defined(__CYGWIN32__) && !defined(CFGLIB)
#define EXPORTED_BY_CFG _declspec(dllimport) extern
#else
#define EXPORTED_BY_CFG extern
#endif

Analysis

[*]

The CFG library provides five forms of support for CFG-based program analysis. First, it provides access functions that allow programs to explore the structure of the graph and the instructions inside of blocks. Most of these access functions were described in the previous section on the major classes. Second, the library provides various kinds of iterators that allow programs to examine nodes within the CFG and instructions within cfg_blocks in various orders. Third, it computes dominance information and the dominator tree. Finally, it computes natural loops.

Iterators

Most CFG iterators are simple. Since CFG nodes are numbered, it is simple to iterate through them and assign them positions in bit vectors. There are also iterators over cfg_node_lists, which make up the succs() and preds() lists associated with each cfg_node. We get the cfg_node_list_iter class as part of the list class declaration macro.

<cfg_node_list_iter definition>= (<-U)
DECLARE_LIST_CLASS(cfg_node_list, cfg_node*);

The reverse_postorder_list() function provides a list of cfg_nodes in reverse postorder.

<reverse postorder listing>= (<-U)
    /* list generator routines */
    cfg_node_list *reverse_postorder_list(boolean forward=TRUE);

There is also an iterator that iterates through the instructions in a cfg_block from in_head() to in_tail(). The cfg_node_instr_iter constructor takes a boolean argument that specifies whether the iterator should iterate forward or backward.

<cfg_node_instr_iter>= (U->)
/* Iterator over instructions in a cfg_block */
class cfg_node_instr_iter {
  private:
    tree_node_list_e *first;
    tree_node_list_e *last;
    tree_node_list_e *sentinel;
    tree_node_list_e *cur;
    tree_node_list_e *nxt;
    boolean rev; 

  public:
    cfg_node_instr_iter(cfg_node *node, boolean reverse=FALSE);

    void reset();
    tree_instr *step();
    tree_instr *peek();

    boolean is_empty();
};

Dominators

The CFG library will compute the dominator and postdominator relations, as well as the dominance frontiers and postdominance frontiers of each node in the graph. Because this information may not always be used, it is only computed on request; make sure to call the appropriate find_ method before attempting to use any of the access routines. All of these methods are cfg methods; no individual cfg_node knows anything about these operations.

The dominates() (postdominates()) function returns the value of the dominance (postdominance) relation at the supplied points; it takes either node numbers or pointers to nodes. The immed_dom() (immed_postdom()) method returns the node that immediately dominates (postdominates) the node specified by number or pointer. Finally, the dom_frontier() (reverse_dom_frontier()) method returns a SUIF bit_set whose bits are set at the positions corresponding to the nodes in the (reverse) dominance frontier of the specified node.

<dominator methods>= (<-U)
    /* built-in analysis routines */
    void find_dominators();
    void find_postdominators();
    void find_df();                     /* dominance frontier */
    void find_reverse_df();

    /* access dfa information */
    boolean dominates(int n_dominator, int n_dominatee);
    boolean dominates(cfg_node *dominator, cfg_node *dominatee);
    boolean postdominates(int n_dominator, int n_dominatee);
    boolean postdominates(cfg_node *dominator, cfg_node *dominatee);
    cfg_node *immed_dom(int n);
    cfg_node *immed_dom(cfg_node *n);
    cfg_node *immed_postdom(int n);
    cfg_node *immed_postdom(cfg_node *n);
    bit_set *dom_frontier(int n);
    bit_set *dom_frontier(cfg_node *n);
    bit_set *reverse_dom_frontier(int n);
    bit_set *reverse_dom_frontier(cfg_node *n);

Natural Loops

Natural loops are also computed only on demand. We use a method based on Algorithm 10.1 in Aho, Sethi, and Ullman. The loop_depth() functions return the number of natural loops that contain a node; this can be set using the set_loop_depth() methods, overriding the results of the find_natural_loops() computation. Before running find_natural_loops(), you must run find_dominators().

The is_loop_begin() method returns TRUE if the specified node dominates any of its predecessors. The is_loop_end() method returns TRUE if the specified node is dominated by any of its successors. And the is_loop_exit() method returns TRUE if the specified node has any successor with a different loop depth. Note that this may not be what wants for a loop exit criterion in all cases.

<natural loop methods>= (<-U)
    void find_natural_loops(); 
    int loop_depth(int n);
    int loop_depth(cfg_node *n);
    void set_loop_depth(cfg_node *n, int d); 
    void set_loop_depth(int n, int d); 

    boolean is_loop_begin(int n);       /* TRUE if block is loop entry */
    boolean is_loop_begin(cfg_node *cn);
    boolean is_loop_end(int n);         /* TRUE if block jumps to loop entry */
    boolean is_loop_end(cfg_node *cn);
    boolean is_loop_exit(int n);        /* TRUE if block is a loop exit */ 
    boolean is_loop_exit(cfg_node *cn);

Transformations

[*]

Optimizations that transform CFGs typically perform three actions: they clone existing nodes, they change the successors of a node to point somewhere else (most likely a clone), and they add new empty nodes to the CFG. For example, a loop unroller makes multiple copies of a loop body, then it modifies the loop exit branches so that they form a chain.

The CFG library supports cloning using the cfg_block *cfg::clone_into(cfg_block *) method. This method returns a new cfg_block, duplicating procedure-specific symbols like label_syms into the cfg's tree_proc symbol table and updating references to point to the new symbols. Symbol references are updated so that blocks can be cloned into other scopes; the most common use of this is to build a new version of a procedure while maintaining an original, unmodified version on the side. The returned cfg_block appears as a numbered node in the cfg, and its code appears at in the cfg's underlying SUIF procedure, but the code will be unreachable (dead) and all of its successor pointers will be cleared.

<cfg transformation>= (<-U) [D->]
    /* node insert/create routine -- must manually set preds/succs */
    cfg_block *clone_into(cfg_block *); 

To make the code reachable, the driver program must connect to the newly-cloned block using cfg_node::set_succ(unsigned n, cfg_node *). This method works in the obvious way, setting the nth successor of the target block to the supplied new successor. The former nth successor (if any) will be disconnected, and any other relevant data structures (like lists of predecessors) will be updated appropriately. For convenience, set_fall_succ() and set_take_succ() are also defined.

<cfg_node transformation methods>= (<-U)
    virtual void set_succ(unsigned n, cfg_node *) = 0; 
    void set_fall_succ(cfg_node *nod)   { set_succ(0, nod); }
    void set_take_succ(cfg_node *nod)   { set_succ(1, nod); }

The library also allows the user to specify the entry point of the procedure. If set_succ(0, foo) method is called on the cfg_begin node, then the request is taken to mean that block foo should be made the entry point of the procedure.

There are three methods for creating empty blocks. The first, cfg_block *cfg::new_empty_block(), creates a new block containing a single label instruction, then adds the block to the cfg with no edges connecting to it. As with clone_into(), the driver program is responsible for connecting the new block into the graph. The second method, cfg_block *cfg_block::insert_empty_block(unsigned n), also creates a block containing a dummy label instruction, but it inserts the new block along the nth successor edge of the base block. The third method expects a pointer to a successsor of the current block, then places the empty block along all edges from the current block to the successor block. Blocks created along edges are often called ``landing pads''; landing pads are useful in instruction scheduling, where code sometimes needs to be placed between a block and its predecessor, rather than in either one.

The insert_empty_block() methods attempt to be neutral to layout decisions. They should work preserve any layout relation along the edge being modified, inserting the new empty block into any existing layout chain.

<cfg transformation>+= (<-U) [<-D]
    cfg_block *new_empty_block(void); 

<cfg_node empty block methods>= (<-U)
    cfg_block *insert_empty_block(unsigned n); 
    cfg_block *insert_empty_block(cfg_node *dst); 

Layout

[*]

The CFG library supports basic block placement and layout by allowing the user to specify an ordering (partial or total) of the blocks in a cfg and its associated SUIF procedure. The ordering is specified by a doubly-linked list that runs through the cfg_nodes of a cfg. NULL pointers in this linked list indicate ``don't care'' orderings, while connected components (this document calls them ``clumps'') of the list will be laid out in order. This gives driver programs complete flexibility between specifying no order at all (all layout links NULL) to a total order on blocks.

To read layout information, the library provides cfg_node *cfg_node::layout_pred(), which returns the layout predecessor of a node, and cfg_node *cfg_node::layout_succ(), which returns its layout successor. Layout chains are doubly-linked.

<cfg_node layout relations>= (<-U)
    cfg_node *layout_pred() const       { return lpred; }
    cfg_node *layout_succ() const       { return lsucc; }

Setting layout information is performed by void cfg_node::set_layout_succ(cfg_node *). The set_layout_succ() function is somewhat expensive: it may move most of the instructions in the procedure's tree_node_list in order to satisfy the requested layout link. However, set_layout_succ() is also careful about branch polarity and explicit jumps, inverting branches and removing unnecessary unconditional jumps when possible. The value returned by set_layout_succ() indicates whether a conditional branch was inverted; if so, passes may need to update things like profile information to maintain correspondence.

<cfg_node layout methods>= (<-U)
    virtual void clear_layout_succ();
    virtual boolean set_layout_succ(cfg_node *);
    boolean set_layout_succ(label_sym *); 

<cfg_block layout>= (U->) [D->]
    void clear_layout_succ();
    void set_succ(unsigned n, cfg_node *); 
    boolean set_layout_succ(cfg_node *); 

If more direct control over individual blocks is desired, the CFG library provides direct methods for changing conditional branch polarity, setting and promoting shadow instructions, and getting the leading label instruction of a block. invert_branch() changes the polarity of a branch by changing its opcode to the opposite opcode (e.g. changing beq to bne). If the branch has a branch_edge_weights annotation, giving the edge frequencies of the block's two out-edges, then invert_branch() swaps these as well. promote_shadow() promotes the shadow goto at the end of a block to be the control instruction that terminates the block. set_shadow() sets the shadow goto to point to the targeted node (creating a new one if necessary), or removes any shadow goto if its argument is NULL. Note that set_shadow() does not update any layout pointer information, so use it with caution. get_label() returns the leading label instruction of a block, inserting a new label if none was present.

<cfg_block layout>+= (U->) [<-D]
    /* Opcode-changing utility routine.  Does not change succs. */
    void invert_branch(); 
    void promote_shadow(void); 
    void set_shadow(cfg_node *); 
    label_sym *get_label(); 

When no layout links have been specified, the library treats the cfg as a general graph and inserts shadow gotos as described above. Shadow gotos violate the standard concepts of basic blocks because a block ending in a conditional branch or call instruction followed by a shadow goto should actually be two basic blocks. The library insists that blocks with layout successors be valid standard basic blocks. This makes set_layout_succ() picky about the conditions in which it will allow a link to be set:

Further, it is illegal to change the fall-through successor of a node after its layout successor has been set (the library will allow requests to set the fall-through successor to the same value, though).

Driver programs may break the layout link between a node and its layout successor by calling void cfg_node::clear_layout_succ(void). Clearing the layout successor will cause the library to insert shadow gotos after fall-through, conditional branch, and call blocks and to demote unconditional branches to shadow gotos. All of the layout links in a cfg can be reset using cfg::remove_layout_info(void).

The cfg class provides a few other high-level ``cleanup'' routines. These cleanup routines perform cfg-level optimizations that are more ``smart'' than the usual atomic layout interfaces to the library. Some return boolean results to indicate when they have made any changes. Callers may want to loop through these methods for as long as progress is made.

<cfg cleanup>= (<-U)
    /* clean-up routines */
    boolean remove_unreachable_blocks(boolean delete_instrs=FALSE); 
    boolean merge_block_sequences(boolean break_at_call=FALSE);
    boolean optimize_jumps(); 
    void remove_shadows(); 
    void remove_layout_info(); 

Scheduling

[*]

The CFG library provides fine-grain support by allowing programs to directly modify the lists of instructions in basic blocks. As mentioned above, the cfg_block class supports methods based on tree_node_lists. These include methods that insert and delete instructions at specific points in the list.

<cfg_block scheduling>= (U->)
    /* Routines to add/delete non-CTI instructions to/from a block.
     * Modeled on tree_node_list routines (dlist.h). */
    tnle *push(tnle *new_e);    /* new_e inserted before head */
    tnle *pop();                /* pop first inst in block */
    tnle *append(tnle *new_e);  /* new_e inserted after tail */
    tnle *insert_before(tnle *e, tnle *pos);
    tnle *insert_after(tnle *e, tnle *pos);
    void insert_after(tree_node_list *mtnl, tnle *pos);
    tnle *remove(tnle *rem);    /* remove this from list */
    boolean contains(tnle *test); /* TRUE if test is in this */

Note that the library will not allow driver programs to delete the only instruction in a cfg_block; a label instruction will always be added as a placeholder for the body of the block. Dead blocks can be removed from the cfg and the SUIF tree_node_list by calling cfg::remove_unreachable_blocks(TRUE).

Also note that the insert_before(), insert_after(), and remove() methods verify whether the pos or rem argument actually occurs somewhere in this cfg_block. If not, then they will throw an assertion.

Use the instruction list access functions with care; the library performs no checking on the kinds of instructions inserted. The library will do its best to update branch targets after a block has been transformed by code motions, but it has a number of limitations. The library assumes that there is just one control instruction per block. Use the set_in_cti() method to inform the library if you change the control instruction in a block. The library should be able to correctly set taken conditional branch successors and multiway branch successors even after layout and code scheduling. However, if you try to change the fallthrough successor of a block, the library may insert a shadow instruction at the end of the block, possibly ruining any code motion efforts.

Constructing CFGs

[*]

<cfg constructors and destructors>= (<-U)
    cfg(tree_block *b,
        boolean build_blocks = TRUE,    /* if FALSE, 1 instr per block */
        boolean break_at_call = FALSE, 
        boolean keep_layout = FALSE);
    ~cfg();

The cfg constructor takes four arguments. The first specifies the procedure around which to build the CFG. The remaining three are flags with the following effects:

Users of the library in instruction mode will want to set build_blocks to FALSE. If build_blocks is FALSE, then keep_layout determines whether the constructed cfg starts out in layout-oriented or graph-oriented mode.

While constructing the cfg, some information about the surrounding SUIF environment is constructed and stored. In particular, the cfg stores the tree_proc around which it was built. This information can be accessed using the tproc() method.

<cfg SUIF access>= (<-U)
    tree_proc *tproc()                  { return tp; }

Helper Functions

[*]

This section lists a number of helper functions that don't fall into the traditional class hierarchies. These helper functions live in the util.h header file.

<util.h>=
/*  Control Flow Graph Utilities */

<SUIF Copyright Notice>

#ifndef CFG_UTIL_H
#define CFG_UTIL_H

<SUIF init and exit>
<connector helpers>
<cfg annotations>
<vcg support>
<cfg_node_instr_iter>
<cfg_node_list occurs check>
<new-instruction helpers>
#endif /* CFG_UTIL_H */

Initialization

Like any SUIF library, the CFG library needs to be initialized as part of the call to init_suif() and cleaned up when exit_suif() is called. This is done automatically through the magic of SUIF Makefiles, but the init_cfg() and exit_cfg() functions needed to be declared somewhere.

<SUIF init and exit>= (<-U)
/*  initialization and finalization functions */
void init_cfg(int &argc, char *argv[]);
void exit_cfg();

Annotations to Library Objects

Some annotations connect SUIF instructions back to their parent cfg_blocks in the cfg.

<cfg annotations>= (<-U)
/*  annotations to attach CFGs to SUIF objects */
EXPORTED_BY_CFG char *k_cfg_node;
EXPORTED_BY_CFG char *k_cfg_begin;
EXPORTED_BY_CFG char *k_cfg_end;
EXPORTED_BY_CFG char *k_cfg_test;
EXPORTED_BY_CFG char *k_cfg_toplab;

A number of helper functions use these annotations to look up the corresponding cfg_nodess and cfgs. The cfg_node *label_cfg_node(label_sym *l); method tries to find the cfg_node that l is at the top of. The cfg_node *cfg_begin_node(tree_node *n); returns the cfg_block or cfg_instr that holds the instruction if n is an instruction. If n is a tree_block, then it returns the cfg_begin node of the graph The cfg_node *cfg_end_node(tree_node *n); method works like cfg_begin_node(), except that it returns the cfg_end node if n is a tree_block.

<connector helpers>= (<-U)
/*  functions to find CFG nodes associated with various objects */
cfg_node *label_cfg_node(label_sym *l);
cfg_node *cfg_begin_node(tree_node *n);
cfg_node *cfg_end_node(tree_node *n);
cfg_test *cfg_test_node(tree_for *n);
cfg_label *cfg_toplab(tree_for *n);

Support for VCG (Visualization for Compiler Graphs)

The generate_vcg() function will write a graph description in a format suitable for the VCG tool to read and display graphically. The FILE *f argument should be an open file descriptor to which to write. See the documention for the VCG tool for more information about the VCG file format.

<vcg support>= (<-U)
extern void generate_vcg(FILE *f, cfg *cfg);    /* Dump vcg format file   */

Other Helpers

In the CFG library implementation, it turns out to be very common to check whether a particular cfg_node appears in a cfg_node_list. The helper predicate occurs() exposes this check for others to use.

<other helpers>= [D->]
/* Helper functions */
extern boolean occurs(cfg_node *cn, cfg_node_list *cnl); 

The helper expunge_instr_tree() excises and deletes an instruction node and all the instructions subtended by it from a CFG block, whether the subtended instructions are subtrees or are parented at the block level themselves. The latter case is important for work on low SUIF instruction lists in which expression trees have not been built.

<other helpers>+= [<-D]
extern void expunge_instr_tree(cfg_block *b, tree_instr *ti); 

The following helpers generate new instructions.

<new-instruction helpers>= (<-U)
/* Helper functions added by tjc */
instruction *New_lab(label_sym *l);
instruction *New_ubr(label_sym *l);
instruction *New_null();

Implementation Notes

[*]

This section describes implementation issues and decisions relevant to the CFG library.

One of the most difficult design problems involved supporting explicit, enumerated branch targets. In most modern architectures, control instructions have just one explicit target, with the fall-through successor serving as a second, implicit target. In a general graph, it is not always possible to ensure that the fall-through successor of a basic block ends up immediately after that block. We feel that having the library manage shadow gotos but enforcing constraints on shadow gotos in layout strikes the right balance between the graph abstraction desired by some CFG transformations and the fine-grain control required for global code motions. Using shadow gotos significantly simplified the implementation of set_succ().

The interesting code layout work happens inside set_layout_succ(). A layout request actually involves linking together two clumps of cfg_nodes, where a clump is a chain of already-linked blocks. The two blocks supplied to set_layout_succ() should be the last block of the top clump and the first block of the bottom clump. After checking that the blocks belong to the same cfg, set_layout_succ() examines their clumps. If the two clumps are the same, then the layout request would create a layout cycle, so the library generates an assertion failure. Next, set_layout_succ() checks to make sure that the bottom block appears in the successor list of the top block. If so, then we can remove any shadow gotos and set the polarity of the top block so that it falls through to the bottom block. If the layout successor is not a control successor, then an explicit goto will be required at the end of the block. Finally, set_layout_succ() moves the instructions in the SUIF procedure's tree_node_list so that the top and bottom clump are adjacent.

Machine SUIF support for multi-way branches requires assistance from each code generator: a number of special annotations must be attached to the mbr dispatch block. The current implementation assumes a dispatch table approach to multiway branches; other methods are not currently supported.

As much as possible, we have tried to keep the interface and implementation of the library simple. For example, branch inversion and removal of extra unconditional jumps happen only when set_layout_succ() is called, not when other CFG transformations are applied. The library might have tried to handle insertion of control instructions into cfg_blocks, splitting nodes and changing edges behind the scenes. In both these cases and many others, we felt it was better to err on the side of simplicity and precision than to provide an overly smart tool that could confuse users by doing too many things ``magically'' in the background.

For completeness, we list the private and protected class members here.

<cfg private members>= (<-U)
  private:
    tree_proc *tp;                      /* proc of this cfg */

    cfg_node_array *nds;                /* xarray of nodes */
    cfg_node *en;                       /* entry node */
    cfg_node *ex;                       /* exit node */

    bit_set *doms, *pdoms;              /* dominators and postdominators */
    cfg_node **idom, **ipdom;           /* immediate dominators and postdoms */
    bit_set *df, *rdf;                  /* dominance frontier and reverse df */
    int *lp_depth;                      /* loop depth */

<cfg protected members>= (<-U)
  protected:
    /* internal methods for building the graph */
    void add_node(cfg_node *n);
    cfg_node *graph_add_list(tree_node_list *list, cfg_node *prev,
        boolean block, boolean break_at_call, boolean keep_layout); 
    cfg_node *graph_add_tree(tree_node *t, cfg_node *prev, 
        boolean block, boolean break_at_call, boolean keep_layout); 

    /* helper routine to clone nodes or create new empty nodes */
    void attach(cfg_node *);

    /* dfa functions used internally */
    bit_set *dominators(boolean forward);
    cfg_node **immed_dominators(boolean forward);
    void dom_frontiers(cfg_node *x, boolean forward);

    /* helper routines for clean-up */
    void cfg_cleanup(cfg_node *nd);

<cfg_node private and protected members>= (<-U)
  private:
    unsigned nnum;                      /* node number */
    cfg *par;                           /* parent flow graph */
    cfg_node_list prs;                  /* cfg predecessors */
    cfg_node_list scs;                  /* cfg successors */

  protected:
    cfg_node *lpred;                    /* layout predecessor */
    cfg_node *lsucc;                    /* layout successor */

    void set_number(unsigned n)         { nnum = n; }
    void set_parent(cfg *p)             { par = p; }

    void add_succ(cfg_node *n);
    void remove_pred(cfg_node *n);
    void remove_succ(cfg_node *n);

    void print_base(FILE *fp);

<cfg_block private and protected members>= (U->)
  private:
    tree_node_list *tnl; 
    tree_instr *ti1;
    tree_instr *ti2;
    tree_instr *tiC;            /* CTI instruction, if any */
    tree_instr *tiS;            /* shadow goto, if needed */

  protected:
    void extend(tree_instr *t)          { ti2 = t; }
    void remove_explicit_targets(); 
    cfg_block *tnl_pred(void); 
    cfg_block *tnl_succ(void); 

Future Work

[*]

Some of the implementation of the scheduling support is still flaky. In particular, if you change the kind of terminating control instruction of a block using set_in_cti(), the library is not smart enough to fix up the successor and predecessor lists. For example, a constant propagator pass might eliminate a conditional branch instruction; the library doesn't handle this very well.

We'd like to make cfg_instr nodes work similarly to cfg_block nodes for layout and graph transformation purposes. But we haven't thought about it enough yet. The solution may involve removing cfg_instr objects entirely and forcing one instruction per cfg_block, or it may involve allowing users to create cfg_instr and cfg_label nodes as wrappers around the underlying SUIF instructions.

We plan to extend the implementation to include high SUIF constructs like FOR, WHILE, and IF...THEN constructs. This isn't implemented yet, and we haven't thought much about how to represent the hierachical structure of these constructs.

Summary and Availability

[*]

We have developed a CFG library that supports not only analysis of programs, but also CFG-level transformations, code layout, and fine-grained code motion in the SUIF environment. At Harvard, we have used this machine library to build compiler passes including dead code elimination, loop peeling and unrolling, static correlated branch prediction, code layout, and a number of instruction schedulers.

The SUIF CFG library is available as part of the SUIF machine library distribution. A version of the machine library is available by anonymous ftp from ftp.eecs.harvard.edu in pub/hube (the same code is available from our web site, http://www.eecs.harvard.edu/~hube/). SUIF is available from http://suif.stanford.edu/. Questions, comments, and bug reports for this package should be e-mailed to machsuif-bugs@eecs.harvard.edu.

Acknowledgments

Archeological evidence suggests that the original CFG library, including dominator analysis, was written by Michael D. Smith, my advisor, while he was working on his thesis. A version of this library was modified by Bob Wilson at Stanford. Tony DeWitt brought the library to Harvard, adapting the data structure constructors to the new machine SUIF library. Gang Chen wrote the loop analysis code, and Mike built the unified data flow analysis routines. Other members of the HUBE research group at Harvard contributed useful suggestions to the design. Tim Callahan of Synopsis, Inc. and Berkeley debugged the library's support for low SUIF and fixed several other bugs.

During the period I worked on the CFG library, I have been supported by an NDSEG fellowship sponsored by the Office of Naval Research and by an IBM Cooperative Fellowship.

We also gratefully acknowledge the generous support of this research by Advanced Micro Devices, Digital Equipment, Hewlett-Packard, Intel, International Business Machines, and Microsoft.

References

[1] Brad Calder and Dirk Grunwald. ``Branch Alignment for Better Branch Prediction''. Proc. 6th Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, pp. 242-252, October 1994.

[2] K. Pettis and R. C. Hansen. ``Profile Guided Code Positioning", Proc. SIGPLAN '90 Conf. on Prog. Lang. Design and Implementation, June 1990.

[3] Michael D. Smith. The SUIF Machine Library. The machine SUIF compiler documentation set, Harvard University, 1994.

[4] Stanford Compiler Group. The SUIF Library. The SUIF compiler documentation set, Stanford University, 1994.

[5] Cliff Young and Michael D. Smith. ``Improving the Accuracy of Static Branch Prediction Using Branch Correlation''. Proc. 6th Intl. Conf. on Architectural Support for Programming Languages and Operating Systems, pp. 232-241, October 1994.