Contact Information

Zhigang Hu

zhigangh@us.ibm.com

Tel: 914-945-2395

David Brooks

dbrooks@eecs.harvard.edu

Tel: 617-495-3989

Pradip Bose

pbose@us.ibm.com

Tel: 914-945-3478

Proposed Length of Tutorial: ½ day (4 hours).

 


 

Presenters:
Zhigang Hu, IBM T. J. Watson Research Center
David Brooks, Harvard University
Pradip Bose, IBM T. J. Watson Research Center

Abstract/Description:

In the early stages of processor design, as in general microarchitecture research, the use of trace- or execution-driven simulators is well established.  These simulators are taken (largely on faith) to be “cycle-accurate” and until recently were focused exclusively on architectural performance, measured in cycles-per-instruction (CPI) or its inverse, instructions-per-cycle (IPC).  In the academic world, the most widely used such superscalar processor simulator is SimpleScalar [1]; another example is RSIM [2].  SMT-SIM [3] is an example of a well-known simulator that has built-in support for simultaneous multithreading.  Recently, power extensions to such academic simulators, like Wattch [4], have become available.  SimplePower [5] is another example of an academic power-performance simulator.  A recently available temperature model extension, called HotSpot [6], that can be used in conjunction with simulators like SimpleScalar/Wattch is also gaining popularity.  Many industrial R&D groups have their own performance and power-performance simulator infrastructures, but these are usually proprietary and not available for use in academic research.
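
As a rough illustration of what an architecture-level power extension computes, the sketch below combines per-unit busy-cycle counts from a performance simulation with per-unit power figures.  The unit names, the wattage values, and the linear idle-to-active scaling are assumptions made for this example only; they are not taken from Wattch, SimplePower, or PowerTimer.

```python
from dataclasses import dataclass


@dataclass
class UnitPower:
    idle_watts: float    # hypothetical power when the unit does no useful work
    active_watts: float  # hypothetical power when the unit is busy every cycle


def cpi(cycles, instructions):
    """Cycles per instruction; IPC is simply the inverse."""
    return cycles / instructions


def average_power(units, busy_cycles, total_cycles):
    """Sum per-unit power, scaled linearly between idle and active by utilization."""
    total = 0.0
    for name, unit in units.items():
        utilization = busy_cycles.get(name, 0) / total_cycles
        total += unit.idle_watts + (unit.active_watts - unit.idle_watts) * utilization
    return total


# Made-up numbers standing in for the output of a performance simulation run.
units = {
    "fetch": UnitPower(idle_watts=1.0, active_watts=5.0),
    "fxu": UnitPower(idle_watts=0.5, active_watts=4.5),
    "l1d": UnitPower(idle_watts=1.0, active_watts=3.0),
}
busy = {"fetch": 800, "fxu": 500, "l1d": 250}
cycles, instructions = 1000, 1250

run_cpi = cpi(cycles, instructions)
run_power = average_power(units, busy, cycles)
print(f"CPI = {run_cpi:.2f}, IPC = {1.0 / run_cpi:.2f}, average power = {run_power:.2f} W")
```

The point of the sketch is only that the power estimate is driven by the same event counts the performance simulator already produces, so any error in those counts carries over into the power numbers.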

One issue that repeatedly comes up in research published in the leading architecture conferences is that of model validation and accuracy (e.g. [7, 8]).  A related issue is the choice of (micro)benchmarks used to present analysis of ideas with such simulators (e.g. [7, 8, 9]).  The tradeoff between accuracy and speed of simulation is intuitively understood, but it is not adequately quantified for the abstractions that the researcher or industrial designer makes, either in modeling or in the choice of input workloads and their driving data input sets.

In this tutorial, we propose to cover the issues related to modeling, analysis and accuracy of such early-stage (pre-silicon) power-performance simulators.  We will provide a review of prior art, while focusing primarily on an industrial-strength, PowerPC-based power-performance simulation infrastructure, called PowerTimer [10, 11], that is currently in its validation phase, prior to release for use by university research groups.  The base performance simulator, called Turandot/MET [12, 13], was previously validated against a pre-RTL reference model for the POWER4 processor [14] and is already available on request to academic research groups.  This simulator has recently been upgraded to support simultaneous multithreading (SMT).  The detailed topics covered in this tutorial are:

  1. Basic concepts and methods in trace- or execution-driven simulation,
    with examples from currently available simulators (like SimpleScalar and
    MET/Turandot).

  2. Basic concepts and methods in power and temperature modeling, with
    examples from Wattch, PowerTimer and HotSpot.

  3. Analysis of power, temperature and inductive noise (L di/dt) using
    simulators like Wattch/HotSpot and PowerTimer.  Examples will be covered
    of how such simulators can be used before silicon is available to
    influence concrete processor design choices in an industrial setting.

  4. Calibration and validation of power-performance simulators, with
    analysis of relative versus absolute accuracy; the specific new
    methodology used in the PowerTimer toolset will be covered in detail
    after reviewing prior practices in this area.  The issue of
    (micro)benchmarking and workload validation will also be touched upon.
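
The distinction between absolute and relative accuracy in item 4 can be illustrated with a small sketch.  Here absolute error is taken to be the per-design-point percentage difference between the simulator and a reference, and relative accuracy is taken to be whether the simulator predicts the same direction of change between consecutive design points.  These definitions and the sample CPI values are assumptions for illustration; they are not the PowerTimer validation methodology.

```python
def absolute_errors(sim, ref):
    """Percentage error of the simulator at each design point."""
    return [100.0 * abs(s - r) / r for s, r in zip(sim, ref)]


def direction(delta):
    """Return -1, 0 or +1 depending on the sign of delta."""
    return (delta > 0) - (delta < 0)


def trend_agreement(sim, ref):
    """Fraction of consecutive design-point steps where sim and ref move the same way."""
    steps = len(ref) - 1
    agree = sum(
        direction(sim[i + 1] - sim[i]) == direction(ref[i + 1] - ref[i])
        for i in range(steps)
    )
    return agree / steps


# Made-up CPI values for four hypothetical design points.
ref_cpi = [1.00, 0.90, 0.85, 0.88]  # reference model
sim_cpi = [1.20, 1.08, 1.02, 1.06]  # early-stage simulator

errors = absolute_errors(sim_cpi, ref_cpi)
agreement = trend_agreement(sim_cpi, ref_cpi)
print("absolute error (%):", [round(e, 1) for e in errors])
print("trend agreement:", agreement)
```

In this made-up data the simulator is off by roughly 20% at every design point, yet it ranks the design points the same way the reference does.  A model like this is of limited use for predicting absolute numbers but can still guide design choices.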

Summary Reference List:

  1. D. Burger and T. M. Austin, “The SimpleScalar Toolset, Ver. 2.0,”
    Computer Architecture News, Vol. 25, No. 3, June 1997, pp. 13-25;
    http://www.simplescalar.org

  2. RSIM: http://rsim.cs.uiuc.edu/rsim/

  3. D. M. Tullsen, “Simulation and Modeling of a Simultaneous
    Multithreading Processor,” Proc. 22nd Annual Computer Measurement Group Conference, December 1996. http://www.cs.ucsd.edu/users/tullsen/smtsim.html

  4. D. Brooks, V. Tiwari, M. Martonosi, “Wattch: a framework for
    architecture-level power analysis and optimizations,” Proc. 27th Ann.
    Int’l. Symp. on Computer Architecture (ISCA), 2000, pp. 83-94.

  5. N. Vijaykrishnan et al., “Energy-driven integrated hardware-software
    optimizations using SimplePower,” Proc. 27th Ann. Int’l. Symp. on Computer Architecture (ISCA), 2000, pp. 95-106.

  6. K. Skadron et al., “Temperature-aware microarchitecture,” Proc. 30th
    Ann. Int’l. Symp. on Computer Architecture (ISCA), 2003, pp. 2-13. http://lava.cs.virginia.edu/HotSpot/

  7. R. Desikan, D. Burger and S. Keckler, “Measuring experimental error
    in microprocessor simulation,” Proc. 28th Ann. Int’l. Symp. on Computer
    Arch. (ISCA), June/July 2001, pp. 266-277.

  8. P. Bose, T. M. Conte and T. M. Austin, eds., Special issue on
    “Identifying design bugs: processor modeling and validation,” IEEE Micro, vol. 19, no. 3, May/June 1999.

  9. D. Citron, “MisSPECulation: partial and misleading use of SPEC
    CPU2000 in Computer Architecture Conferences,” invited panel position
    paper, Proc. 30th Ann. Int’l. Symp. on Computer Architecture (ISCA), 2003, pp. 52-59.

  10. D. Brooks, J-D. Wellman, P. Bose and M. Martonosi, “Power-Performance
    Modeling and Tradeoff Analysis for a High End Microprocessor,” Workshop on Power-Aware Computer Systems (PACS-2000), held in conjunction with ASPLOS-IX, Nov. 2000.

  11. D. Brooks, P. Bose, V. Srinivasan, M. Gschwind, P. Emma, M.
    Rosenfield, “New methodology for early-stage, microarchitecture-level
    power-performance analysis of microprocessors,” to appear in IBM Journ. of Research and Development, Nov/Dec 2003.

  12. M. Moudgill, J-D Wellman, and J. H. Moreno, “Environment for PowerPC
    microarchitecture exploration,” IEEE Micro, vol. 19, no. 3, May/June 1999,
    pp. 15-25.

  13. M. Moudgill, P. Bose and J. Moreno, “Validation of Turandot, a fast
    processor model for microarchitecture exploration,” Proc. IEEE Int’l.
    Performance, Computing and Communication Conf., 1999, pp. 451-457.

  14. J. M. Tendler, J. S. Dodson, J. S. Fields, Jr., H. Le, B. Sinharoy,
    “POWER4 system microarchitecture,” IBM Journ. of Research and Development, vol. 46, no. 1, 2002.
    http://www.research.ibm.com/journal/rdpip.html

Presenter Biography:

Zhigang Hu is a Research Staff Member at IBM T. J. Watson Research Center.  He is the lead developer and contact for the most current version of IBM’s MET/Turandot PowerPC simulator.  Dr. Hu received his B.S. (1995) degree from the University of Science and Technology of China (USTC), his M.A. (1998) degree from the Chinese Academy of Sciences (CAS), and his Ph.D. (2002) degree in Electrical Engineering from Princeton University.  While at Princeton he was a member of Prof. Margaret Martonosi’s power-aware computer architecture group, working on a new time-based design methodology and its application to power reduction in microprocessors, as well as performance enhancement through cache prefetching.  At IBM, he continues to work in the same field, while maintaining collaborative efforts with Prof. David Brooks’ group at Harvard and Prof. Martonosi’s group at Princeton.  Email contact: zhigangh@us.ibm.com

David Brooks is an Assistant Professor at Harvard University.  Dr. Brooks received his B.S. (1997) degree from the University of Southern California and his M.A. (1999) and Ph.D. (2001) degrees from Princeton University, all in Electrical Engineering.  Prior to joining Harvard University as an Assistant Professor of Computer Science, Dr. Brooks was a Research Staff Member at IBM T. J. Watson Research Center.  His research interests include architectural-level power modeling and power-efficient design of hardware and software for embedded and high-performance computer systems.  He is the original developer of the Wattch power models currently in use with SimpleScalar; these were developed as part of his Ph.D. research supervised by Prof. Margaret Martonosi at Princeton University.  Dr. Brooks has been involved in prior tutorials given at ISCA, HPCA and Sigmetrics.  Personal web page: http://www.eecs.harvard.edu/~dbrooks

Pradip Bose is a Research Staff Member at IBM T. J. Watson Research Center, where he currently leads a project on power-aware microarchitectures.  Dr. Bose received his B.Tech. (Hons.) degree in Electronics and Electrical Communication Engineering from the Indian Institute of Technology (I.I.T.) Kharagpur in 1977, and his M.S. and Ph.D. degrees in Electrical and Computer Engineering from the University of Illinois, Urbana-Champaign in 1981 and 1983 respectively.  His research interests include computer architecture, power-performance modeling and validation.  He is actively involved in numerous conference committees and has started several new workshops and conferences in the field of computer architecture and performance evaluation, including ISPASS (http://ispass.org), which started as an ISCA workshop series on performance analysis and its impact on design (PAID), and the currently offered ISCA workshop series on complexity effective design (WCED).  Dr. Bose has been involved in many prior tutorial and workshop offerings at all the major architecture and performance conferences (like ISCA, MICRO, HPCA and Sigmetrics) and is currently the editor-in-chief of IEEE Micro magazine.  He is a senior member of IEEE.  Personal web page: http://www.research.ibm.com/people/b/bose

Note: There are other contributors to the power modeling methodology whose work influences the PowerTimer project at IBM, including: Alper Buyuktosunoglu, Viji Srinivasan, Scott Neely, Hans Jacobson and several other summer interns.