Eliminating Voltage Emergencies via Software-Guided Code Transformations

Vijay Janapa Reddi, Simone Campanoni, Meeta S. Gupta, Kim Hazelwood, Michael D. Smith, Gu-Yeon Wei, and David Brooks

ACM Transactions on Architecture and Code Optimization (TACO), 2010

In recent years, circuit reliability in modern high-performance processors has become increasingly important. Shrinking feature sizes and diminishing supply voltages have made circuits more sensitive to fluctuations in the microprocessor supply voltage. These fluctuations result from the natural variation of processor activity as workloads execute, but when left unattended they can lead to timing violations or even transistor lifetime issues. In this paper, we present a hardware-software collaborative approach to mitigating voltage fluctuations. A checkpoint-recovery mechanism rectifies errors when the voltage violates maximum tolerance settings, while a run-time software layer reschedules the program's instruction stream to prevent recurring violations at the same program location. The run-time layer, combined with the proposed code rescheduling algorithm, removes 60% of all violations with minimal overhead, thereby significantly improving overall performance. Our solution is a radical departure from the current industry-standard approach, which circumvents the issue altogether by optimizing for worst-case voltage fluctuations. That approach severely compromises power and performance efficiency, especially looking ahead to future technology generations, and such conservative designs will have severe implications for the ability to deliver efficient microprocessors. The proposed technique recasts a traditional reliability problem as a run-time performance optimization problem, thus allowing us to design processors for typical-case operation by building intelligent algorithms that can prevent recurring violations.

 author = {Vijay Janapa Reddi and
           Simone Campanoni and
           Meeta S. Gupta and
           Michael D. Smith and
           Gu-Yeon Wei and
           David Brooks and
           Kim Hazelwood},
 title = {Eliminating Voltage Emergencies via Software-guided Code Transformations},
 journal = {ACM Trans. Archit. Code Optim.},
 issue_date = {September 2010},
 volume = {7},
 number = {2},
 month = oct,
 year = {2010},
 issn = {1544-3566},
 pages = {12:1--12:28},
 articleno = {12},
 numpages = {28},
 url = {http://doi.acm.org/10.1145/1839667.1839674},
 doi = {10.1145/1839667.1839674},
 acmid = {1839674},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Voltage noise, dI/dt, inductive noise, voltage emergencies},