Performance of Multithreaded Chip Multiprocessors And Implications For Operating System Design

Alexandra Fedorova, Margo Seltzer (Harvard University)
Christopher Small, Daniel Nussbaum (Sun Microsystems)


We investigated how operating system design should be adapted for multithreaded chip multiprocessors (CMT), a new generation of processors that exploit thread-level parallelism to mask memory latency in modern workloads. We determined that the L2 cache is a critical shared resource on CMT processors and that an insufficient amount of L2 cache can undermine their ability to hide memory latency. To use the L2 cache as efficiently as possible, we propose an L2-conscious scheduling algorithm and quantify its performance potential. Using this algorithm, it is possible to reduce L2 cache miss ratios by 25-37% and improve processor throughput by 27-45%.