|
We investigated how operating system design
should be adapted for multithreaded chip
multiprocessors (CMT), a new generation of
processors that exploit thread-level parallelism to mask
the memory latency in modern workloads. We
determined that the L2 cache is a critical shared
resource on CMT and that an insufficient amount of L2
cache can undermine the ability to hide memory latency
on these processors. To use the L2 cache as efficiently
as possible, we propose an L2-conscious scheduling
algorithm and quantify its performance potential.
Using this algorithm, it is possible to reduce L2 cache
miss ratios by 25-37% and improve processor
throughput by 27-45%. |