This paper presents the design, implementation, and performance of
the Harvard Array of Clustered Computers (HACC), a cluster-based
design for scalable, cost-effective web servers. HACC is designed
for locality enhancement: requests that arrive at the cluster are
distributed among the nodes so as to enhance the locality of reference
on each individual node. By improving locality
on individual cluster nodes, we can reduce their working set sizes
and achieve superior performance at lower cost than conventional
approaches. We implemented HACC on Windows NT 4.0 and evaluated its
performance on workloads of both static documents and dynamically
generated documents adapted from logs of commercial web servers.
Our performance results show that, compared to an IP-Sprayer approach
to building cluster-based web servers, HACC's locality enhancement
improves performance by up to 121% for our stochastically generated
static file case, by up to 40% for our trace-based static file case,
and by up to 52% for our trace-based dynamic document case.
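
To illustrate why locality-aware distribution can shrink per-node working
sets, the following minimal sketch compares two routing policies on a toy
request stream. It is an assumption-laden illustration, not HACC's actual
mechanism: the hash-based partitioning of URLs, the round-robin model of an
IP sprayer, and all names (locality_route, make_sprayer, working_sets) are
hypothetical and chosen only for this example.

```python
import zlib
from itertools import cycle


def locality_route(url: str, num_nodes: int) -> int:
    """Hypothetical locality-aware policy: the same URL always maps to
    the same node (simple hash partitioning, assumed for illustration)."""
    return zlib.crc32(url.encode("utf-8")) % num_nodes


def make_sprayer(num_nodes: int):
    """Assumed model of a content-blind sprayer: round-robin over nodes."""
    counter = cycle(range(num_nodes))
    return lambda url: next(counter)


def working_sets(requests, num_nodes, route):
    """Return the number of distinct documents each node ends up serving."""
    seen = [set() for _ in range(num_nodes)]
    for url in requests:
        seen[route(url)].add(url)
    return [len(s) for s in seen]


NUM_NODES = 4
# Toy workload: 64 requests cycling over 7 distinct documents.
REQUESTS = [f"/doc{i % 7}.html" for i in range(64)]

sprayed = working_sets(REQUESTS, NUM_NODES, make_sprayer(NUM_NODES))
partitioned = working_sets(
    REQUESTS, NUM_NODES, lambda url: locality_route(url, NUM_NODES)
)

print("round-robin working sets:   ", sprayed)
print("locality-aware working sets:", partitioned)
```

Under the round-robin model every node eventually serves every document, so
each node's working set equals the full document set. Under the hash
partitioning each document is served by exactly one node, so the per-node
working sets sum to the number of distinct documents. A static hash ignores
load balance, which a real system would also have to address.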