We close the results section by comparing proxy-caching to
push-caching. As we saw earlier, one proxy
vendor claimed that large proxy caches can have hit rates of up to
65%. We wanted to evaluate their performance, expecting proxy-caches
to perform even more efficiently than the simple client-caching
described above. Proxy-caches are also a rudimentary form of
hierarchical caching, since multiple computers share one cache. By
comparing proxy-caching to push-caching, we are therefore also
comparing simple hierarchical caching to push-caching.
As we discussed earlier, our simulator
currently maps hosts from the trace data onto 1700 simulated hosts. In
the case of the NCSA trace data, for example, this involves mapping
50,000 hosts onto 1700. This means that on average about 29 hosts
(50,000 / 1700) are mapped to each simulated host. If we allow these
mapped hosts to satisfy their requests from documents cached in the
simulated host, then we are simulating the effects of a Web proxy.
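The idea of treating each simulated host as a proxy can be sketched in a few lines. The following is a minimal illustration, not the simulator's actual code: the round-robin assignment of trace hosts to simulated hosts, the unbounded cache with no expiry, and the name `simulate_proxy` are all assumptions made for the example.

```python
from collections import defaultdict

def simulate_proxy(requests, num_sim_hosts=1700):
    """Replay (client_host, document_id, size_bytes) requests through
    shared per-simulated-host caches.

    Assumptions (not taken from the paper): trace hosts are assigned to
    simulated hosts round-robin in order of first appearance, and each
    cache is unbounded and never expires documents.
    """
    host_to_sim = {}            # trace host -> simulated host index
    caches = defaultdict(set)   # simulated host index -> cached document ids
    stats = {"hits": 0, "misses": 0, "bytes_saved": 0, "bytes_total": 0}

    for host, doc, size in requests:
        if host not in host_to_sim:
            host_to_sim[host] = len(host_to_sim) % num_sim_hosts
        sim = host_to_sim[host]

        stats["bytes_total"] += size
        if doc in caches[sim]:
            # Some host mapped to the same simulated host already fetched it.
            stats["hits"] += 1
            stats["bytes_saved"] += size
        else:
            # First request through this proxy: fetch from the server, then cache.
            stats["misses"] += 1
            caches[sim].add(doc)
    return stats
```

Because every trace host mapped to the same simulated host shares one cache, a document fetched by one of them is a hit for all the others, which is the proxy effect described above.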
This is not a completely fair comparison; our simulation is actually biased in favor of proxy-caching. Our trace data is server-centric: it was taken from a single server. Assuming that all the hosts connected to a given proxy-cache are accessing data from the same server favors the proxy-cache, so the results from this section should not be considered realistic, but rather optimal for proxy-caching. They are still useful as a benchmark against which to consider push-caching, because in the real world push-caching can only do better relative to proxy-caching. In the real world, clients would be asking proxy-caches for documents from many servers, not just a single server, and not all servers will be as globally popular as the NCSA server.
Figure: Proxy-caching vs. Push-caching. The NCSA trace was used in this
simulation. Notice that proxy-caching saves far more
bandwidth than push-caching alone, and that adding push-caching to
proxy-caching does not significantly increase performance.
The figure above displays a simulation of the NCSA
server trace. As expected, optimal proxy-caching is extremely
efficient in terms of bandwidth, saving 60% of the original
bandwidth consumption. Miles-based push-caching saves only 14%, and
the optimal push-caching case, hops-based push-caching, saves only 26% of
the bandwidth. Combining push-caching with proxy-caching adds at
most a further 2% bandwidth reduction over proxy-caching alone.
The table below presents more detail comparing the
two traces. The efficiency column is the ratio of bandwidth savings
to cache space required, the metric discussed earlier
and used above. Braun and Claffy
simulated geographical caching and found an efficiency of at most 7
in their simulations. We found proxy-caching to be more efficient,
with an efficiency of 21. Push-caching, however, is extremely
efficient, with an order of magnitude increase over proxy-caching for
similar server load savings. These efficiency calculations do not take
into account the primary host, which stores 200 times more objects than
the average cache.
These results are easily explained: proxy-caching achieves its bandwidth savings by caching a copy of every document requested through the proxy, regardless of whether the file is popular. Push-caching is far more selective, replicating only those files that are known to be popular. Since these files make up the majority of the requests to the primary host, push-caching is able to distribute the server's load far more efficiently than proxy-caching.
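A toy calculation can make this explanation concrete. The sketch below is illustrative only: the functions `efficiency` and `compare_policies`, the byte-based units, and the top-k selection rule are assumptions made for the example, and it does not reproduce the simulator or the figures reported in the table.

```python
from collections import Counter

def efficiency(bytes_saved, cache_bytes):
    """Ratio of bandwidth saved to cache space used (units are an assumption)."""
    return bytes_saved / cache_bytes if cache_bytes else 0.0

def compare_policies(requests, sizes, top_k):
    """Compare a cache-everything policy with a selective one on one cache.

    requests: list of document ids, one per request
    sizes:    dict mapping document id -> size in bytes
    top_k:    number of most-requested documents the selective policy keeps
              (a hypothetical stand-in for "known to be popular")
    """
    counts = Counter(requests)

    # Cache everything: every distinct document occupies space, and every
    # request after the first for a document is served from the cache.
    all_space = sum(sizes[d] for d in counts)
    all_saved = sum((n - 1) * sizes[d] for d, n in counts.items())

    # Selective: only the top_k most-requested documents occupy space.
    popular = [d for d, _ in counts.most_common(top_k)]
    sel_space = sum(sizes[d] for d in popular)
    sel_saved = sum((counts[d] - 1) * sizes[d] for d in popular)

    return {
        "cache_all": {"saved": all_saved, "space": all_space,
                      "efficiency": efficiency(all_saved, all_space)},
        "selective": {"saved": sel_saved, "space": sel_space,
                      "efficiency": efficiency(sel_saved, sel_space)},
    }
```

When a few documents account for most requests, the selective policy keeps nearly all of the savings on those documents while using a fraction of the space, so its efficiency ratio is higher. The cache-everything policy can never save less in total, but it spends space on documents that are rarely or never requested again.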
Table: Detailed examination of proxy-caching versus push-caching for the
NCSA server trace. Push-caching is clearly far more efficient than
client-based caching for similar savings in primary host load.
Proxy-caching, however, saves significantly more network bandwidth
than push-caching in return for an order of magnitude decrease in
efficiency.