We close the results section by comparing proxy-caching to push-caching. As noted in section , one proxy vendor claimed that large proxy caches can achieve hit rates of up to 65%. We wanted to evaluate this claim, expecting proxy-caches to perform even more efficiently than the simple client-caching described above. Proxy-caches are also a rudimentary form of hierarchical caching, since multiple computers share one cache; by comparing proxy-caching to push-caching we are therefore also comparing simple hierarchical caching to push-caching.
As discussed in section , our simulator currently maps hosts from the trace data onto 1700 simulated hosts. For the NCSA trace data, this involves mapping 50,000 hosts onto 1700, so on average 29 hosts are mapped to each simulated host. If we allow these mapped hosts to satisfy their requests from documents cached at the simulated host, then we are simulating the effects of a Web proxy.
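The mapping step above can be sketched in a few lines. This is a minimal illustration, not the paper's simulator: the 1700-host count and 50,000-host trace size come from the text, while the hash-based assignment scheme and hostnames are assumptions made for the example.

```python
import hashlib

# From the text: the simulator collapses the trace hosts onto 1700 simulated hosts.
NUM_SIMULATED_HOSTS = 1700

def simulated_host(trace_host: str) -> int:
    """Deterministically map a trace hostname to one of the simulated hosts.
    The hash-based scheme here is an illustrative assumption."""
    digest = hashlib.md5(trace_host.encode()).hexdigest()
    return int(digest, 16) % NUM_SIMULATED_HOSTS

# All trace hosts that map to the same simulated host share that host's cache,
# which is what makes each simulated host behave like a shared Web proxy.
trace_hosts = [f"host{i}.example.edu" for i in range(50_000)]  # stand-in for the NCSA trace
assignments = [simulated_host(h) for h in trace_hosts]
avg = len(trace_hosts) / NUM_SIMULATED_HOSTS
print(f"average hosts per simulated host: {avg:.0f}")  # ~29, as in the text
```

Because the mapping is deterministic, every request from a given trace host lands in the same shared cache throughout the simulation.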
This is not a completely fair comparison; our simulation is actually biased in favor of proxy-caching. Our trace data is server-centric: it was taken from a single server. Assuming that all the hosts connected to a given proxy-cache access data from the same server favors the proxy cache, so the results in this section should be considered not realistic but optimal for proxy-caching. They are still useful as a benchmark against which to consider push-caching, because in the real world push-caching will only fare better relative to proxy-caching: clients would ask proxy-caches for documents from many servers, not just a single server, and not all servers are as globally popular as the NCSA server.
Figure: Proxy-caching vs. Push-caching. The NCSA trace was used in this simulation. Notice that proxy-caching saves far more bandwidth than push-caching alone, and that adding push-caching to proxy-caching does not significantly increase performance.
Figure displays a simulation of the NCSA server trace. As expected, optimal proxy-caching is extremely efficient in terms of bandwidth, saving 60% of the original bandwidth consumption. Miles-based push-caching saves only 14%, and the best push-caching case, hops-based push-caching, saves only 26% of the bandwidth. Combining push-caching with proxy-caching adds at most an additional 2% bandwidth reduction over proxy-caching alone.
Table presents a more detailed comparison of the two traces. The efficiency column is the ratio of bandwidth savings to cache space required, the metric discussed in section and used above. Braun and Claffy simulated geographical caching and found an efficiency of at most 7. We found proxy-caching to be more efficient, with an efficiency of 21. Push-caching, however, is extremely efficient, an order of magnitude better than proxy-caching for similar savings in server load. These efficiency calculations do not take into account the primary host, which stores 200 times more objects than the average cache.
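The efficiency metric is just a ratio, and a short sketch makes the comparison concrete. The byte counts below are illustrative assumptions, not numbers from the trace; only the resulting efficiency values (7 for geographical caching, 21 for proxy-caching) come from the text.

```python
def efficiency(bytes_saved: float, cache_bytes: float) -> float:
    """Efficiency metric from the text: bandwidth saved per unit of cache space."""
    return bytes_saved / cache_bytes

# Illustrative: a cache that saves 21 GB of bandwidth using 1 GB of cache
# space has efficiency 21, the proxy-caching figure reported above.
proxy_eff = efficiency(21e9, 1e9)

# Braun and Claffy's geographical caching result corresponds to a ratio of 7.
geo_eff = efficiency(7e9, 1e9)

print(proxy_eff, geo_eff)  # 21.0 7.0
```

Because the metric divides by cache space, a scheme that caches selectively can score an order of magnitude higher even while saving less total bandwidth.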
These results are easily explained: proxy-caching achieves its bandwidth savings by caching a copy of every document requested through the proxy, regardless of whether the document is popular. Push-caching is far more selective, replicating only those files known to be popular. Since these files account for the majority of the requests to the primary host, push-caching distributes the server's load far more efficiently than proxy-caching.
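The contrast between the two policies can be sketched directly: a proxy keeps one copy of every distinct document it sees, while push-caching replicates only documents above some popularity cutoff. The request log, document names, and threshold below are illustrative assumptions, not trace data.

```python
from collections import Counter

# Hypothetical request log: two popular documents plus a long tail of one-off pages.
requests = ["index.html"] * 50 + ["logo.gif"] * 30 + [f"page{i}.html" for i in range(20)]

# Proxy policy: cache every distinct document requested, popular or not.
proxy_cache = set(requests)

# Push policy: replicate only documents requested at least this many times.
POPULARITY_THRESHOLD = 10
counts = Counter(requests)
push_cache = {doc for doc, n in counts.items() if n >= POPULARITY_THRESHOLD}

# Fraction of all requests the small push cache can still serve.
hits_served = sum(n for doc, n in counts.items() if doc in push_cache)
print(len(proxy_cache), len(push_cache), hits_served / len(requests))
```

In this toy log the push cache holds 2 documents versus the proxy's 22, yet still serves 80% of the requests, which is the intuition behind push-caching's order-of-magnitude efficiency advantage.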
Table: Detailed examination of proxy-caching versus push-caching for the NCSA server trace. Push-caching is clearly far more efficient than proxy-caching for similar savings in primary host load. Proxy-caching, however, saves significantly more network bandwidth than push-caching in return for an order of magnitude decrease in efficiency.