Client-based caching is popular on the Web because it is easy to implement and provides significant bandwidth and latency savings. We are investigating server-initiated caching because there is, unfortunately, an inherent limit to the bandwidth and server-load reduction possible with client caching.

Even if every Web site were using a Web proxy, it would still be possible for a site to become swamped if a large number of proxies tried to access a specific object at the same time. This is not a rare condition: a current Web service known as the Cool Site of the Day is very popular [13]. Each day it lists a "cool" Internet site, which then receives so much traffic that it often becomes swamped. Server-initiated caching alleviates this problem by autonomously replicating the "cool" site's data when load at that site increases dramatically. Push-caching complements client-based caching by helping to disperse server load; together the two can provide more efficient network transport of data than either can provide independently.

Optimal server-initiated caching is only feasible on the current Internet if it is possible to derive reasonably accurate network topology information from the chaotic, unordered Internet. As Guyton and Schwartz showed, it is impossible to derive this information directly and efficiently from the Internet itself. One goal of this work is therefore to demonstrate that geographical distance, which is easy to derive, predicts topological distance. We also show that a dynamic replication protocol like push-caching is needed because simpler caching solutions, such as static mirroring of popular Web sites, do not suffice.
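To illustrate why geographical distance counts as easy to derive, the following is a minimal sketch that computes the great-circle distance between two hosts from their latitude and longitude using the haversine formula, then picks the geographically closest replica. It is not the method used in this work: the function names, the replica table, and the coordinates are all hypothetical, and the sketch assumes each host's coordinates are already known.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_replica(client, replicas):
    """Return the name of the replica geographically closest to the client.

    client   -- (latitude, longitude) tuple in degrees
    replicas -- dict mapping a replica name to its (latitude, longitude)
    """
    return min(replicas, key=lambda name: great_circle_km(*client, *replicas[name]))


# Hypothetical replica sites and client location (illustrative coordinates).
replicas = {"east": (40.71, -74.01), "west": (37.77, -122.42)}
client = (42.36, -71.06)
print(nearest_replica(client, replicas))
```

Whether the geographically nearest replica is also the topologically nearest one is exactly the question this work sets out to test; the sketch only shows that the geographical side of the comparison is cheap to compute.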

James Gwertzman
Wed Apr 12 00:26:11 EDT 1995