We next examined the effect that the number of available push-cache servers has on push-caching. We used the same trace as in the previous section, and once again averaged the results from simulating with the three different caching strategies. We set push-threshold = 100, pages-to-push = 10, and lookup-strategy = hops. The results are shown in Figures and .
Figure: The effect of server-number on network traffic and primary host traffic. Note the discontinuity at the right of the server traffic graph; see the text for an explanation.
Figure: The effect of server-number on average server traffic and active push-cache servers. Notice that as the number of available servers increases, the percentage of those servers in use decreases.
The simulation results are unremarkable and confirm our hypothesis: the more servers available, the lower the bandwidth, though not all of the available servers are used. The discontinuity on the right of the primary host traffic graph is interesting; it demonstrates push-caching's sensitivity to object distribution. The dip was caused by the simulation run that combined random pushing with the hops-based lookup metric. Early in that run the primary server randomly pushed several objects to another push-cache server. That server was closer to most clients than the primary host, so most client requests originally directed to the primary host were subsequently redirected to that secondary host. With more time available we would have averaged in additional runs to reduce this inconsistency.
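The redirection behavior behind this dip can be sketched as follows. This is an illustrative fragment only, assuming a per-client table of hop counts; the simulator's actual data structures are not described in the text.

```python
# Sketch of the hops-based lookup: a client request is directed to whichever
# host holding the object is fewest hops away from the client. In the
# anomalous run, the secondary host was closer to most clients than the
# primary, so most requests shifted to it.

def choose_host(client_hops):
    """client_hops maps host name -> hop count from the requesting client."""
    return min(client_hops, key=client_hops.get)

print(choose_host({"primary": 7, "secondary": 2}))  # → secondary
```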
Figure demonstrates that network traffic is reduced as more servers are made available, even as the percentage of servers in use declines. This is because with more servers available the distance-based metrics can do a better job of selecting the optimal cache server for the current access pattern. This observation emphasizes the fact that push-caching is very sensitive to network topology. It is also interesting to note that average server traffic remains essentially constant beyond about 50 servers.
We conclude that increasing the number of servers improves the performance of push-caching, but caution that increasing the number of available servers also degrades the performance of the algorithm used to select where to replicate an object. Beyond several hundred servers or so it becomes infeasible to search through them exhaustively to determine the optimal server, since the computation time is in O(n * m), where n is the number of available servers and m is the number of clients that have made requests. A constant factor that should not be ignored is the cost of calculating the distance between two clients, since it will be computed up to n * m times. We will use server-number = 50 for the rest of the simulations in this section as a tradeoff between performance and computability.
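The exhaustive O(n * m) search can be sketched as below. This is a minimal illustration, not the simulator's implementation: the `distance` function stands in for whatever network-distance metric is in use (e.g. hop count), and all names are hypothetical.

```python
# Exhaustive search for the best replication target: for each of the n
# candidate servers, sum its distance to each of the m requesting clients,
# and keep the server with the lowest total -- O(n * m) distance computations.

def distance(a, b):
    # Stand-in for the network distance metric (e.g., hop count).
    return abs(a - b)

def best_server(servers, clients):
    """Return the server minimizing total distance to the requesting clients."""
    best, best_cost = None, float("inf")
    for s in servers:                                 # n candidates
        cost = sum(distance(s, c) for c in clients)   # m clients each
        if cost < best_cost:
            best, best_cost = s, cost
    return best

print(best_server([0, 10, 20], [8, 9, 12]))  # → 10
```

The per-pair cost of `distance` multiplies the whole search, which is why the constant factor mentioned above matters in practice.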
As an implementation aside, this computation should obviously not be allowed to delay the handling of requests by the push-cache server and should be handled by a separate thread or process.
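One way to realize this separation, sketched here with a background executor; the text prescribes no particular mechanism, and all function names are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

# The expensive server-selection search runs on a background worker; the
# request-handling path only submits work and never blocks on the result.
executor = ThreadPoolExecutor(max_workers=1)

def handle_request(request):
    # Serve the request immediately, regardless of any selection in progress.
    return f"served {request}"

def schedule_replication(servers, clients):
    # A trivial stand-in for the exhaustive distance-based search, so the
    # sketch is self-contained; returns a future with the chosen server.
    def best_server(srv, cl):
        return min(srv, key=lambda s: sum(abs(s - c) for c in cl))
    return executor.submit(best_server, servers, clients)

fut = schedule_replication([0, 10, 20], [8, 9, 12])
print(handle_request("GET /index.html"))  # request path is not delayed
print(fut.result())                       # → 10
```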
An area of future study is an examination of the various heuristics that could be used to narrow the set of servers under consideration. Such a heuristic would be used by the registry (see section ) to reduce the set of available servers offered to a server about to replicate its objects. We predict that simple random selection will be most effective.