We next examined the effect that changing the number of
available push-cache servers had on push-caching. We used the same
trace from the previous section, and once again we averaged together
the results from simulating with the three different
caching-strategies. We set push-threshold = 100, pages-to-push =
10, and lookup-strategy = hops. The results are shown
in the two figures that follow.
Figure: The effect of
server-number on network traffic and primary host traffic. Note
the discontinuity at the right of the server traffic graph; see the
text for an explanation.
Figure: The effect of server-number on average server traffic and
active push-cache servers. Notice that as the number of available
servers increases, the percentage of those servers in use
decreases.
The simulation results are unremarkable and confirm our hypothesis: the more servers that are available, the lower the bandwidth consumed, although not all of the available servers are used. The discontinuity at the right of the primary host traffic graph is more interesting, because it demonstrates push-caching's sensitivity to object distribution. The dip was caused by the simulation run that used randomly selected pushing and the hops metric for pulling. Early in that run, the primary host randomly pushed several objects to another push-cache server. That push-cache server was closer to most clients than the primary host was, so most client requests originally directed to the primary host were subsequently redirected to the secondary host. With more time available, we would have averaged together additional runs to reduce the inconsistency.
Figure
demonstrates that network traffic is
reduced as more servers are made available, even as the percentage of
servers in use declines. This is because with more servers available
the distance-based metrics can do a better job of selecting the
optimal cache server given the current access pattern. This
observation emphasizes the fact that push-caching is very sensitive to
network topology. It is also interesting to note that average server
traffic remains essentially constant beyond 50 servers or so.
We conclude that increasing the number of servers improves the
performance of push-caching, but caution that increasing the number of
available servers also decreases the performance of the algorithm used
to select where to replicate an object. Beyond several hundred servers
or so it becomes infeasible to exhaustively search through them to
determine the optimal server, since the computation time is in
where
is the number of available servers, and
is
the number of clients that have made requests. A constant factor that
should not be ignored either is the cost of calculating the distance
between two clients, since it will be computed up to
times. We will use server-number = 50 for the rest of the
simulations in this section as a tradeoff between performance and
computability.
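The exhaustive search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulator's actual code: we assume the search scores each candidate server by the request-weighted sum of its distances to every client that has made requests, so the distance function is evaluated once per (server, client) pair. The names `select_replica_server`, `client_requests`, and `distance` are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def select_replica_server(
    servers: Iterable[str],
    client_requests: Dict[str, int],
    distance: Callable[[str, str], float],
) -> Tuple[Optional[str], float]:
    """Exhaustively pick the server with the lowest total weighted distance.

    Assumption: the cost of a server is the sum, over all clients, of
    (number of requests from that client) * distance(server, client).
    """
    best_server = None
    best_cost = float("inf")
    for server in servers:
        cost = 0
        # One distance evaluation per (server, client) pair.
        for client, count in client_requests.items():
            cost += count * distance(server, client)
        if cost < best_cost:
            best_server, best_cost = server, cost
    return best_server, best_cost


if __name__ == "__main__":
    # Toy hop counts; purely illustrative values.
    hops = {("a", "c1"): 1, ("a", "c2"): 5, ("b", "c1"): 2, ("b", "c2"): 2}
    requests = {"c1": 3, "c2": 4}
    print(select_replica_server(["a", "b"], requests, lambda s, c: hops[(s, c)]))
```

Under this assumption the nested loop makes the cost of the distance calculation the dominant constant factor, which is why an expensive metric quickly becomes a concern as the server pool grows.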
As an implementation aside, this computation should not be allowed to delay the handling of requests by the push-cache server; it should be handled by a separate thread or process.
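One way to keep the selection off the request path is a background worker fed by a queue. The sketch below is an assumption about how this might be structured, not a description of an existing implementation; `ReplicationPlanner` and the `choose_server` callback are hypothetical names, and the callback stands in for whatever selection routine is used.

```python
import queue
import threading


class ReplicationPlanner:
    """Runs server selection on a worker thread so request handling never blocks."""

    def __init__(self, choose_server):
        # choose_server(servers, client_requests) -> chosen server (hypothetical callback).
        self._choose_server = choose_server
        self._jobs = queue.Queue()
        self.decisions = {}
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, object_id, servers, client_requests):
        """Called from the request path; only enqueues a snapshot and returns."""
        self._jobs.put((object_id, list(servers), dict(client_requests)))

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:  # Sentinel: shut the worker down.
                self._jobs.task_done()
                break
            object_id, servers, client_requests = job
            self.decisions[object_id] = self._choose_server(servers, client_requests)
            self._jobs.task_done()

    def wait(self):
        """Block until every submitted job has been processed."""
        self._jobs.join()

    def close(self):
        self._jobs.put(None)
        self._worker.join()
```

The request handler calls `submit` and continues immediately; the replication decision is read from `decisions` later. A separate process would give stronger isolation for a CPU-bound search, at the cost of serializing the request statistics.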
An area of future study is an examination of the various heuristics
that could be used to narrow the set of servers under
consideration. Such a heuristic would be used by the registry (see
section
) to reduce the selection of
available servers offered to the server about to replicate its
objects. We predict that simple random selection will be
most effective.
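The simple random selection heuristic mentioned above could look like the following sketch. It assumes the registry holds a flat collection of available servers and offers at most a fixed number of them to the replicating server; the function name `offer_candidates` and the parameter `max_candidates` are hypothetical.

```python
import random


def offer_candidates(available_servers, max_candidates, rng=None):
    """Return at most max_candidates servers chosen uniformly at random.

    Assumption: the registry knows every available server and simply
    narrows the set before the replicating server runs its search.
    """
    if rng is None:
        rng = random
    servers = list(available_servers)
    if len(servers) <= max_candidates:
        return servers
    # Sample without replacement so no server is offered twice.
    return rng.sample(servers, max_candidates)
```

For example, `offer_candidates(all_servers, 50)` would cap the search at the 50 servers used in the remaining simulations, whatever the size of the registry.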