|
There is a great deal of research on improving Web server
performance and building better, faster servers, but little research
on characterizing servers and the load imposed upon them. While
some tremendously popular and busy sites, such as netscape.com,
playboy.com, and altavista.com, receive several million hits per
day, most servers are never subjected to loads of this magnitude.
This paper presents an analysis of Internet Web server logs for
a variety of site types. We present a taxonomy of
the different types of Web sites and characterize their access
patterns and, more importantly, their growth. We then use our server
logs to address some common perceptions about the Web. We show
that, on a variety of sites, contrary to popular belief, the use
of CGI does not appear to be increasing and that long latencies
are not necessarily due to server load. We then show that, as
expected, persistent connections are generally useful, but that
dynamic time-out intervals may be unnecessarily complex and that
allowing multiple persistent connections per client may actually
hinder resource utilization compared to allowing only a single
persistent connection.
|