|
The quality of file system benchmarking has not improved
in over a decade of intense research spanning
hundreds of publications. Researchers repeatedly use a
wide range of poorly designed benchmarks and, in most
cases, develop their own ad hoc benchmarks. Our community
lacks a definition of what we want to benchmark
in a file system. We propose several dimensions of file
system benchmarking and review the many tools
and techniques in widespread use. We experimentally
show that even the simplest of benchmarks can be fragile,
producing performance results spanning orders of
magnitude. We hope that this paper will spur serious
debate in our community, leading to action that can
improve how we evaluate our file and storage systems.
|