SOS Project Traces

Traces Available on SNIA web site

You can now download the anonymized traces from the Historical portion of the SNIA trace repository.

Available Traces

Many of the traces described in our work are available in an anonymized form. However, not all of the traces are available, due to confidentiality requirements, and some are not available in their entirety because we simply don't have the disk space necessary to store them online.

These traces were gathered with nfsdump, which is available for download.

There are five sets of long-term traces available:

lair62b (aka "EECS03")

A trace of a research workload from a university computer science department.

The trace runs 2/1/2003 - 3/14/2003, and is about 11GB in size.

deasnab (aka "DEAS03")

A trace of a general workload from the division of engineering and applied sciences -- a mix of email and research workloads.

The deasnab trace runs 1/29/2003 through 3/10/2003, and is about 35GB in size.

lair62 (aka "EECS")

A trace of a research workload from a university computer science department.

The trace runs 9/1/2001 - 11/30/2001, and is about 9.5GB in size.

home02 (aka "CAMPUS")

A trace from the main campus general-purpose servers. Predominantly email.

The home02 trace runs 9/1/2001 - 11/30/2001, and is about 48GB in size.

home02 is only one of the file systems hosted on the general-purpose servers. A trace of home03 (a less busy file system) is also available for the same time period, and is about 32GB in size. There are also some shorter traces (a week or several days) of several other file systems. All of these traces are available.

deasna (aka "DEAS")

A trace of a general workload from the division of engineering and applied sciences -- a mix of email and research workloads.

The deasna trace runs 10/17/2002 through 11/22/2002, and is about 26GB in size.

The deasnab (aka deasna2) and lair62b traces include more information than the other traces: the collector was improved to gather the euid and egid for each call, and the return code for access was fixed. The deasna2 trace is considered to be the most general workload in terms of diversity of users and applications.

Getting a Copy of the Traces

Our goal is to make the traces available to all interested researchers.

The traces are too large for us to share all of them over the web, at least with our current setup. Our preferred method of sharing our data is for people to send us large disks, which we fill with data and then return to them.

We have made three traces available over the web: home02, lair62b, and deasnab (aka deasna2). Please contact us for access. Do not attempt to download the traces in parallel: our network admins will get angry and we will be forced to return to mail-only access.

At this time we can only accept large ATA/IDE (not Serial ATA) drives. It is much more convenient for us to manage one or two large drives instead of several small drives. (Please do not send us a crate full of old 2GB drives.)

We have put the first hour of DEAS03 up so that you can have a look to see if it would really be useful to you. Please look at this sample and the FAQ before considering sending us a disk for the traces (or downloading them).

If you decide to use the traces in a paper, please cite the appropriate paper (usually Dan's Ph.D. thesis or the FAST paper).

Contact Daniel Ellard or Jonathan Ledlie for more information about acquiring or contributing traces.


The SOS Project <sos@eecs.harvard.edu>