Layering in Provenance Systems

Title: Layering in Provenance Systems
Publication Type: Conference Paper
Year of Publication: 2009
Authors: Muniswamy-Reddy, Kiran-Kumar; Braun, Uri; Holland, David A.; Macko, Peter; Maclean, Diana; Margo, Daniel; Seltzer, Margo; Smogor, Robin
Conference Name: 2009 USENIX Annual Technical Conference
Date Published: June 2009
Conference Location: San Diego, California
Keywords: filesystem, network attached storage, PASS, provenance
Abstract

Digital provenance describes the ancestry or history of a digital object. Most existing provenance systems, however, operate at only one level of abstraction: the system call layer, a workflow specification, or the high-level constructs of a particular application. The provenance collectable in each of these layers is different, and all of it can be important. Single-layer systems fail to account for the different levels of abstraction at which users need to reason about their data and processes. These systems cannot integrate data provenance across layers and cannot answer questions that require an integrated view of the provenance.

We have designed a provenance collection structure facilitating the integration of provenance across multiple levels of abstraction, including a workflow engine, a web browser, and an initial runtime Python provenance tracking wrapper. We layer these components atop provenance-aware network storage (NFS) that builds upon a Provenance-Aware Storage System (PASS). We discuss the challenges of building systems that integrate provenance across multiple layers of abstraction, present how we augmented systems in each layer to integrate provenance, and present use cases that demonstrate how provenance spanning multiple layers provides functionality not available in existing systems. Our evaluation shows that the overheads imposed by layering provenance systems are reasonable.

URL: http://www.eecs.harvard.edu/syrah/pass/pubs/usenix09.pdf