Provenance is the lineage or derivation history of an object. Many of the
environments in which researchers work would become more powerful research
tools if they tracked the provenance of the outputs created in them.
There are several ways to approach this project. One is to
take an existing environment (e.g., R) and make it provenance-aware.
A second approach is to construct a provenance-aware environment for a
particular set of researchers. For example, an active research
group at Harvard studies gorilla populations.
They would find it quite useful to have a provenance-aware environment in
which to conduct all of their analyses.
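To make "provenance-aware" concrete, here is a minimal sketch in Python of what such an environment might record. Everything in it is an assumption made for illustration: the `Tracked` type, the `tracked` decorator, the `lineage` helper, and the made-up counts are hypothetical and do not come from any existing system.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Tracked:
    """A value plus a record of how it was derived (hypothetical type)."""
    value: Any
    operation: str = "input"             # name of the function that produced it
    parents: Tuple["Tracked", ...] = ()  # tracked values it was derived from


def tracked(func: Callable) -> Callable:
    """Wrap func so its result remembers the function name and its inputs."""
    def wrapper(*args: Tracked) -> Tracked:
        result = func(*(a.value for a in args))
        return Tracked(result, func.__name__, tuple(args))
    return wrapper


def lineage(item: Tracked, depth: int = 0) -> List[str]:
    """Return the derivation history as indented lines, output first."""
    lines = ["  " * depth + f"{item.operation} -> {item.value!r}"]
    for parent in item.parents:
        lines.extend(lineage(parent, depth + 1))
    return lines


@tracked
def total(counts):
    return sum(counts)


@tracked
def ratio(part, whole):
    return part / whole


# Made-up numbers, used only to exercise the sketch.
adults = Tracked([12, 9, 14])
everyone = Tracked([20, 15, 25])
share = ratio(total(adults), total(everyone))

print("\n".join(lineage(share)))
```

The design choice here is to attach the history to each value rather than to keep a separate log, so any output can answer "where did this come from?" on its own. A real environment would also need to capture file reads, random seeds, and package versions, which this sketch ignores.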
|Source Code Control as Provenance
This is related to the previous project.
Source code control systems such as CVS, Mercurial, Subversion, etc.
all capture some form of provenance, but it is accessible only through
the version control metaphor. Pick your favorite system and make the
provenance queryable. Conduct a user study to see which features are
most useful to users.
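As a rough illustration of what "queryable" could mean, the Python sketch below treats a revision history as a graph and asks provenance-style questions of it. The `Commit` record, the two query functions, and the sample history are all hypothetical; a real project would populate the graph from the chosen system's own log rather than by hand.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Commit:
    """One recorded revision (hypothetical, simplified model)."""
    rev: str
    author: str
    parents: Tuple[str, ...]  # revisions this one was derived from
    files: Tuple[str, ...]    # paths changed in this revision


def ancestors(history: Dict[str, Commit], rev: str) -> Set[str]:
    """All revisions that rev was derived from, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(history[rev].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(history[current].parents)
    return seen


def contributors(history: Dict[str, Commit], rev: str, path: str) -> List[str]:
    """Authors who changed path in rev or in any revision leading to it."""
    revs = ancestors(history, rev) | {rev}
    return sorted({history[r].author for r in revs if path in history[r].files})


# Invented sample history with one branch and one merge.
commits = [
    Commit("r1", "ana", (), ("model.R", "data.csv")),
    Commit("r2", "ben", ("r1",), ("model.R",)),
    Commit("r3", "cho", ("r1",), ("plot.R",)),
    Commit("r4", "ana", ("r2", "r3"), ("plot.R",)),
]
history = {c.rev: c for c in commits}

print(sorted(ancestors(history, "r4")))
print(contributors(history, "r4", "model.R"))
```

Queries like these ("who influenced this file as of this revision?") are awkward to phrase in terms of checkouts and diffs, which is the gap the project description points at.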
A research paper today is a particular expression of a set of ideas, data,
programs, graphs, images, etc. There is much discussion about creating
documents that contain not only the specific expression, but all the data
and programs that back the expression, thus allowing "readers" to become
active participants in the research. For this project, one might begin
with Open Office and modify it into a publication system
that lets a researcher include text, programs, images, data, and all
the other elements of the research that back a particular project.
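One small piece of such a system is a record of exactly which programs and data back a given document. The Python sketch below builds and checks a manifest of content hashes. It is only an assumed design: the function names, the manifest layout, and the sample file names are hypothetical and say nothing about how Open Office stores documents.

```python
import hashlib
import json
from typing import Dict


def build_manifest(title: str, elements: Dict[str, bytes]) -> str:
    """Describe every element of a publication by name, size, and SHA-256."""
    entries = [
        {
            "name": name,
            "bytes": len(content),
            "sha256": hashlib.sha256(content).hexdigest(),
        }
        for name, content in sorted(elements.items())
    ]
    return json.dumps({"title": title, "elements": entries}, indent=2)


def verify(manifest_json: str, elements: Dict[str, bytes]) -> bool:
    """Check that every listed element is present and unmodified."""
    manifest = json.loads(manifest_json)
    return all(
        entry["name"] in elements
        and hashlib.sha256(elements[entry["name"]]).hexdigest() == entry["sha256"]
        for entry in manifest["elements"]
    )


# Invented sample contents standing in for real text, code, and data files.
elements = {
    "paper.txt": b"We report the following results.",
    "analysis.py": b"print(sum([1, 2, 3]))",
    "data.csv": b"x,y\n1,2\n3,4\n",
}
manifest = build_manifest("Example paper", elements)
print(verify(manifest, elements))
```

A reader who receives the bundle can rerun `verify` before re-executing the programs, which gives some assurance that the figures were produced from the data actually shipped with the text.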
|Proof Carrying Facebook