[Bioc-devel] history mechanism
Robert Gentleman
rgentlem at fhcrc.org
Tue Sep 6 20:15:37 CEST 2005
Hi,
Thanks to Kevin, Vince, Kasper, and others for raising this issue and
for some useful ideas.
One thing I think we should not do is try to reimplement the
general compendium concept here. If complete provenance from raw data
file to finished analysis is what is wanted, then the compendium concept
provides a reasonable mechanism for doing that. Other additions and
improvements to it are possible, but I don't think it will be
worthwhile to try to force this sort of generality into a history
mechanism. For most (all, really) of the data analysis that I do now, I
use the compendium approach.
However, eSets are some form of self-documenting data structure and it
is worthwhile to improve that documentation.
Some general ideas for comment/consideration:
1) it would be nice if every instance carried, via a history mechanism,
some record that documents what happened to it
2) some objects (such as eSet and exprSet) are compound: they have
phenoData and exprs, and other more complex objects are possible. I
suspect that each of these components needs to keep its own history
3) we could try to catch something from every call to a replacement
method, but it is not always easy to know what. We could add an argument
and thereby "force" developers to pass the information down.
One problem with this is that we want to support [[<-, [<- and $<-,
and these already have well-defined signatures that we cannot easily change.
We can try to capture the call that was made to the function that is
changing the eSet, but how do we know that is the important one? If the
developer uses helper functions we sometimes need to look further up the
call stack, and I know of no way to solve this generally.
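To make the helper-function problem concrete, here is a rough sketch in plain R. All of the function names (record_call, normalize_direct, .update, normalize_via_helper) are made up for illustration and are not part of Biobase; the only real calls are sys.parent() and sys.call().

```r
# Hypothetical helper: return the call of whichever function invoked it.
record_call <- function() {
  parent <- sys.parent()   # frame number of the function that called us
  sys.call(parent)         # the call that created that frame
}

# A developer-facing function that records its own call directly.
normalize_direct <- function(x, method = "scale") {
  cl <- record_call()
  list(data = x, call = cl)
}

# The same idea, but the modification goes through an internal helper.
.update <- function(x) {
  cl <- record_call()
  list(data = x, call = cl)
}
normalize_via_helper <- function(x, method = "scale") {
  .update(x)
}

direct   <- normalize_direct(1:3, method = "quantile")
indirect <- normalize_via_helper(1:3, method = "quantile")

print(direct$call)    # the call the user actually typed
print(indirect$call)  # the internal call to .update, which is not informative
```

In the second case the recorded call is the helper's, not the user's, and nothing in the call stack itself says how many frames up the "important" call lives.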
Proposal:
--------
I suggest that we add a history slot to each object that we want to
collect history on. We allow for an optional history parameter for
replacement functions and ask the developers to use this to tell us the
important call. If it is not present we can get a call automatically
(using sys.parent) from the calling function. It will not be perfect,
but it would give us a start.
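A minimal sketch of how this might look, under stated assumptions: the class HistObj, the replacement function exprsH<-, and the two developer functions are all hypothetical stand-ins (not existing Biobase code), and the replacement is an ordinary function rather than an S4 generic to keep the call-stack behaviour simple.

```r
library(methods)

# Hypothetical stand-in for an eSet-like class with an added history slot.
setClass("HistObj", representation(exprs = "matrix", history = "list"))

# Hypothetical replacement function with an optional 'history' argument.
"exprsH<-" <- function(object, history = NULL, value) {
  if (is.null(history)) {
    # Fallback: use the call of the function that invoked the replacement.
    history <- sys.call(sys.parent())
  }
  object@exprs <- value
  object@history <- c(object@history, list(history))
  object
}

# A developer function that relies on the automatic fallback.
log_transform <- function(object) {
  exprsH(object) <- log2(object@exprs)
  object
}

# A developer function that tells us the important call explicitly.
center <- function(object) {
  cl <- match.call()
  exprsH(object, history = cl) <- object@exprs - mean(object@exprs)
  object
}

obj <- new("HistObj", exprs = matrix(c(1, 2, 4, 8), nrow = 2),
           history = list())
obj <- log_transform(obj)
obj <- center(obj)
print(obj@history)   # one recorded call per modification, oldest first
```

The extra argument works here because a replacement function may take additional named arguments between the object and value; that does not help for [[<-, [<- and $<-, whose call syntax leaves no place for one.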
I do not expect that we can ever replay the history (it would need
much more information and I think we would end up duplicating what is in
the compendium concept).
Any comments - recommendations, other improvements etc.?
Robert
--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org