[BioC] How do you keep track of your analyses

Sam Hunter biosam at gmail.com
Thu Sep 25 07:54:07 CEST 2008


I like to keep a main working directory, then make a subdirectory for
each project, with a directory name containing the start date, contact
name, and project title.  This makes it easy to see when I started a
project, and keeps things sorted chronologically:

/working
/working/2008-02-29-Contact_project
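
As a rough sketch of that naming scheme (Python here purely for
illustration; the helper name project_dir and its arguments are
hypothetical, not something I actually use), the directory name can be
built from an ISO date so that a plain alphabetical sort is also a
chronological one:

```python
from datetime import date
from pathlib import PurePosixPath
import re


def project_dir(root, started, contact, title):
    """Build root/YYYY-MM-DD-Contact_title (hypothetical helper)."""

    def clean(text):
        # Replace runs of whitespace so the name is shell-friendly.
        return re.sub(r"\s+", "-", text.strip())

    # ISO 8601 dates sort chronologically when compared as strings.
    name = f"{started.isoformat()}-{clean(contact)}_{clean(title)}"
    return PurePosixPath(root) / name


print(project_dir("/working", date(2008, 2, 29), "Contact", "project"))
# prints /working/2008-02-29-Contact_project
```

Because the date comes first, listing /working shows the projects in
the order they were started.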

Inside each project folder I keep a separate folder for each step of
the analysis, suffixed with the R version used, so that I can tell
whether the data files in it came from an older R release:

/working/2008-02-29-Contact_project/qa_2.7
/working/2008-02-29-Contact_project/filter_2.7
/working/2008-02-29-Contact_project/sam_2.7
/working/2008-02-29-Contact_project/go_2.7
/working/2008-02-29-Contact_project/pathway_2.7
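
A minimal sketch of setting up those per-step folders, again in Python
for illustration only.  The function name make_step_dirs and the STEPS
list are assumptions; the step names are simply the ones from the
listing above, and the R version is passed in by hand rather than
detected:

```python
from pathlib import Path

# Step names taken from the example layout above.
STEPS = ["qa", "filter", "sam", "go", "pathway"]


def make_step_dirs(project, r_version, steps=STEPS):
    """Create one <step>_<r_version> folder per analysis step."""
    project = Path(project)
    created = []
    for step in steps:
        step_dir = project / f"{step}_{r_version}"
        # exist_ok=True makes the call safe to repeat on an existing project.
        step_dir.mkdir(parents=True, exist_ok=True)
        created.append(step_dir)
    return created
```

Calling make_step_dirs("/working/2008-02-29-Contact_project", "2.7")
would create the five folders shown above.  Rerunning a step under a
newer R version then gets its own folder instead of overwriting the old
results.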

I keep a script file for each step within its folder, and store an
RData object with the normalized data in the project folder so that it
is easily loaded from any sub-folder.  I also try to keep a log of all
analyses in the project folder so that I can remember what I did.
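
The log can be as simple as a text file with one timestamped line per
entry.  A sketch under stated assumptions: the file name
analysis_log.txt and the helper append_log are made up for this
example, and Python is used only for illustration:

```python
from datetime import datetime
from pathlib import Path


def append_log(project, message, when=None):
    """Append one timestamped line to the project's log file."""
    when = when or datetime.now()
    log_file = Path(project) / "analysis_log.txt"  # hypothetical file name
    # Mode "a" appends, so earlier entries are never overwritten.
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(f"{when:%Y-%m-%d %H:%M}\t{message}\n")
    return log_file
```

A plain text log like this can also be searched with grep across all
project folders.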

I use Linux and KDE, so Kate (http://kate-editor.org/) is my editor of
choice because it is fast, lightweight, and supports R syntax
highlighting.  It also lets you open many files at once and store
"sessions", which makes it easy to reopen all the R scripts for a big
project automatically.

I back up the folder weekly using Flyback
(http://code.google.com/p/flyback/), which supports incremental and
automatic backups at the click of a button.

Sam

On Tue, Sep 23, 2008 at 8:51 AM, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote:
> Hello,
>
> I am doing an increasing number of bioconductor analyses for various
> people and I am starting to find it difficult to keep track of what I
> have done previously.  A common question six months after the initial
> analysis is something like "Can you do the same as x but change y?"  Has
> anyone got any ideas on the best way to do this?
>
> The essential components to keep track of are:
> * input files
> * R code used
> * output files
> * Description of what you aim to do.
>
> The two possibilities that I can think of are:
> 1) Some structured directories e.g.
> ProjectName_Person
>        /Description.txt
>        /Analysis1_date
>                /InputFiles
>                /Rcode
>                /Output/Outputfiles
>
> 2) Some sort of personal wiki like TiddlyWiki
>
> It's got to be searchable in some form too.
>
> Any experiences in this realm?
>
> Many thanks
>
> --
> **************************************************************
> Daniel Brewer, Ph.D.
>
> Institute of Cancer Research
> Molecular Carcinogenesis
> Email: daniel.brewer at icr.ac.uk
> **************************************************************
>


