[BioC] Survey for genetics labs: how do you store & manage your data?

Matthew Keller mckellercran at gmail.com
Thu Jan 24 20:20:33 CET 2008

Hello all,

I am an asst. professor at CU Boulder and a fellow at the Institute
for Behavioral Genetics (IBG). I am writing this email on behalf of my
colleagues here at IBG. We are trying to determine the best way to
store large datasets (100K, 500K, and 1000K SNP chip data, with
sequencing data to come) in a way that a) is easily accessed by
researchers here (preferably over an intranet), b) can be linked
relationally to other databases, and c) can be easily pulled into
existing statistical packages like R (and Bioconductor) or SAS. We
have a meeting to discuss how to proceed next Friday.

For anyone who has the time, we would truly appreciate feedback. How
does your lab handle the storage and distribution of large datasets?
If you use a relational database, what query language (e.g., SQL; are
there any others?) and specific system (e.g., SQLite, Oracle,
PostgreSQL, etc.) does your lab use? Do you find that these interact
well with your statistical programs? Do you employ a full-time staff
member in charge of database issues? Answers to these questions, as well as
any discussion of pluses or minuses of different approaches, or
additional words of wisdom, would be greatly appreciated.


Matthew Keller

Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
