[BioC] bioconductor

Paul Grosu Grosu at cgr.harvard.edu
Wed Nov 19 17:00:33 MET 2003


Hi -

I just signed up to the list and don't know if anyone posted this already
but I'm providing a link to Chis Bye's "A Biologist's Guide to using
Bioconductor" - it's not long:

http://www.fas.harvard.edu/~grosu/downloads/bioconductor_manual.pdf

This in no way covers everything on Greg's list, but it might be a
good starting point.

Paul Grosu
Bioinformatician
Bauer Center for Genomics Research/Harvard University

-----Original Message-----
From: Warnes, Gregory R [mailto:gregory_r_warnes at groton.pfizer.com] 
Sent: Wednesday, November 19, 2003 10:39 AM
To: 'rossini at u.washington.edu'; Roger Vallejo
Cc: bioconductor at stat.math.ethz.ch
Subject: RE: [BioC] bioconductor


We talked about this at the BioCBUG meeting a couple of weeks ago.  The web
site does have clear instructions for *installing* Bioconductor; it is just
not clear what to do once it is installed.

I think that the necessary documentation is available, but it is fragmented:

1) It is not clear from the web site what documentation you need to read to
get started.
2) None of the vignettes that I've looked at show a complete analysis
session from start to finish.
[I think the reason for this is that the people writing the vignettes are
the *package* authors and they have slightly different interests from
*consumers*]

I would suggest 

1) Adding a topic on the front page and on the navbar "Getting Started with
Bioconductor" that brings up a page with a small number of vignettes titled
like:

	Getting Started with Affymetrix Data
	Getting Started with Custom Two-Channel Data
	Getting Started with XXXXX Data
	...

These vignettes should go through a common-case example analysis from start
to finish.  From my work the flow should be something like this for
Affymetrix data:

1) Prerequisites 
	- Software: R, Bioconductor	
	- Data: CEL files, experiment information
	- install the required CDF package

2) Load the data
	
3) Perform standard (technology) Quality Control tests
	- 3'/5' ratios
	- Chip images
	- RNA digestion plots

4) Normalize/scale/standardize the data

5) Perform 'overall' visualizations
	- MDS and PCA for samples using all probesets

6) Apply a statistical model to all probesets
	- ANOVA / ANCOVA
	- Contrasts

7) Apply multiple comparison correction (FDR, ...)
	
8) Filter based on statistical model
	- Select probesets with FDR < 0.05
	[Note that I didn't mention filtering earlier; I think it is a bad
	idea to filter before applying a model!]

9) Add annotation 

10) Generate visualizations
	- PCA/MDS for samples using statistically significant genes
	- Profile plots across experimental conditions / treatments
	- heatmap including 2-way hierarchical clustering

11) Generate tabulations
	- Table of top XX results from statistical model with subset of
	  annotation

12) Generate output dataset for interactive visualization in
    Spotfire/Excel/...
	- All results from statistical model with all annotation
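
As a rough illustration of steps 5-8, 11 and 12 above, here is a minimal
sketch in base R.  It runs on simulated data standing in for a normalized
expression matrix; the matrix size, group labels, object names and output
file are assumptions made for the example, not part of any Bioconductor
package, and a real vignette would start from the normalized Affymetrix
data instead.

```r
# Simulated stand-in for a normalized expression matrix (assumption):
# 200 probesets (rows) x 6 samples (columns), two groups of three.
set.seed(1)
n_probes <- 200
group <- factor(rep(c("control", "treated"), each = 3))
expr <- matrix(rnorm(n_probes * 6), nrow = n_probes,
               dimnames = list(paste0("probe", seq_len(n_probes)),
                               paste0("s", 1:6)))
# Shift the first 20 probesets in the treated samples so there is some signal.
expr[1:20, group == "treated"] <- expr[1:20, group == "treated"] + 4

# Step 5: overall view of the samples using all probesets (PCA on samples).
pca_all <- prcomp(t(expr))

# Step 6: one-way ANOVA per probeset, with no filtering beforehand.
p_raw <- apply(expr, 1, function(y) anova(lm(y ~ group))[["Pr(>F)"]][1])

# Step 7: multiple comparison correction (Benjamini-Hochberg FDR).
fdr <- p.adjust(p_raw, method = "BH")

# Step 8: filter on the result of the statistical model.
selected <- names(fdr)[fdr < 0.05]

# Step 11: table of the top 10 results, ordered by raw p-value.
results <- data.frame(probeset = names(fdr), p = p_raw, fdr = fdr)
top <- head(results[order(results$p), ], 10)

# Step 12: write all results to a CSV file for Spotfire/Excel/...
out_file <- tempfile(fileext = ".csv")
write.csv(results, out_file, row.names = FALSE)
```

The filter is applied only after every probeset has been modelled and
adjusted, which is the ordering argued for in step 8.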


For the getting started document I would recommend giving the *simplest*
good-practice method of accomplishing each task.  Each section should also
include a pointer to other documents that can provide further details on how
the algorithms work / alternative commands / etc.
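
For the Affymetrix case, the "simplest" route through steps 2-4 might look
something like the sketch below.  It assumes the affy package and the
matching CDF package are installed; the function name and the cel_dir
argument are hypothetical, and RMA is only one of several possible
normalization choices.  Note that affy calls the RNA QC step "RNA
degradation".

```r
# Sketch only: wraps loading, one QC computation and normalization in a
# helper.  'affy_getting_started' and 'cel_dir' are hypothetical names.
affy_getting_started <- function(cel_dir) {
  if (!requireNamespace("affy", quietly = TRUE)) {
    stop("the 'affy' package is required for this sketch")
  }
  # Step 2: load all CEL files found in cel_dir into an AffyBatch.
  abatch <- affy::ReadAffy(celfile.path = cel_dir)
  # Step 3: one standard QC computation, the RNA degradation summary.
  deg <- affy::AffyRNAdeg(abatch)
  # Step 4: background-correct, normalize and summarize with RMA.
  eset <- affy::rma(abatch)
  # Return the QC summary and the probeset-level expression matrix.
  list(qc = deg, expression = Biobase::exprs(eset))
}
```

Calling affy_getting_started("path/to/cel/files") would then give an
expression matrix that can be passed on to the modelling steps.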

-Greg

> -----Original Message-----
> From: rossini at blindglobe.net [mailto:rossini at blindglobe.net]
> Sent: Wednesday, November 19, 2003 9:51 AM
> To: Roger Vallejo
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] bioconductor
> 
> 
> 
> You aren't being helpful or explicit.  3-4 hours doing what?  What
> exactly have you read?  How do you expect us to suggest things when
> you don't tell us what you've done?  
> 
> 
> But more importantly, have you tried
> 
> library(tkWidgets)
> vExplorer()
> 
> and looked at the affy vignettes? 
> 
> 
> 
> "Roger Vallejo" <rvallejo at psu.edu> writes:
> 
> > Do you have a manual that shows how to learn to use BIOCONDUCTOR?
> >
> > I have spent 3-4 hrs and I see only lots of bla bla bla but no direct
> > instructions on how to start loading affy genechip data and performing
> > rudimentary microarray data analysis.
> >
> > Many thanks in advance for the help..
> >
> > Roger
> >
> > Roger L. Vallejo, Ph.D.
> > Assist. Professor of Genomics/Bioinformatics
> > The Pennsylvania State University
> > Department of Dairy & Animal Science
> > Genomics & Bioinformatics Laboratory
> > 305 Henning Building
> > University Park, PA 16802
> > Phone:        (814) 865-1846
> > Fax:            (814) 863-6042
> > Email:         rvallejo at psu.edu
> > Website:     http://genomics.cas.psu.edu/
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>

-- 
rossini at u.washington.edu            http://www.analytics.washington.edu/ 
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN          Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email
