[BioC] Drosophila GeneChip analysis

Arne.Muller at aventis.com
Fri May 14 18:31:02 CEST 2004


Hi Paul,

This makes things clearer, but I still have some questions; please see my comments below.

> -----Original Message-----
> From: Paul Mack [mailto:paulmack at arches.uga.edu]
> Sent: 14 May 2004 18:07
> To: Muller, Arne PH/FR
> Subject: RE: [BioC] Drosophila GeneChip analysis
> 
> 
> Hi, Arne:
> 
> Thanks for your response. Hopefully I can clarify. I have 4 
> classification 
> variables in the model I use: gene; category (meaning treated 
> or control; I

This means you're running a single model, i.e. if you have 10,000 genes on the chip, the gene factor contains 10,000 levels, right?
 
How are you running this in R? I tried it once, and it quickly ran out of memory because I have >12k genes on the chip ... :-(

> have only one treatment and it is qualitative); array (designated as 
> random); and probe (there are 14 probes per gene on each 
> chip). It also

I'm also running linear models at the probe level (I think the probes give a good kind of pseudo-replication).
 
What is your model call? Something like this:

lme(intensity ~ gene*cat*probe, random = ~ 1 | array)

Or do you also include the array in the fixed effects?

I'm not sure about this call (I've only just received the mixed-effects models book by Pinheiro and Bates).
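
To make the question concrete, here is a minimal sketch of how such a call could be set up on a toy data set. Everything in it is an assumption for illustration: the data are simulated, the layout (3 genes, 4 probes per gene, 6 arrays) is invented, and the column is named `category` rather than `cat` to avoid confusion with the base function `cat()`. It is not Paul's actual model.

```r
library(nlme)

set.seed(1)

# Hypothetical toy layout: 3 genes x 4 probes measured on each of 6 arrays
d <- expand.grid(probe = factor(1:4),
                 gene  = factor(paste0("g", 1:3)),
                 array = factor(paste0("a", 1:6)))

# Arrays a1-a3 are controls, a4-a6 are treated (category varies between arrays)
d$category <- factor(ifelse(d$array %in% c("a1", "a2", "a3"),
                            "control", "treated"))

# Simulated intensities: array-specific shift, a treatment effect for gene g2,
# and independent noise
array_shift <- rnorm(6, sd = 0.5)
d$intensity <- 8 + array_shift[as.integer(d$array)] +
  ifelse(d$gene == "g2" & d$category == "treated", 1, 0) +
  rnorm(nrow(d), sd = 0.3)

# Fixed effects for gene, category, probe and all their interactions;
# a random intercept for each array
fit <- lme(intensity ~ gene * category * probe,
           random = ~ 1 | array, data = d)

anova(fit)
```

With 3 genes this is cheap, but the number of fixed-effect coefficients grows with the product of the factor levels, which is presumably why a single model over >12k genes exhausts memory.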

> includes a category x array interaction term. The model predicts gene 
> expression as a function of array, categoy, array, probe and the 
                              ^^^^
                              you mean gene here?

> interaction term. I then look at the estimated category 
> coefficients gene 

I'm currently doing a similar thing. Apart from the difficulty of deciding on a method to correct for multiple testing (I'm using p.adjust(pvalue, 'fdr')) and on the actual p-value cutoff, I found that the residuals of the models are not normally distributed (see one of my recent postings to the Bioconductor list). I think one really needs to check the model quality, otherwise the p-values don't mean much anyway.
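
As a rough illustration of the two steps mentioned above (FDR adjustment and a check of the residuals), here is a small sketch in base R. The p-values and the single-gene fit are simulated and purely hypothetical; only `p.adjust()` with the 'fdr' method comes from my actual analysis.

```r
set.seed(2)

# Hypothetical p-values: 950 genes without an effect, 50 with a strong effect
pvalue <- c(runif(950), rbeta(50, 1, 200))

# 'fdr' is the Benjamini-Hochberg adjustment (an alias for method = "BH")
padj <- p.adjust(pvalue, method = "fdr")

# Number of genes called significant at an FDR of 5%
n_sig <- sum(padj < 0.05)

# Toy single-gene fit with heavy-tailed noise, to mimic non-normal residuals
x   <- rep(0:1, each = 20)
y   <- 1 + 0.5 * x + rt(40, df = 3)
fit <- lm(y ~ x)

# Shapiro-Wilk test on the residuals; a small p-value suggests non-normality
sw <- shapiro.test(residuals(fit))
sw$p.value
```

A normal Q-Q plot of the residuals (qqnorm() followed by qqline()) is usually more informative than the test alone, because it shows whether the problem is in the tails.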

Kerr and Churchill (2002) have reported this problem, and argue that one actually needs to use bootstrapping to calculate confidence intervals (since the distribution of the residuals has extreme tails). This is rather discouraging, since bootstrapping will take too long for my analysis (MG-U74Av2 chip with >12k genes).
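
For a single gene, a generic residual bootstrap could look like the sketch below. This is only an illustration under my own assumptions (simulated data, a plain per-gene lm(), percentile intervals, 1000 resamples) and is not necessarily the scheme that Kerr and Churchill use.

```r
set.seed(3)

# Hypothetical single gene: 15 control and 15 treated values, heavy-tailed noise
category  <- factor(rep(c("control", "treated"), each = 15))
intensity <- 8 + 0.8 * (category == "treated") + 0.4 * rt(30, df = 3)

fit         <- lm(intensity ~ category)
res         <- residuals(fit)
fitted_vals <- fitted(fit)

# Residual bootstrap: resample residuals with replacement, add them back to
# the fitted values, refit, and keep the category coefficient each time
B <- 1000
boot_coef <- replicate(B, {
  y_star <- fitted_vals + sample(res, replace = TRUE)
  coef(lm(y_star ~ category))[["categorytreated"]]
})

# Percentile 95% confidence interval for the treatment effect
ci <- quantile(boot_coef, c(0.025, 0.975))
ci
```

Each gene needs B refits, so the cost scales as B times the number of genes, which is what makes this slow on a chip with >12k genes.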

Did you try the mmanova package from Kerr & Churchill (http://www.jax.org/staff/churchill/labsite/software/anova/)? I'm not sure whether it works for Affy chips.

     regards,

     Arne

> by gene. Hope this makes more sense.
> 
> Paul
> 
> At 04:28 PM 5/14/2004 +0200, you wrote:
> >do you have a factorial design, and you run one linear model 
> for each 
> >gene, and then looking at the p-values for the coefficients? 
> Could you 
> >give some more information about what you're doing, I'm not sure I 
> >understand ...?
> >
> >         regards,
> >
> >         Arne
> >
> >--
> >Arne Muller, Ph.D.
> >Toxicogenomics, Aventis Pharma
> >arne dot muller domain=aventis com
> >
> > > -----Original Message-----
> > > From: bioconductor-bounces at stat.math.ethz.ch
> > > [mailto:bioconductor-bounces at stat.math.ethz.ch]On Behalf 
> Of Paul Mack
> > > Sent: 14 May 2004 16:20
> > > Subject: [BioC] Drosophila GeneChip analysis
> > >
> > >
> > >
> > > I am in the midst of analyzing Affymetrix Drosophila GeneChip
> > > data using
> > > RMA such that separate regression lines are estimated for
> > > each gene. It was
> > > recommended to me that I use a p-value of .0001 as a cutoff
> > > for the effect
> > > estimates rather than try to apply Bonferroni or other 
> multiple test
> > > corrections. Lately, however, I have begun to wonder if
> > > others doing this
> > > sort of analysis use similar cutoffs and, in general, what
> > > others think
> > > about statistical stringency in this situation. Any help 
> will be most
> > > appreciated; I will summarize any replies that I get that are
> > > not sent
> > > directly to the list. Thank you.
> > >
> > >
> > > Paul Mack, Ph.D
> > > Department of Genetics
> > > University of Georgia
> > > Athens, GA
> > > USA
> > >
> > > 706-542-1578 (w)
> > > 706-542-3910 (fax)
> > > paulmack at arches.uga.edu
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
> > >
> 
> Paul Mack, Ph.D
> Department of Genetics
> University of Georgia
> Athens, GA
> USA
> 
> 706-542-1578 (w)
> 706-542-3910 (fax)
> paulmack at arches.uga.edu
> 
> 
> 
>


