[BioC] A question about Limma

Naomi Altman naomi at stat.psu.edu
Fri Jan 7 15:36:31 CET 2005


The dye effect is likely to be significant for only some genes (after
normalization) because the dye is a chemical reagent that binds to the cDNA
or RNA.
The binding properties of the two dyes differ, and the chemical composition
differs from one cDNA fragment to the next, so the size of the dye effect
varies by gene.  Sorry I cannot give more exact chemistry info, but that is
the reason stripped of the technicalities.

The reason I put "after normalization" is that scanner settings and how the
labeling was performed can affect the mean detection over the entire array,
but the mean effect (over all genes) is removed during normalization.  So,
what we are calling the "dye-effect" is called the "dye by gene
interaction" in some papers such as Churchill and Kerr, 2001.
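
To make the distinction concrete, here is a minimal sketch in Python (not
limma code) using made-up numbers. It assumes a single self-self array, so
any nonzero log-ratio is a dye effect, and it uses plain median centering as
a crude stand-in for normalization. The gene names and effect sizes are
hypothetical.

```python
import statistics

# Toy illustration with made-up numbers.
# M = log2(Cy5/Cy3) on a self-self array, so any nonzero M is a dye effect.

global_dye_bias = 0.8  # hypothetical shift shared by every gene on the array

# Hypothetical gene-specific dye effects (the "dye by gene interaction").
gene_specific = {
    "geneA": 0.0,
    "geneB": 0.0,
    "geneC": 0.0,
    "geneD": 0.6,
    "geneE": -0.5,
}

# Raw log-ratios: the shared bias plus each gene's own dye effect.
raw = {gene: global_dye_bias + effect for gene, effect in gene_specific.items()}

# Median centering: a simplified stand-in for array normalization.
array_median = statistics.median(raw.values())
normalized = {gene: m - array_median for gene, m in raw.items()}

for gene in raw:
    print(f"{gene}: raw M = {raw[gene]:+.2f}, normalized M = {normalized[gene]:+.2f}")
```

In this toy case the shared shift disappears after centering, while geneD and
geneE keep their own nonzero dye effects. That is why a dye term can remain
significant for some genes and not for others.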

--Naomi


At 04:31 PM 1/6/2005 -0800, Fangxin Hong wrote:
>Is it possible that dye-effect is still tested to be significant for some
>genes, let's say 40% of genes? Do we remove or keep this effect for all
>genes?
>I met this problem: the factor origin (like different laboratories) was
>significant for > 50% of genes, and what I did was keep it in the model for
>all genes.
>However, I can't figure out a nice explanation of this, like why dye
>effect is only significant for 40% of genes, what does this tell us about
>this effect.
>
>Thanks.
>Fangxin
> > I agree.  In my reply to Fangxin I should have added that I would remove a
> > non-essential effect
> > like  a dye-effect if it appeared non-significant, but I'd remove it for
> > all the genes.
> >
> > Gordon
> >
> > On Tue, January 4, 2005 1:18 am, Naomi Altman said:
> >> Reducing the model based on removing nonsignificant effects is called
> >> "pre-test estimation".  It is known to increase the false-positive rate,
> >> even in the classical setting.  In the microarray setting, there is no
> >> compelling reason to use pre-test estimators that differ from gene to
> >> gene.
> >>
> >> --Naomi Altman
> >>
> >> At 10:57 PM 1/3/2005 +1100, Gordon K Smyth wrote:
> >>> > Date: Sun, 2 Jan 2005 14:05:15 -0800 (PST)
> >>> > From: "Fangxin Hong" <fhong at salk.edu>
> >>> > Subject: [BioC] A question about Limma
> >>> > To: bioconductor at stat.math.ethz.ch
> >>> > Message-ID: <1867.66.75.240.64.1104703515.squirrel at 66.75.240.64>
> >>> > Content-Type: text/plain;charset=iso-8859-1
> >>> >
> >>> > Hi Bioconductor users;
> >>> > I have a general question about the limma model.
> >>> > In the limma package, usually one linear model applies to all genes,
> >>> > and error variances from all genes are modified simultaneously. What
> >>> > if some factor, for example one main effect, is only significant for
> >>> > some genes, and we want to identify genes based on the significance
> >>> > of another main effect (of interest)? What is the best way to do it?
> >>> > Currently I just leave this factor in the model which is applied to
> >>> > all genes,
> >>>
> >>>That's what I do, leave all terms in the models for all the genes.  I
> >>>don't see a strong case for
> >>>doing a separate model selection process for every gene.
> >>>
> >>> > but this
> >>> > might under-estimate the total number of genes on which the effect of
> >>> > interest is significant.
> >>>
> >>>Why do you think so?  The only disadvantage of keeping a non-significant
> >>>term in the model is a reduction in residual degrees of freedom, with
> >>>some consequential loss of power, but this disadvantage is mitigated by
> >>>the empirical Bayes moderation process.
> >>>
> >>>Perhaps someday someone will work out a model selection theory for
> >>>massively parallel regression
> >>>situations like microarray experiments, but there isn't such a theory
> >>>now.  It seems safer to me
> >>>to have the same model for every gene, keeping all the 'a priori'
> >>>important predictors in the
> >>>model.
> >>>
> >>>Gordon
> >>>
> >>> > I am sorry if this question has been asked/answered here before; I
> >>> > couldn't find it by searching the archive. Any comment, suggestion
> >>> > or experience is appreciated.
> >>> >
> >>> > Fangxin
> >>> > --
> >>> > Fangxin Hong, Ph.D.
> >>> > Plant Biology Laboratory
> >>> > The Salk Institute
> >>> > 10010 N. Torrey Pines Rd.
> >>> > La Jolla, CA 92037
> >>> > E-mail: fhong at salk.edu
> >>>
> >>
> >> Naomi S. Altman                                814-865-3791 (voice)
> >> Associate Professor
> >> Bioinformatics Consulting Center
> >> Dept. of Statistics                              814-863-7114 (fax)
> >> Penn State University                         814-865-1348 (Statistics)
> >> University Park, PA 16802-2111
> >>
> >
> >
> >
>
>



