[BioC] A question about Limma

Fangxin Hong fhong at salk.edu
Fri Jan 7 23:27:16 CET 2005


> The dye effect is likely to be significant for only some genes (after
> normalization) because the dye is a chemical reagent that binds to the
> cDNA or RNA.
> The binding properties of the 2 dyes differ, and the chemical compositions
> of all cDNA fragments are different.  Sorry I cannot give more exact
> chemistry info but that is the reason stripped of the technicalities.
That makes sense.


> The reason I put "after normalization" is that scanner settings and how
> the labeling was performed can affect the mean detection over the entire
> array, but the mean effect (over all genes) is removed during normalization.
Will normalization completely remove the mean effect of dye? I haven't
studied spotted cDNA arrays much, but for Affy arrays, normalization can't
completely remove effects such as lab. I normalized data generated at two
labs (similar experimental settings) together, but the limma model still
finds the lab main effect significant for >50 genes, with the
lab*treatment interaction included.
It is reasonable for the dye effect to be a "dye-by-gene interaction", but
the lab effect seems like it should be a mean effect across all genes. Why
can't normalization remove it completely?
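
To make the distinction concrete, here is a toy sketch (not limma, and not a
real normalization method such as loess or quantile normalization). It
assumes, purely for illustration, that the lab (or dye) effect is a global
shift plus a gene-specific component, and that normalization is simple
mean-centering of each array. All names and parameter values are made up.

```python
import random
import statistics

def simulate(n_genes=1000, global_shift=1.5, gene_sd=0.4, seed=0):
    """Return log-intensities for the same genes measured in two labs.

    Assumption (hypothetical): lab B = lab A + global shift + a
    gene-specific effect, i.e. a "lab-by-gene interaction".
    """
    rng = random.Random(seed)
    base = [rng.gauss(8.0, 1.0) for _ in range(n_genes)]
    gene_effect = [rng.gauss(0.0, gene_sd) for _ in range(n_genes)]
    lab_a = list(base)
    lab_b = [b + global_shift + g for b, g in zip(base, gene_effect)]
    return lab_a, lab_b

def mean_center(values):
    """Stand-in for normalization: subtract the array-wide mean."""
    m = statistics.fmean(values)
    return [v - m for v in values]

lab_a, lab_b = simulate()
norm_a = mean_center(lab_a)
norm_b = mean_center(lab_b)

# Per-gene difference between labs after normalization.
diff = [b - a for a, b in zip(norm_a, norm_b)]
mean_diff = statistics.fmean(diff)  # the array-wide (mean) lab effect
sd_diff = statistics.stdev(diff)    # spread of the gene-specific effects

print("mean difference after centering:", mean_diff)
print("sd of per-gene differences:", sd_diff)
```

In this sketch the mean difference goes to essentially zero after centering,
while the per-gene differences keep their spread. Under these assumptions a
per-gene test can still flag a lab or dye term for some genes even though the
average effect has been removed.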

Thanks
Fangxin

>  So,
> what we are calling the "dye-effect" is called the "dye by gene
> interaction" in some papers such as Churchill and Kerr, 2001.
>
> --Naomi
>
>
> At 04:31 PM 1/6/2005 -0800, Fangxin Hong wrote:
>>Is it possible that dye-effect is still tested to be significant for some
>>genes, let's say 40% of genes? Do we remove or keep this effect for all
>>genes?
>>I ran into this problem: the factor origin (e.g., different laboratories)
>>was significant for > 50% of genes, and what I did was keep it in the
>>model for all genes.
>>However, I can't figure out a nice explanation of this, like why dye
>>effect is only significant for 40% of genes, what does this tell us about
>>this effect.
>>
>>Thanks.
>>Fangxin
>> > I agree.  In my reply to Fangxin I should have added that I would
>> > remove a non-essential effect like a dye-effect if it appeared
>> > non-significant, but I'd remove it for all the genes.
>> >
>> > Gordon
>> >
>> > On Tue, January 4, 2005 1:18 am, Naomi Altman said:
>> >> Reducing the model based on removing nonsignificant effects is called
>> >> "pre-test estimation".  It is known to increase the false-positive
>> >> rate, even in the classical setting.  In the microarray setting, there
>> >> is no compelling reason to use pre-test estimators that differ from
>> >> gene to gene.
>> >>
>> >> --Naomi Altman
>> >>
>> >> At 10:57 PM 1/3/2005 +1100, Gordon K Smyth wrote:
>> >>> > Date: Sun, 2 Jan 2005 14:05:15 -0800 (PST)
>> >>> > From: "Fangxin Hong" <fhong at salk.edu>
>> >>> > Subject: [BioC] A question about Limma
>> >>> > To: bioconductor at stat.math.ethz.ch
>> >>> > Message-ID: <1867.66.75.240.64.1104703515.squirrel at 66.75.240.64>
>> >>> > Content-Type: text/plain;charset=iso-8859-1
>> >>> >
>> >>> > Hi Bioconductor users;
>> >>> > I have a general question about the limma model.
>> >>> > In the limma package, usually one linear model is applied to all
>> >>> > genes, and the error variances from all genes are modified
>> >>> > simultaneously. What if some factor, for example one main effect,
>> >>> > is significant for only some genes, and we want to identify genes
>> >>> > based on the significance of another main effect (of interest)?
>> >>> > What is the best way to do it? Currently I just
>> >>> > leave this factor in the model, which is applied to all genes,
>> >>>
>> >>>That's what I do, leave all terms in the models for all the genes.  I
>> >>>don't see a strong case for
>> >>>doing a separate model selection process for every gene.
>> >>>
>> >>> > but this
>> >>> > might under-estimate the total number of genes on which the effect
>> >>> > of interest is significant.
>> >>>
>> >>>Why do you think so?  The only disadvantage of keeping a
>> >>>non-significant term in the model is a reduction in residual degrees
>> >>>of freedom, with some consequential loss of power, but this
>> >>>disadvantage is mitigated by the empirical Bayes moderation process.
>> >>>
>> >>>Perhaps someday someone will work out a model selection theory for
>> >>>massively parallel regression
>> >>>situations like microarray experiments, but there isn't such a theory
>> >>>now.  It seems safer to me
>> >>>to have the same model for every gene, keeping all the 'a priori'
>> >>>important predictors in the
>> >>>model.
>> >>>
>> >>>Gordon
>> >>>
>> >>> > I am sorry if this question has been asked/answered here before; I
>> >>> > couldn't find it by searching the archive. Any comment, suggestion,
>> >>> > or experience is appreciated.
>> >>> >
>> >>> > Fangxin
>> >>> > --
>> >>> > Fangxin Hong, Ph.D.
>> >>> > Plant Biology Laboratory
>> >>> > The Salk Institute
>> >>> > 10010 N. Torrey Pines Rd.
>> >>> > La Jolla, CA 92037
>> >>> > E-mail: fhong at salk.edu
>> >>>
>> >>>_______________________________________________
>> >>>Bioconductor mailing list
>> >>>Bioconductor at stat.math.ethz.ch
>> >>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >>
>> >> Naomi S. Altman                                814-865-3791 (voice)
>> >> Associate Professor
>> >> Bioinformatics Consulting Center
>> >> Dept. of Statistics                              814-863-7114 (fax)
>> >> Penn State University                         814-865-1348 (Statistics)
>> >> University Park, PA 16802-2111
>> >>
>> >
>> >
>> >


-- 
Fangxin Hong, Ph.D.
Plant Biology Laboratory
The Salk Institute
10010 N. Torrey Pines Rd.
La Jolla, CA 92037
E-mail: fhong at salk.edu


