[BioC] necessity of moderated t statistic and false discoveries for small predefined gene list?
Moshe Olshansky
olshansky at wehi.EDU.AU
Thu May 17 06:35:28 CEST 2012
Hi Rich,
I think that Gordon Smyth (the author of limma) has explained on this list
what the moderated t-statistic is.
The brief explanation is that when there are few samples, the estimate of
the variance used in a standard t-test is quite noisy, and because one must
account for this noise the standard t-test has low statistical power. The
empirical Bayes model used in the moderated t-test borrows information
across genes, so the variance is estimated with more confidence and the
test has better power. So it can be used even if you are interested in just
a few genes.
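To make the variance shrinkage concrete, here is a minimal sketch in plain Python (it is not limma's code). It assumes the usual form of the moderated variance: a degrees-of-freedom-weighted average of the gene's own variance and a prior variance. In limma the prior values are estimated from all the genes that were fitted; here `s2_prior` and `df_prior` are simply hypothetical inputs, as are the function names and the example numbers.

```python
import math


def ordinary_t(mean_diff, s2_gene, n1, n2):
    """Standard two-group t-statistic using the gene's own pooled variance."""
    se = math.sqrt(s2_gene * (1.0 / n1 + 1.0 / n2))
    return mean_diff / se


def moderated_t(mean_diff, s2_gene, df_gene, s2_prior, df_prior, n1, n2):
    """Two-group moderated t-statistic (illustrative sketch).

    The gene-wise variance is shrunk toward a prior variance, weighted by
    degrees of freedom. s2_prior and df_prior are assumed to be given;
    limma estimates them from the whole set of genes.
    Returns the statistic and its degrees of freedom.
    """
    s2_post = (df_prior * s2_prior + df_gene * s2_gene) / (df_prior + df_gene)
    se = math.sqrt(s2_post * (1.0 / n1 + 1.0 / n2))
    return mean_diff / se, df_prior + df_gene


# Made-up example: 3 vs 3 samples, and a gene whose sample variance
# happens to be much smaller than the prior variance.
t_ord = ordinary_t(mean_diff=0.5, s2_gene=0.01, n1=3, n2=3)
t_mod, df_mod = moderated_t(
    mean_diff=0.5, s2_gene=0.01, df_gene=4,
    s2_prior=0.05, df_prior=4, n1=3, n2=3,
)
print(t_ord, t_mod, df_mod)
```

With these made-up numbers the ordinary t-statistic is about 6.12 on 4 degrees of freedom, while the moderated one is about 3.54 on 8 degrees of freedom: the unusually small variance is pulled toward the prior value, and the extra degrees of freedom are where the gain in power comes from.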
The moderated t-statistic has (almost) nothing to do with the multiple
testing adjustment. Well, one may ask whether moderated p-values satisfy the
assumptions of multiple testing adjustment procedures (in particular BH),
but that is another story. Maybe Gordon will comment on this.
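As for the adjustment itself, the suggestion quoted further down this thread was to take the unadjusted p-values for the genes of interest and adjust only those. In R that is what p.adjust does; the sketch below is a plain Python illustration of the Benjamini-Hochberg step-up adjustment, so that the arithmetic is visible. The function name and the p-values are hypothetical, not output from any real analysis.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Each p-value is multiplied by n / rank (rank 1 = smallest p-value), and
    a running minimum taken from the largest p-value downward keeps the
    adjusted values monotone and capped at 1.
    """
    n = len(pvalues)
    # Indices ordered from the largest p-value to the smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical unadjusted (moderated) p-values for a few genes of interest.
subset_p = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(subset_p))
```

The point of adjusting over the subset is that n is the number of predefined genes rather than the number of probes on the chip, so the correction is far less severe; this is only legitimate if the genes were chosen before looking at the data.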
Best regards,
Moshe.
> Moshe and List,
>
> Thanks for your reply. The method you describe retains
> the raw p-value based on the moderated t-statistic and adjusts
> it to give an adjusted p-value (usually a false discovery rate).
> However, as I understand it, the moderated
> t-statistic given by Limma, based on
> all of the genes in the array, pools variance information
> to moderate the standard deviation to prevent fortuitously
> low p-values stemming from fortuitously low standard deviations
> encountered in thousands of multiple tests. I am wondering
> whether, if the experimentalist asks me to look up just 10 genes,
> I should use the unmoderated frequentist t-statistic, which
> will differ from the one in Limma and may imply significance
> where Limma does not. I guess another way to phrase it is
> "How many simultaneous tests does one need before one
> should prefer the moderated (empirical Bayes) statistic to the
> unmoderated one?" Or should I fit just those 10 genes
> (~30 affy probes) with Limma?
>
> Best wishes,
> Rich
>
>
>
> On Thu, 17 May 2012, Moshe Olshansky wrote:
>
>> Hi Rich,
>>
>> Whether to use the moderated t-statistic or not does not depend on
>> whether you are interested in the 10 particular genes or in all
>> differentially expressed ones. That choice affects only your multiple
>> testing adjustment.
>> The simplest way for you to proceed is to use limma as usual, get the
>> topTable but then take the UNADJUSTED p-values for your 10 genes of
>> interest and use the p.adjust function to adjust for multiple testing if
>> you wish. In any case you should also look at (log)Fold Changes.
>>
>> Best regards,
>> Moshe.
>>
>>
>>> Dear Bioconductor List.
>>>
>>> I am using Limma to analyze differential expression between 2
>>> conditions on an Affy chip.
>>> My experimental collaborator asks for the differential expression of
>>> 10 predefined genes.
>>>
>>> A. Should I correct for false discoveries based upon all of the genes
>>> on the chip?
>>> B. If not, should I correct for false discoveries just for the
>>> probe IDs for the 10 predefined genes?
>>> C. Should I use the moderated t-statistic or just use an unmoderated
>>> t-test for those 10 genes?
>>>
>>> Thanks and best wishes,
>>> Rich
>>> ------------------------------------------------------------
>>> Richard A. Friedman, PhD
>>> Associate Research Scientist,
>>> Biomedical Informatics Shared Resource
>>> Herbert Irving Comprehensive Cancer Center (HICCC)
>>> Lecturer,
>>> Department of Biomedical Informatics (DBMI)
>>> Educational Coordinator,
>>> Center for Computational Biology and Bioinformatics (C2B2)/
>>> National Center for Multiscale Analysis of Genomic Networks (MAGNet)
>>> Room 824
>>> Irving Cancer Research Center
>>> Columbia University
>>> 1130 St. Nicholas Ave
>>> New York, NY 10032
>>> (212)851-4765 (voice)
>>> friedman at cancercenter.columbia.edu
>>> http://cancercenter.columbia.edu/~friedman/
>>>
>>> "School is an evil plot to suppress my individuality"
>>>
>>> Rose Friedman, age 15
>>
>>
>>
>
> --
> ------------------------------------------------------------
> Richard A. Friedman, PhD
> Associate Research Scientist
> Herbert Irving Comprehensive Cancer Center
> Biomedical Informatics Shared Resource
> Lecturer
> Department of Biomedical Informatics
> Box 95, Room 130BB or P&S 1-420C
> Columbia University Medical Center
> 630 W. 168th St.
> New York, NY 10032
> (212)305-6901 (5-6901) (voice)
> friedman at cancercenter.columbia.edu
> http://cancercenter.columbia.edu/~friedman/
>
> "The last 250 pages of the last Harry Potter
> book took place in one day because a lot
> happened in that day. All of Ulysses takes
> place in one day and nothing happened in that day."
> -Rose Friedman, age 11
>
>