[BioC] How does limma t-test work?
James W. MacDonald
jmacdon at uw.edu
Fri Aug 31 15:08:46 CEST 2012
Hi Jorge,
On 8/30/2012 5:33 PM, Jorge Miró wrote:
> Hi,
>
> I am using the limma package for an analysis of differential
> expression and have a question about how the t-test in limma works. To
> my understanding, a difference between the usual t-test and the one
> used in limma is that the standard error in limma is calculated by
> using a linear method based on a Bayesian model (I don't really get
> how it works).
>
> In the users guide of limma it says that "This has the same
> interpretation as an ordinary t-statistic except that the standard
> errors have been moderated across genes, i.e., shrunk towards a common
> value, using a simple Bayesian model. This has the effect of borrowing
> information from the ensemble of genes to aid with inference about
> each individual gene". What exactly does it mean to borrow information
> from other genes? Is it, for example, the standard error of a gene on
> different arrays than the ones being compared, or the standard error of
> all other genes in the same arrays being compared, that is being used
> in the calculations?
It is based on all of the genes on the arrays you are using. The rationale
for doing this stems from the fact that the sample variance is a noisy
estimate when it is based on only a few observations: it takes a fair
number of observations before the sample variance settles close to the
true underlying variance that we are trying to estimate. In many microarray
analyses, we have far fewer observations than are really required to get
a good estimate of the variance, so we want to increase the precision of
this estimate.
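To make the small-sample problem concrete, here is a minimal simulation
sketch in Python (not limma code). It assumes normally distributed data
with a true variance of 1; the function name, the seed, and the choice of
3 versus 30 observations per gene are only illustrative.

```python
import random
import statistics

def sample_variances(n_obs, n_genes, rng):
    """Return one sample variance per simulated gene (n_obs values each)."""
    return [
        statistics.variance([rng.gauss(0.0, 1.0) for _ in range(n_obs)])
        for _ in range(n_genes)
    ]

rng = random.Random(1)  # fixed seed so the run is repeatable

# Every simulated gene has a true variance of 1; only the number of
# observations per gene differs between the two sets.
small = sample_variances(3, 2000, rng)
large = sample_variances(30, 2000, rng)

# How much do the per-gene variance estimates scatter around the truth?
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print(f"spread of variance estimates, n=3:  {spread_small:.2f}")
print(f"spread of variance estimates, n=30: {spread_large:.2f}")
```

With only three observations per gene, the estimates scatter far more
widely around the true value than with thirty, which is the imprecision
that borrowing information across genes is meant to reduce.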
One way to do that is to compute an expected variance that we think has
a higher probability of being representative of the true underlying
variance, and then adjust our observed values towards this expected
variance. This is what the eBayes() step does. It first computes an
'average' variance, based on all the genes on your array (which will be
more accurate because it is based on so much data). Then for each gene
we compute the sample variance, and then adjust that value towards the
expected variance that we computed from all genes.
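The adjustment described above can be sketched as a weighted average of
each gene's sample variance and the expected variance, with weights given
by degrees of freedom. The Python below is a rough illustration, not
limma's implementation: as far as I know, limma estimates the prior
variance and prior degrees of freedom from all genes with an empirical
Bayes fit, whereas here prior_var is just the plain mean and prior_df is
an arbitrary, hypothetical value. All names and numbers are made up for
the example.

```python
import statistics

def moderated_variance(s2, df, prior_var, prior_df):
    """Shrink a sample variance s2 (on df degrees of freedom) towards
    prior_var, which is given a weight of prior_df degrees of freedom."""
    return (prior_df * prior_var + df * s2) / (prior_df + df)

# Hypothetical per-gene sample variances from a small experiment.
sample_vars = [0.02, 0.10, 0.45, 1.80]
df = 2  # residual degrees of freedom, e.g. a 2 vs 2 comparison

# ASSUMPTION: a plain mean stands in for the 'average' variance, and
# prior_df is picked by hand; limma estimates both from the data.
prior_var = statistics.mean(sample_vars)
prior_df = 4.0

shrunk = [moderated_variance(s2, df, prior_var, prior_df) for s2 in sample_vars]
for s2, s2_post in zip(sample_vars, shrunk):
    print(f"sample variance {s2:.2f} -> moderated variance {s2_post:.2f}")
```

Very small variances are pulled up and very large ones are pulled down,
so a gene with an accidentally tiny sample variance no longer produces a
huge t-statistic. The moderated t-statistic then uses the shrunken
variance in place of the sample variance, on prior_df + df degrees of
freedom.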
Best,
Jim
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099