[BioC] limma small vs large number of samples

James W. MacDonald jmacdon at uw.edu
Tue Apr 29 16:20:01 CEST 2014


Hi Giovanni,

On 4/28/2014 8:54 PM, Giovanni Bucci wrote:
> Hello everybody,
>
> I have 32 samples, 4 factors with 2 levels each. Each level has 2
> replicates.
>
>> str(gxexprs)
>   num [1:15584, 1:32] 7.94 6.67 9.93 9.62 12.19 ...
>
>> Group
>   [1] R52VQ R52VQ R52VE R52VE R52EQ R52EQ R52EE R52EE R95VQ R95VQ R95VQ R95VE
> [13] R95VE R95VE R95EQ R95EQ R95EQ R95EE R95EE R95EE R97VQ R97VQ R97VQ R97VE
> [25] R97VE R97VE R97EQ R97EQ R97EQ R97EE R97EE R97EE
> 16 Levels: R52VQ R52VE R52EQ R52EE R95VQ R95VE R95EQ R95EE R97VQ ... R97EE
>
>
> design <- model.matrix(~0+Group)
> fit <- lmFit(gxexprs, design)
>
> contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design)
> fit2 <- contrasts.fit(fit, contrast.matrix)
> fit2 <- eBayes(fit2)
>
> TTable = topTable(fit2)
>
> global_p_val = TTable$P.Val
>
>
> gxexprs = gxexprs[, 1:4]
>
>
> ## same code as above but the expression matrix has only the first 4
> columns which represent the contrast tested above
>
> design <- model.matrix(~0+Group)
> fit <- lmFit(gxexprs, design)
>
> contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design)
> fit2 <- contrasts.fit(fit, contrast.matrix)
> fit2 <- eBayes(fit2)
>
> TTable = topTable(fit2)
>
> local_p_val = TTable$P.Val
>
> local_p_val has much greater values than global_p_val even though they
> represent the same comparison.
>
> What is the explanation for this?

The denominator of your t-statistic is based on the mean square error of
the model, which pools the within-group variance across all groups, not
just the two being compared. When the other groups are in the model, more
observations go into the variance estimate, so you get more residual
degrees of freedom for your test and a more accurate variance estimate.
In general that gives smaller p-values.
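
As a rough illustration of the degrees-of-freedom point, here is a
minimal base-R sketch for a single gene. It assumes the 12 groups and
group sizes shown in the Group printout above; the expression values are
simulated and the t-statistic of 3 is an arbitrary choice, so none of
the numbers come from the real data. It uses an ordinary lm() fit rather
than limma, so it leaves out the extra prior degrees of freedom that
eBayes() adds to the residual ones.

```r
set.seed(1)

# Group layout copied from the Group printout: 12 groups, 32 samples.
grp_names <- c("R52VQ", "R52VE", "R52EQ", "R52EE",
               "R95VQ", "R95VE", "R95EQ", "R95EE",
               "R97VQ", "R97VE", "R97EQ", "R97EE")
grp_sizes <- c(2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3)
Group <- factor(rep(grp_names, times = grp_sizes), levels = grp_names)

# Simulated log-expression values for one hypothetical gene.
y <- rnorm(length(Group), mean = 8, sd = 0.5)

# Fit with all samples, then with only the four samples in the contrast.
fit_all <- lm(y ~ 0 + Group)
keep <- 1:4
fit_sub <- lm(y[keep] ~ 0 + droplevels(Group[keep]))

# Residual degrees of freedom = number of samples - number of groups.
df_all <- df.residual(fit_all)  # 32 - 12
df_sub <- df.residual(fit_sub)  # 4 - 2

# Two-sided p-value for the same t-statistic under each df.
t_stat <- 3
p_all <- 2 * pt(-abs(t_stat), df = df_all)
p_sub <- 2 * pt(-abs(t_stat), df = df_sub)

print(c(df_all = df_all, df_sub = df_sub))
print(c(p_all = p_all, p_sub = p_sub))
```

The same t-statistic gives a smaller p-value with the larger residual
df. In an actual limma fit, the corresponding quantities can be
inspected in fit$df.residual and, after eBayes(), in fit2$df.prior and
fit2$df.total.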

Best,

Jim


>
> Can you point to some diagnostic functions that will show the difference?
>
> Thank you,
>
> Giovanni

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


