[BioC] odd limma results?

James W. MacDonald jmacdon at med.umich.edu
Tue Jan 11 18:24:20 CET 2011


Hi David,

On 1/11/2011 11:02 AM, David A. wrote:
>
> Hi list,
>
> I have a one-color setup with 238 samples and 6 conditions, for a geneset of just over 400 probes (targeted array). One of the conditions is control and the rest are experimental conditions. After normalization using vsn I would like to make a comparison first between the control and the rest and then between some of the conditions grouped. Being human samples I can expect great variability within conditions and thus not many genes coming as significant. But when I run a limma analysis, for some of the contrasts I get that all the genes come up significant with large B values and really small p-values. When doing boxplots of normalized expression values for some of these genes, the image doesn´t show they are that different, rather the opposite. Is my parametrization wrong? Are the results strangely significant because the reduced geneset to be analyzed? Being a targeted array probably many results should come up as significant, but I don't think all of them should show up
 as significant as they do for many of the contrasts.
> here is my code:
>
>> table(myeset$group)
>
>   0  1  2  3  4  5
> 68 60 18 53  6 33
>
>> design<- model.matrix(~0+factor(myeset$group))
>> colnames(design)<- c("G0","G1","G2","G3","G4","G5")
>
>> fit<- lmFit(myeset,design)
>> contrast.matrix<- makeContrasts((G1+G2+G3+G4+G5)-G0,(G1+G2+G3+G4)-G0, (G1+G2)-G0, G1-G2,(G1+G2+G3+G4)-G5, levels = design)

Most of these aren't contrasts. The definition of a contrast is that the 
coefficients should sum to 0. So if we look at e.g. the first 'contrast' 
above, it is equivalent to

(1*G1 + 1*G2 + 1*G3 + 1*G4 + 1*G5) - 1*G0

and the coefficients are thus

1 + 1 + 1 + 1 + 1 - 1, which isn't zero. So you need to divide each of 
the sums you create by the number of coefficients so the math works out.

Another way to look at this is to realize that your first question is 
'Is the average expression of samples G1-G5 different from G0?', which 
indicates again that you want the average of the estimated expression of 
the G1-G5 samples, not the sum.

Does that make sense?

Best,

Jim



>> fit2<- contrasts.fit(fit, contrast.matrix)
>> fit3<- ebayes(fit2)
>> topTable(fit3,adjust="BH",coef=1)
>               ID    logFC  AveExpr        t       P.Value     adj.P.Val        B
> 411        VTI2 39.30167 13.08281 149.2822 5.690975e-249 2.356064e-246 501.3006
> 315        PI14 38.07841 12.54843 138.1938 1.399494e-240 2.896952e-238 489.1714
> 49        CHRM1 35.60063 11.93787 135.3310 2.623964e-238 3.621070e-236 485.7677
> 67        DDX32 34.95446 11.61260 132.9274 2.310411e-236 2.391276e-234 482.8165
> 144 GI_18044224 35.16818 11.83657 132.1551 9.904483e-236 8.200912e-234 481.8495
> 385        TDE1 34.80442 11.65140 131.8907 1.633346e-235 1.127009e-233 481.5163
> 136 GI_16878151 35.98769 12.01306 130.9807 9.205817e-235 5.444583e-233 480.3612
> 269        MGLL 37.82152 12.80129 129.2519 2.539646e-233 1.168237e-231 478.1303
> 120 GI_15559367 37.07777 12.19571 130.4492 1.285421e-233 6.652055e-232 478.0643
> 61       CSNK1E 35.45598 11.93302 128.1286 2.243467e-232 9.287953e-231 476.6548
>
>
> Thanks for your help
>
> D.
>
>   		 	   		
> 	[[alternative HTML version deleted]]
>
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 



More information about the Bioconductor mailing list