[BioC] EdgeR multi-factors testing questions

Gordon K Smyth smyth at wehi.EDU.AU
Thu Jan 30 01:14:09 CET 2014


Dear Yanzhu,

The culprit is your code here:

   glmFit(y,design,offset=offset  )

whereby you are feeding your own offset to glmFit() and hence overwriting 
any normalization that has been done.  Why are you doing this?  If you 
just remove the offset=offset argument and allow edgeR to work as normal, 
then the results will be correct.

Best wishes
Gordon

> Date: Tue, 28 Jan 2014 06:38:26 -0800 (PST)
> From: "Yanzhu [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, mlinyzh at gmail.com
> Subject: [BioC] EdgeR multi-factors testing questions
>
>
> Hi Ryan,
>
> The P-values are exactly identical, please see the example of the testing the three-way interaction (as below (II)):
>
> (I) I have checked the normalization factors for both methods,
> (i) TMM:
> head(y$sample$norm.factors)
> [1] 0.9404963 0.9056276 0.9928999 0.9961922 1.0109106 0.9115558
>
> (ii) UQ:
> head(y$sample$norm.factors)
> [1] 0.9883357 0.9625269 1.0414372 1.0143681 1.1134091 0.8992270
>
> (II) Example: the following is how I test the three-way interaction term:
> (i) TMM:
> group<-paste(L,S,R,sep=".")
> design<-model.matrix(~L+R+S+L:R+L:S+R:S+L:R:S)
> y<-DGEList(counts=counts,group=group)
> y<-calcNormFactors(y,method="TMM")
> offset=log((y$sample)[,2])
> y<-estimateGLMCommonDisp(y,design)
> y<-estimateGLMTagwiseDisp(y,design)
>
> fiteTMM_LRS<-glmFit(y,design,offset=offset  )
>
> ### testing the three-way interaction term L:R:S
> lrteTMM_LRS<-glmLRT(fiteTMM_LRS,coef=c(67:96))
>
> ### P-values:
> head((lrteTMM_LRS$table)[,33])
> [1] 1.233769e-30 5.648507e-30 4.254337e-06 9.304154e-05 8.504918e-01 1.075495e-41
>
> (ii) UQ:
> ##################### UQ
> group<-paste(L,S,R,sep=".")
> design<-model.matrix(~L+R+S+L:R+L:S+R:S+L:R:S)
> y<-DGEList(counts=counts,group=group)
> y<-calcNormFactors(y,method="upperquartile",p=0.75)
> offset=log((y$sample)[,2])
> y<-estimateGLMCommonDisp(y,design)
> y<-estimateGLMTagwiseDisp(y,design)
>
> fiteUQ_LRS<-glmFit(y,design,offset=offset  )
>
> ### testing the three-way interaction term L:R:S
> lrteUQ_LRS<-glmLRT(fiteUQ_LRS,coef=c(67:96))
>
> ### P-values:
> head((lrteUQ_LRS$table)[,33])
> [1] 1.233769e-30 5.648507e-30 4.254337e-06 9.304154e-05 8.504918e-01 1.075495e-41
>
> The P-values from both normalization methods are exactly identical, here I only show part of it.
>
>
> Thanks.
>
>
> Yanzhu
>
> -------------------------------------------------------
> I don't think it's surprising. TMM and upper quartile tend to give very
> similar normalization factors. Are you getting *exactly* identical
> P-values, or only very similar ones? Also, you can check the
> normalization factors themselves to see how much of a difference there
> is. If "d" is your DGEList object, you can get the normalization
> factors by typeing "d$samples$norm.factors.
>
> -Ryan
>
>
>
> -- output of sessionInfo():
>
> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] DESeq_1.12.1       lattice_0.20-15    locfit_1.5-9.1     Biobase_2.20.1     BiocGenerics_0.6.0 edgeR_3.2.4        limma_3.16.8
>
> loaded via a namespace (and not attached):
> [1] annotate_1.38.0      AnnotationDbi_1.22.6 DBI_0.2-7            genefilter_1.42.0    geneplotter_1.38.0   grid_3.0.1
> [7] IRanges_1.18.4       RColorBrewer_1.0-5   RSQLite_0.11.4       splines_3.0.1        stats4_3.0.1         survival_2.37-4
> [13] XML_3.98-1.1         xtable_1.7-1

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list