[BioC] Possible Bugs in Contrasts Design of edgeR
Yang Liu [guest]
guest at bioconductor.org
Thu Apr 25 17:59:21 CEST 2013
Hi,
I am a PhD student in the Department of Statistics at Penn State University. I have recently been using edgeR for my RNA-seq data. As I worked through it, I found what appear to be some bugs in the latest documentation.
My experimental design is exactly the same as the design (Comparisons Both Between and Within Subject) on page 32 in the latest documentation (last revised on 31 March 2013). From page 32 to 34, everything looks fine to me except for the contrasts defined at the end.
If I understand it properly, the procedure shown on those pages lets me estimate each effect in the design, and the effects can be listed with colnames(design). All contrasts should then be based on those estimated effects. The first contrast on page 34 is meant to find genes that respond differently to the hormone in disease1 vs healthy patients. The contrast looks as follows:
lrt <- glmLRT(fit, contrast = c(0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 1, 0))
However, this does not look right to me: it only puts weights ("-1" and "1") on the 10th and 11th effects, which correspond to "DiseaseHealthy:TreatmentHormone" and "DiseaseDisease1:TreatmentHormone". As I see it, the intercept (baseline) represents the combined effect of Healthy, Patient 1, and no treatment. To find genes that respond differently to the hormone in disease1 vs healthy patients (regardless of which patient it is), you cannot leave the weights for the 4th to 9th effects at zero. Also, since the contrast compares disease1 to healthy, the weight for the 2nd effect should not be zero either. I think the right contrast should be as follows:
lrt <- glmLRT(fit, contrast = c(0, 1, 0, -1/3, 1/3, 0, -1/3, 1/3, 0, -1, 1, 0))
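For reference, here is a minimal sketch that rebuilds a design matrix of this shape in base R, so that the coefficient positions mentioned above can be checked against colnames(design). The sample layout (three disease groups, three patients per group, each measured with and without hormone) and the model formula are assumptions reconstructed from the description in this post, not taken from the actual data. The names documented, proposed and weights are only illustrative.

```r
# Assumed sample layout (hypothetical reconstruction, not the real data):
# 3 disease groups x 3 patients per group x 2 treatments = 18 samples.
Disease <- factor(rep(c("Healthy", "Disease1", "Disease2"), each = 6),
                  levels = c("Healthy", "Disease1", "Disease2"))
Patient <- gl(3, 2, length = 18)   # patients 1-3, nested within each group
Treatment <- factor(rep(c("None", "Hormone"), times = 9),
                    levels = c("None", "Hormone"))

# Assumed formula: disease effect, patient within disease,
# and a hormone effect within each disease group.
design <- model.matrix(~ Disease + Disease:Patient + Disease:Treatment)

# The documented contrast and the alternative proposed above, labelled
# by coefficient name so each weight can be matched to its effect.
documented <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, -1, 1, 0)
proposed   <- c(0, 1, 0, -1/3, 1/3, 0, -1/3, 1/3, 0, -1, 1, 0)
weights <- data.frame(documented, proposed, row.names = colnames(design))
print(weights)
```

Printing the weights next to the coefficient names makes it easier to see which effects each contrast vector touches before passing it to glmLRT.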
Please feel free to correct me if I am wrong.
Thanks,
Yang Liu
-- output of sessionInfo():
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] edgeR_3.0.8 limma_3.14.4
--
Sent via the guest posting facility at bioconductor.org.