[BioC] edgeR: adjusting prior.n
J [guest]
guest at bioconductor.org
Fri Feb 7 04:07:19 CET 2014
Hello Bioconductor mailing list,
I'm using edgeR to analyze RNA-seq and RIP-seq data. I'm using the moderated gene-wise dispersion method where I choose a prior.n based upon the number of samples I have and how much shrinkage I desire to reduce variance in my data. I've followed the R user guide and a guide a professor made at my university to achieve gene-wise (tag-wise) dispersion as opposed to common dispersion. I can see that the tag-wise dispersion changes for each gene when I change the prior.n value (see below). However I notice that that p-values and logFC values don't change when I change the prior.n (see below).
So I'm wondering if I'm doing something wrong or if the p-values and/or logFC are not supposed to change? Or is it just some data is not effected at all by prior.n?
The following is an example of my command line code analyzing some RNA-seq data. I illustrated an example where I used a prior.n = 1 or 4 to illustrate how I don't see a change in the final results. Is this correct?
Thanks
J
normFact = calcNormFactors(TotalReads)
treatments = substr(colnames(TotalReads), 1, 3)
d = DGEList(counts = TotalReads, group = treatments, genes = rownames(TotalReads), lib.size = colSums(TotalReads) * normFact)
d = estimateCommonDisp(d)
d$common.dispersion
[1] 0.1669584
#comparing prior.n 1 v. 4
d1 = estimateTagwiseDisp(d, prior.n = 1)
d1
$tagwise.dispersion
[1] 0.1607020 0.3674590 0.1058665 0.1080470 0.1545314
20684 more elements â¦
d4 = estimateTagwiseDisp(d, prior.n = 4)
d4
$tagwise.dispersion
[1] 0.1463446 0.4059551 0.1278219 0.1213169 0.2150826
20684 more elements ...
#prior.n = 1
edgeHSMvHSFd1 = exactTest(d1, pair = c("HSM", "HSF"), (common.disp = FALSE ))
head(edgeHSMvHSFd1)
An object of class "DGEExact"
$table
logFC logCPM PValue
ENSG00000000003 -0.07130289 6.290327 6.073496e-01
ENSG00000000005 -2.28060195 -4.439294 1.000000e+00
ENSG00000000419 0.25083103 4.119127 8.780027e-02
ENSG00000000457 -0.19453270 4.557886 8.468506e-02
ENSG00000000460 -0.05015655 1.770160 9.075606e-01
ENSG00000000938 0.92221495 4.526491 1.227442e-11
#prior.n = 4
edgeHSMvHSFd4 = exactTest(d4, pair = c("HSM", "HSF"), (common.disp = FALSE ))
head(edgeHSMvHSFd4)
An object of class "DGEExact"
$table
logFC logCPM PValue
ENSG00000000003 -0.07130289 6.290327 6.073496e-01
ENSG00000000005 -2.28060195 -4.439294 1.000000e+00
ENSG00000000419 0.25083103 4.119127 8.780027e-02
ENSG00000000457 -0.19453270 4.557886 8.468506e-02
ENSG00000000460 -0.05015655 1.770160 9.075606e-01
ENSG00000000938 0.92221495 4.526491 1.227442e-11
sum(edgeHSMvHSFd1$table[,1]>0)
[1] 12770
sum(edgeHSMvHSFd4$table[,1]>0)
[1] 12770
#same results for DE genes for both
sum(edgeHSMvHSFd1$table[,1]>0 & edgeHSMvHSFd1$table[,3]<0.01)
[1] 1763
sum(edgeHSMvHSFd4$table[,1]>0 & edgeHSMvHSFd4$table[,3]<0.01)
[1] 1763
-- output of sessionInfo():
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] edgeR_2.6.2 limma_3.12.0
loaded via a namespace (and not attached):
[1] tools_2.15.0
--
Sent via the guest posting facility at bioconductor.org.
More information about the Bioconductor
mailing list