[BioC] edgeR

Jahn Davik [guest] guest at bioconductor.org
Wed Jan 29 08:40:04 CET 2014


Hi,
I am using edgeR to analyse counts data from an experiment consisting of two genotypes sampled at five time points. Three biological replicates. I have used the 'Classic approach' and the 'GLM approach', and expected to get the same results either way. At least, I expected to find the same topTags irrespectively of approach. (But maybe that is wrong?)

Here is what I've done:

CLASSIC APPROACH:
> # Classic part
> 
> y<-DGEList(counts=x,group=Group)
> y$samples
      group lib.size norm.factors
E00R1   E00 57811138            1
E00R2   E00 51440945            1
E00R3   E00 48826680            1
E01R1   E01 45632509            1
E01R2   E01 58231864            1
E01R3   E01 52760716            1
E05R1   E05 56470353            1
E05R2   E05 49744068            1
E05R3   E05 46002201            1
E48R2   E48 40331851            1
E48R3   E48 52134771            1
E96R1   E96 49449978            1
E96R2   E96 44233768            1
E96R3   E96 57043515            1
J00R1   J00 58919765            1
J00R2   J00 40388557            1
J00R3   J00 54126457            1
J01R1   J01 57302722            1
J01R2   J01 56813843            1
J01R3   J01 46711117            1
J05R1   J05 64733652            1
J05R2   J05 48922602            1
J05R3   J05 28696513            1
J48R2   J48 49544429            1
J48R3   J48 42298884            1
J96R1   J96 51538248            1
J96R2   J96 66995708            1
J96R3   J96 38325422            1
> 
> levels(y$samples$group)
 [1] "E00" "E01" "E05" "E48" "E96" "J00" "J01" "J05" "J48" "J96"
> y<-calcNormFactors(y)


> # estimating dispersion
> # over all genes
> y<-estimateCommonDisp(y)
> 
> # genewise
> 
> y<-estimateTagwiseDisp(y)
> 
> plotBCV(y)
> 
> et<-exactTest(y, pair=c("E00","J00"))
> topTags(et)
Comparison of groups:  J00-E00 
                   logFC   logCPM        PValue           FDR
comp394477_c0 -15.422419 5.602823  0.000000e+00  0.000000e+00
comp413538_c1 -11.685657 4.694669  0.000000e+00  0.000000e+00
comp431566_c0 -11.615584 4.405613  0.000000e+00  0.000000e+00
comp404433_c0   8.790634 4.710831  0.000000e+00  0.000000e+00
comp412635_c0   6.325601 4.519383  0.000000e+00  0.000000e+00
comp431005_c0   4.235474 6.494423  0.000000e+00  0.000000e+00
comp373625_c1   6.893877 3.728574 3.134846e-288 9.760118e-285
comp421285_c2  -4.744062 5.016100 6.226410e-278 1.696230e-274
comp419056_c0  13.453985 2.801532 6.846340e-239 1.657879e-235
comp401947_c0   7.987398 2.858410 1.078798e-217 2.351132e-214
> 
> et<-exactTest(y, pair=c("E01","J01"))
> topTags(et)
Comparison of groups:  J01-E01 
                   logFC   logCPM        PValue           FDR
comp413538_c1 -14.570734 4.694669  0.000000e+00  0.000000e+00
comp431566_c0 -14.185398 4.405613  0.000000e+00  0.000000e+00
comp394477_c0 -14.137519 5.602823  0.000000e+00  0.000000e+00
comp404433_c0  10.181217 4.710831  0.000000e+00  0.000000e+00
comp373625_c1   9.738900 3.728574  0.000000e+00  0.000000e+00
comp412635_c0   7.077064 4.519383  0.000000e+00  0.000000e+00
comp431005_c0   4.393042 6.494423  0.000000e+00  0.000000e+00
comp421285_c2  -4.656943 5.016100 2.038437e-267 5.553212e-264
comp404433_c1  13.310478 3.132245 1.777238e-232 4.303681e-229
comp419056_c0  11.369324 2.801532 2.968126e-222 6.468734e-219
> 
> et<-exactTest(y, pair=c("E05","J05"))
> topTags(et)
Comparison of groups:  J05-E05 
                   logFC   logCPM        PValue           FDR
comp404433_c0  12.392403 4.710831  0.000000e+00  0.000000e+00
comp394477_c0 -11.251416 5.602823  0.000000e+00  0.000000e+00
comp413538_c1 -11.075882 4.694669  0.000000e+00  0.000000e+00
comp412635_c0   7.071984 4.519383  0.000000e+00  0.000000e+00
comp431005_c0   3.866607 6.494423 5.618431e-314 2.448962e-310
comp373625_c1   9.542258 3.728574 4.235325e-310 1.538411e-306
comp431566_c0 -11.206496 4.405613 1.889575e-289 5.883056e-286
comp421285_c2  -4.448176 5.016100 4.956948e-245 1.350396e-241
comp379820_c0  -6.741883 4.283112 4.759103e-221 1.152443e-217
comp401947_c0  12.325326 2.858410 2.865886e-201 6.245911e-198
> 
> et<-exactTest(y, pair=c("E48","J48"))
> topTags(et)
Comparison of groups:  J48-E48 
                   logFC   logCPM        PValue           FDR
comp413538_c1 -11.577416 4.694669 9.806810e-268 2.137296e-263
comp412635_c0   7.534573 4.519383 4.519394e-256 4.924784e-252
comp373625_c1  11.099259 3.728574 4.211513e-231 3.059524e-227
comp421285_c2  -5.694947 5.016100 8.898521e-225 4.848359e-221
comp369941_c0  19.008105 9.936255 1.581987e-216 6.895566e-213
comp431566_c0 -13.422262 4.405613 1.567587e-181 5.693999e-178
comp404433_c0  12.908316 4.710831 2.647072e-159 8.241469e-156
comp394477_c0 -11.329087 5.602823 8.649066e-159 2.356222e-155
comp423647_c1  -3.235352 5.607951 1.216586e-156 2.946030e-153
comp379820_c0  -6.545352 4.283112 1.001698e-152 2.183100e-149
> 
> et<-exactTest(y, pair=c("E96","J96"))
> topTags(et)
Comparison of groups:  J96-E96 
                   logFC   logCPM        PValue           FDR
comp413538_c1 -14.558952 4.694669  0.000000e+00  0.000000e+00
comp431566_c0 -14.309061 4.405613  0.000000e+00  0.000000e+00
comp373625_c1  10.996613 3.728574  0.000000e+00  0.000000e+00
comp412635_c0   6.482673 4.519383  0.000000e+00  0.000000e+00
comp421285_c2  -6.022731 5.016100  0.000000e+00  0.000000e+00
comp404433_c0  12.027340 4.710831 3.172936e-312 1.152516e-308
comp369941_c0  17.982161 9.936255 2.109743e-292 6.568535e-289
comp423647_c1  -3.434699 5.607951 7.129247e-256 1.942185e-252
comp431005_c0   3.384003 6.494423 7.465632e-247 1.807844e-243
comp394477_c0 -13.722197 5.602823 9.993279e-244 2.177935e-240
> 

MY INTENTION SO FAR HAS BEEN TO COMPARE THE GENOTYPES, E AND J, AT THE DIFFERENT TIME POINTS. 
SO FAR, SO GOOD (?)

MOVING OVER TO THE GLM APPROACH:

> z<-x
> z<-DGEList(counts=x,group=Group)
> z$samples
      group lib.size norm.factors
E00R1     E 57811138            1
E00R2     E 51440945            1
E00R3     E 48826680            1
E01R1     E 45632509            1
E01R2     E 58231864            1
E01R3     E 52760716            1
E05R1     E 56470353            1
E05R2     E 49744068            1
E05R3     E 46002201            1
E48R2     E 40331851            1
E48R3     E 52134771            1
E96R1     E 49449978            1
E96R2     E 44233768            1
E96R3     E 57043515            1
J00R1     J 58919765            1
J00R2     J 40388557            1
J00R3     J 54126457            1
J01R1     J 57302722            1
J01R2     J 56813843            1
J01R3     J 46711117            1
J05R1     J 64733652            1
J05R2     J 48922602            1
J05R3     J 28696513            1
J48R2     J 49544429            1
J48R3     J 42298884            1
J96R1     J 51538248            1
J96R2     J 66995708            1
J96R3     J 38325422            1
> 
> z<-calcNormFactors(z)
> z<-x
> z<-DGEList(counts=z,group=Group)
> z$samples
      group lib.size norm.factors
E00R1     E 57811138            1
E00R2     E 51440945            1
E00R3     E 48826680            1
E01R1     E 45632509            1
E01R2     E 58231864            1
E01R3     E 52760716            1
E05R1     E 56470353            1
E05R2     E 49744068            1
E05R3     E 46002201            1
E48R2     E 40331851            1
E48R3     E 52134771            1
E96R1     E 49449978            1
E96R2     E 44233768            1
E96R3     E 57043515            1
J00R1     J 58919765            1
J00R2     J 40388557            1
J00R3     J 54126457            1
J01R1     J 57302722            1
J01R2     J 56813843            1
J01R3     J 46711117            1
J05R1     J 64733652            1
J05R2     J 48922602            1
J05R3     J 28696513            1
J48R2     J 49544429            1
J48R3     J 42298884            1
J96R1     J 51538248            1
J96R2     J 66995708            1
J96R3     J 38325422            1
> 
> z<-calcNormFactors(z)
> Group<-factor(substring(colnames(z),1,1))
> Group
 [1] E E E E E E E E E E E E E E J J J J J J J J J J J J J J
Levels: E J
> 
> Time<-factor(substring(colnames(z),2,3))
> Time
 [1] 00 00 00 01 01 01 05 05 05 48 48 96 96 96 00 00 00 01 01 01 05 05 05 48 48 96 96 96
Levels: 00 01 05 48 96
> 
> design<-model.matrix(~Group+Group:Time)
> rownames(design)<-colnames(z)
> design
      (Intercept) GroupJ GroupE:Time01 GroupJ:Time01 GroupE:Time05 GroupJ:Time05 GroupE:Time48 GroupJ:Time48 GroupE:Time96 GroupJ:Time96
E00R1           1      0             0             0             0             0             0             0             0             0
E00R2           1      0             0             0             0             0             0             0             0             0
E00R3           1      0             0             0             0             0             0             0             0             0
E01R1           1      0             1             0             0             0             0             0             0             0
E01R2           1      0             1             0             0             0             0             0             0             0
E01R3           1      0             1             0             0             0             0             0             0             0
E05R1           1      0             0             0             1             0             0             0             0             0
E05R2           1      0             0             0             1             0             0             0             0             0
E05R3           1      0             0             0             1             0             0             0             0             0
E48R2           1      0             0             0             0             0             1             0             0             0
E48R3           1      0             0             0             0             0             1             0             0             0
E96R1           1      0             0             0             0             0             0             0             1             0
E96R2           1      0             0             0             0             0             0             0             1             0
E96R3           1      0             0             0             0             0             0             0             1             0
J00R1           1      1             0             0             0             0             0             0             0             0
J00R2           1      1             0             0             0             0             0             0             0             0
J00R3           1      1             0             0             0             0             0             0             0             0
J01R1           1      1             0             1             0             0             0             0             0             0
J01R2           1      1             0             1             0             0             0             0             0             0
J01R3           1      1             0             1             0             0             0             0             0             0
J05R1           1      1             0             0             0             1             0             0             0             0
J05R2           1      1             0             0             0             1             0             0             0             0
J05R3           1      1             0             0             0             1             0             0             0             0
J48R2           1      1             0             0             0             0             0             1             0             0
J48R3           1      1             0             0             0             0             0             1             0             0
J96R1           1      1             0             0             0             0             0             0             0             1
J96R2           1      1             0             0             0             0             0             0             0             1
J96R3           1      1             0             0             0             0             0             0             0             1
attr(,"assign")
 [1] 0 1 2 2 2 2 2 2 2 2
attr(,"contrasts")
attr(,"contrasts")$Group
[1] "contr.treatment"

attr(,"contrasts")$Time
[1] "contr.treatment"

> z<-estimateGLMCommonDisp(z,design)
> 
> # genewise
> z<-estimateGLMTrendedDisp(z,design)
> z<-estimateGLMTagwiseDisp(z,design)
> 
> fit<-glmFit(z,design)
> lrt<-glmLRT(fit,coef=2)
> topTags(lrt)
Coefficient:  GroupJ 
                   logFC   logCPM       LR        PValue           FDR
comp394477_c0 -15.320108 5.602823 1599.742  0.000000e+00  0.000000e+00
comp413538_c1 -11.668893 4.694669 2063.236  0.000000e+00  0.000000e+00
comp431566_c0 -11.598820 4.405613 1505.514  0.000000e+00  0.000000e+00
comp404433_c0   8.789010 4.710831 1614.360  0.000000e+00  0.000000e+00
comp412635_c0   6.325157 4.519383 1638.631  0.000000e+00  0.000000e+00
comp431005_c0   4.235436 6.494423 1572.862  0.000000e+00  0.000000e+00
comp421285_c2  -4.743959 5.016100 1273.102 7.917312e-279 2.464999e-275
comp373625_c1   6.892714 3.728574 1255.435 5.468549e-275 1.489769e-271
comp419056_c0  13.351686 2.801532 1103.131 6.893184e-242 1.669223e-238
comp401947_c0   7.983599 2.858410 1005.328 1.247665e-220 2.719160e-217
> 
> # testing the Group DE at 01h
> lrt<-glmLRT(fit,contrast=c(0,0,-1,1,0,0,0,0,0,0))
> topTags(lrt)
Coefficient:  -1*GroupE:Time01 1*GroupJ:Time01 
                  logFC   logCPM        LR       PValue          FDR
comp401050_c0  3.493540 4.889617 169.62289 8.943982e-39 1.949251e-34
comp427887_c0  2.494577 6.359053 151.68090 7.439934e-35 8.107296e-31
comp429669_c1  3.461429 7.202470 139.75242 3.015465e-32 2.190635e-28
comp425652_c1 -3.747270 5.578305 121.36647 3.176789e-28 1.467673e-24
comp431260_c0 -1.510660 6.227693 121.25102 3.367149e-28 1.467673e-24
comp428540_c0  2.458811 4.676311  95.33553 1.607026e-22 5.837254e-19
comp437641_c0  1.729965 5.735428  89.12381 3.708599e-21 1.010451e-17
comp415376_c0 -3.768543 1.566064  89.12355 3.709096e-21 1.010451e-17
comp419207_c0 -2.096465 7.000344  88.31376 5.585358e-21 1.352525e-17
comp428866_c0  1.503459 4.857282  85.41288 2.421386e-20 5.277168e-17
> 
> # testing the Group DE at 05h
> lrt<-glmLRT(fit,contrast=c(0,0,0,0,-1,1,0,0,0,0))
> topTags(lrt)
Coefficient:  -1*GroupE:Time05 1*GroupJ:Time05 
                  logFC   logCPM       LR       PValue          FDR
comp436562_c0  3.559768 8.677505 245.0805 3.068810e-55 6.688165e-51
comp419735_c0  7.941315 6.207378 200.3481 1.753344e-45 1.910619e-41
comp401050_c0  3.464120 4.889617 166.9922 3.358441e-38 2.439795e-34
comp427887_c0  2.392173 6.359053 139.7174 3.069165e-32 1.672235e-28
comp429669_c1  3.377087 7.202470 133.3528 7.569234e-31 2.972487e-27
comp415376_c0 -4.542037 1.566064 133.1979 8.183411e-31 2.972487e-27
comp425652_c1 -3.682664 5.578305 117.3181 2.445272e-27 7.613181e-24
comp430561_c0 -2.003697 5.119557 108.4644 2.126367e-25 5.531729e-22
comp433200_c0  2.289307 4.954039 108.3223 2.284370e-25 5.531729e-22
comp429162_c0  1.995171 3.402911 107.9965 2.692502e-25 5.868040e-22
> 
> # testing the Group DE at 48h
> lrt<-glmLRT(fit,contrast=c(0,0,0,0,0,0,-1,1,0,0))
> topTags(lrt)
Coefficient:  -1*GroupE:Time48 1*GroupJ:Time48 
                  logFC   logCPM       LR        PValue           FDR
comp369941_c0 18.541426 9.936255 823.6359 3.919895e-181 8.543020e-177
comp417702_c0 -3.622411 3.990389 207.6927  4.377967e-47  4.770670e-43
comp417690_c0 -3.347656 5.239097 176.9797  2.212611e-40  1.607388e-36
comp436283_c0 -3.851908 1.834438 164.1105  1.430948e-37  7.796518e-34
comp417686_c0 -3.156963 3.568423 141.8250  1.062032e-32  4.629184e-29
comp417690_c1 -3.248463 4.347931 129.4769  5.333285e-30  1.937227e-26
comp423176_c0 -2.346243 5.066883 125.6650  3.640327e-29  1.133390e-25
comp431260_c0 -1.695085 6.227693 121.4132  3.102817e-28  8.452849e-25
comp419207_c0 -2.760753 7.000344 119.4983  8.146313e-28  1.972675e-24
comp435553_c0  1.785259 5.622214 110.4265  7.902139e-26  1.722192e-22
> 
> # testing the Group DE at 96h
> lrt<-glmLRT(fit,contrast=c(0,0,0,0,0,0,0,0,-1,1))
> topTags(lrt)
Coefficient:  -1*GroupE:Time96 1*GroupJ:Time96 
                  logFC   logCPM       LR        PValue           FDR
comp369941_c0 17.174042 9.936255 978.8875 6.972860e-215 1.519665e-210
comp431260_c0 -2.005239 6.227693 211.6925  5.869844e-48  6.396369e-44
comp417702_c0 -2.822779 3.990389 155.0349  1.375824e-35  9.994901e-32
comp423176_c0 -2.224759 5.066883 141.4204  1.302007e-32  7.093983e-29
comp419207_c0 -2.629442 7.000344 135.4855  2.585533e-31  1.126982e-27
comp435199_c0 -1.941443 6.089785 134.1832  4.981873e-31  1.809582e-27
comp428866_c0  1.871781 4.857282 132.4752  1.177677e-30  3.666612e-27
comp437434_c0 -1.928556 5.025404 122.7183  1.607256e-28  4.378567e-25
comp437540_c1 -1.719987 7.634490 122.0333  2.269992e-28  5.496911e-25
comp424275_c1  2.829585 2.058209 113.6028  1.592057e-26  3.469729e-23

SO, EXCEPT FOR THE FIRST TIME POINT (00) THE COMPARISONS ARE NOT SIMILAR. I WOULD VERY MUCH RECEIVE COMMENTS ON THIS AND IN PARTICULAR INDICATION(S) ON WHERE I ERR.

THANK YOU.
JAHN


 

 -- output of sessionInfo(): 

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Norwegian (Bokmål)_Norway.1252  LC_CTYPE=Norwegian (Bokmål)_Norway.1252    LC_MONETARY=Norwegian (Bokmål)_Norway.1252 LC_NUMERIC=C                              
[5] LC_TIME=Norwegian (Bokmål)_Norway.1252    

attached base packages:
[1] splines   parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] edgeR_3.4.2        limma_3.18.6       DESeq_1.14.0       lattice_0.20-24    locfit_1.5-9.1     Biobase_2.22.0     BiocGenerics_0.8.0

loaded via a namespace (and not attached):
 [1] annotate_1.40.0      AnnotationDbi_1.24.0 DBI_0.2-7            genefilter_1.44.0    geneplotter_1.40.0   grid_3.0.2           IRanges_1.20.6       lme4_1.0-5          
 [9] MASS_7.3-29          Matrix_1.1-1.1       minqa_1.2.2          nlme_3.1-113         RColorBrewer_1.0-5   RSQLite_0.11.4       stats4_3.0.2         survival_2.37-4     
[17] tools_3.0.2          XML_3.98-1.1         xtable_1.7-1        


--
Sent via the guest posting facility at bioconductor.org.



More information about the Bioconductor mailing list