[BioC] edgeR

Jahn Davik jahn.davik at bioforsk.no
Thu Jan 30 08:12:57 CET 2014


Thank you for your reply, Gordon.
Could I bother you to advice me how to do that - do the same tests under the glm approach as in the classic approach? I would be much obliged and I might learn something. I come from the SAS world and R thinking is still a little cryptic to me.

Best regards, 
jahn
-----Opprinnelig melding-----
Fra: Gordon K Smyth [mailto:smyth at wehi.EDU.AU] 
Sendt: 30. januar 2014 01:34
Til: Jahn Davik
Kopi: Bioconductor mailing list
Emne: edgeR

Dear Jahn,

The classic and glm approaches do give similar results, but if they were identical then we wouldn't need to keep both in package.  No two methods will give thousands of genes in exactly the same order however.

In this case, though, it doesn't seem to me that you are testing the same comparisons for the classic and glm approaches. All the comparisons that you make in the classic approach are between groups at a fixed time (Group nested within Time) but the factorial model used

   design<-model.matrix(~Group+Group:Time)

has Time nested within Group.  Except at time zero, it doesn't seem to me that you are testing the same comparisons for the classic and glm approaches.

Best wishes
Gordon

> Date: Tue, 28 Jan 2014 23:40:04 -0800 (PST)
> From: "Jahn Davik [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, jahn.davik at bioforsk.no
> Subject: [BioC] edgeR
>
>
> Hi,

> I am using edgeR to analyse counts data from an experiment consisting 
> of two genotypes sampled at five time points. Three biological replicates.
> I have used the 'Classic approach' and the 'GLM approach', and 
> expected to get the same results either way. At least, I expected to 
> find the same topTags irrespectively of approach. (But maybe that is 
> wrong?)
>
> Here is what I've done:
>
> CLASSIC APPROACH:
>> # Classic part
>>
>> y<-DGEList(counts=x,group=Group)
>> y$samples
>      group lib.size norm.factors
> E00R1   E00 57811138            1
> E00R2   E00 51440945            1
> E00R3   E00 48826680            1
> E01R1   E01 45632509            1
> E01R2   E01 58231864            1
> E01R3   E01 52760716            1
> E05R1   E05 56470353            1
> E05R2   E05 49744068            1
> E05R3   E05 46002201            1
> E48R2   E48 40331851            1
> E48R3   E48 52134771            1
> E96R1   E96 49449978            1
> E96R2   E96 44233768            1
> E96R3   E96 57043515            1
> J00R1   J00 58919765            1
> J00R2   J00 40388557            1
> J00R3   J00 54126457            1
> J01R1   J01 57302722            1
> J01R2   J01 56813843            1
> J01R3   J01 46711117            1
> J05R1   J05 64733652            1
> J05R2   J05 48922602            1
> J05R3   J05 28696513            1
> J48R2   J48 49544429            1
> J48R3   J48 42298884            1
> J96R1   J96 51538248            1
> J96R2   J96 66995708            1
> J96R3   J96 38325422            1
>>
>> levels(y$samples$group)
> [1] "E00" "E01" "E05" "E48" "E96" "J00" "J01" "J05" "J48" "J96"
>> y<-calcNormFactors(y)
>
>
>> # estimating dispersion
>> # over all genes
>> y<-estimateCommonDisp(y)
>>
>> # genewise
>>
>> y<-estimateTagwiseDisp(y)
>>
>> plotBCV(y)
>>
>> et<-exactTest(y, pair=c("E00","J00"))
>> topTags(et)
> Comparison of groups:  J00-E00
>                   logFC   logCPM        PValue           FDR
> comp394477_c0 -15.422419 5.602823  0.000000e+00  0.000000e+00
> comp413538_c1 -11.685657 4.694669  0.000000e+00  0.000000e+00
> comp431566_c0 -11.615584 4.405613  0.000000e+00  0.000000e+00
> comp404433_c0   8.790634 4.710831  0.000000e+00  0.000000e+00
> comp412635_c0   6.325601 4.519383  0.000000e+00  0.000000e+00
> comp431005_c0   4.235474 6.494423  0.000000e+00  0.000000e+00
> comp373625_c1   6.893877 3.728574 3.134846e-288 9.760118e-285
> comp421285_c2  -4.744062 5.016100 6.226410e-278 1.696230e-274
> comp419056_c0  13.453985 2.801532 6.846340e-239 1.657879e-235
> comp401947_c0   7.987398 2.858410 1.078798e-217 2.351132e-214
>>
>> et<-exactTest(y, pair=c("E01","J01"))
>> topTags(et)
> Comparison of groups:  J01-E01
>                   logFC   logCPM        PValue           FDR
> comp413538_c1 -14.570734 4.694669  0.000000e+00  0.000000e+00
> comp431566_c0 -14.185398 4.405613  0.000000e+00  0.000000e+00
> comp394477_c0 -14.137519 5.602823  0.000000e+00  0.000000e+00
> comp404433_c0  10.181217 4.710831  0.000000e+00  0.000000e+00
> comp373625_c1   9.738900 3.728574  0.000000e+00  0.000000e+00
> comp412635_c0   7.077064 4.519383  0.000000e+00  0.000000e+00
> comp431005_c0   4.393042 6.494423  0.000000e+00  0.000000e+00
> comp421285_c2  -4.656943 5.016100 2.038437e-267 5.553212e-264
> comp404433_c1  13.310478 3.132245 1.777238e-232 4.303681e-229
> comp419056_c0  11.369324 2.801532 2.968126e-222 6.468734e-219
>>
>> et<-exactTest(y, pair=c("E05","J05"))
>> topTags(et)
> Comparison of groups:  J05-E05
>                   logFC   logCPM        PValue           FDR
> comp404433_c0  12.392403 4.710831  0.000000e+00  0.000000e+00
> comp394477_c0 -11.251416 5.602823  0.000000e+00  0.000000e+00
> comp413538_c1 -11.075882 4.694669  0.000000e+00  0.000000e+00
> comp412635_c0   7.071984 4.519383  0.000000e+00  0.000000e+00
> comp431005_c0   3.866607 6.494423 5.618431e-314 2.448962e-310
> comp373625_c1   9.542258 3.728574 4.235325e-310 1.538411e-306
> comp431566_c0 -11.206496 4.405613 1.889575e-289 5.883056e-286
> comp421285_c2  -4.448176 5.016100 4.956948e-245 1.350396e-241
> comp379820_c0  -6.741883 4.283112 4.759103e-221 1.152443e-217
> comp401947_c0  12.325326 2.858410 2.865886e-201 6.245911e-198
>>
>> et<-exactTest(y, pair=c("E48","J48"))
>> topTags(et)
> Comparison of groups:  J48-E48
>                   logFC   logCPM        PValue           FDR
> comp413538_c1 -11.577416 4.694669 9.806810e-268 2.137296e-263
> comp412635_c0   7.534573 4.519383 4.519394e-256 4.924784e-252
> comp373625_c1  11.099259 3.728574 4.211513e-231 3.059524e-227
> comp421285_c2  -5.694947 5.016100 8.898521e-225 4.848359e-221
> comp369941_c0  19.008105 9.936255 1.581987e-216 6.895566e-213
> comp431566_c0 -13.422262 4.405613 1.567587e-181 5.693999e-178
> comp404433_c0  12.908316 4.710831 2.647072e-159 8.241469e-156
> comp394477_c0 -11.329087 5.602823 8.649066e-159 2.356222e-155
> comp423647_c1  -3.235352 5.607951 1.216586e-156 2.946030e-153
> comp379820_c0  -6.545352 4.283112 1.001698e-152 2.183100e-149
>>
>> et<-exactTest(y, pair=c("E96","J96"))
>> topTags(et)
> Comparison of groups:  J96-E96
>                   logFC   logCPM        PValue           FDR
> comp413538_c1 -14.558952 4.694669  0.000000e+00  0.000000e+00
> comp431566_c0 -14.309061 4.405613  0.000000e+00  0.000000e+00
> comp373625_c1  10.996613 3.728574  0.000000e+00  0.000000e+00
> comp412635_c0   6.482673 4.519383  0.000000e+00  0.000000e+00
> comp421285_c2  -6.022731 5.016100  0.000000e+00  0.000000e+00
> comp404433_c0  12.027340 4.710831 3.172936e-312 1.152516e-308
> comp369941_c0  17.982161 9.936255 2.109743e-292 6.568535e-289
> comp423647_c1  -3.434699 5.607951 7.129247e-256 1.942185e-252
> comp431005_c0   3.384003 6.494423 7.465632e-247 1.807844e-243
> comp394477_c0 -13.722197 5.602823 9.993279e-244 2.177935e-240
>>
>
> MY INTENTION SO FAR HAS BEEN TO COMPARE THE GENOTYPES, E AND J, AT THE DIFFERENT TIME POINTS.
> SO FAR, SO GOOD (?)
>
> MOVING OVER TO THE GLM APPROACH:
>
>> z<-x
>> z<-DGEList(counts=x,group=Group)
>> z$samples
>      group lib.size norm.factors
> E00R1     E 57811138            1
> E00R2     E 51440945            1
> E00R3     E 48826680            1
> E01R1     E 45632509            1
> E01R2     E 58231864            1
> E01R3     E 52760716            1
> E05R1     E 56470353            1
> E05R2     E 49744068            1
> E05R3     E 46002201            1
> E48R2     E 40331851            1
> E48R3     E 52134771            1
> E96R1     E 49449978            1
> E96R2     E 44233768            1
> E96R3     E 57043515            1
> J00R1     J 58919765            1
> J00R2     J 40388557            1
> J00R3     J 54126457            1
> J01R1     J 57302722            1
> J01R2     J 56813843            1
> J01R3     J 46711117            1
> J05R1     J 64733652            1
> J05R2     J 48922602            1
> J05R3     J 28696513            1
> J48R2     J 49544429            1
> J48R3     J 42298884            1
> J96R1     J 51538248            1
> J96R2     J 66995708            1
> J96R3     J 38325422            1
>>
>> z<-calcNormFactors(z)
>> z<-x
>> z<-DGEList(counts=z,group=Group)
>> z$samples
>      group lib.size norm.factors
> E00R1     E 57811138            1
> E00R2     E 51440945            1
> E00R3     E 48826680            1
> E01R1     E 45632509            1
> E01R2     E 58231864            1
> E01R3     E 52760716            1
> E05R1     E 56470353            1
> E05R2     E 49744068            1
> E05R3     E 46002201            1
> E48R2     E 40331851            1
> E48R3     E 52134771            1
> E96R1     E 49449978            1
> E96R2     E 44233768            1
> E96R3     E 57043515            1
> J00R1     J 58919765            1
> J00R2     J 40388557            1
> J00R3     J 54126457            1
> J01R1     J 57302722            1
> J01R2     J 56813843            1
> J01R3     J 46711117            1
> J05R1     J 64733652            1
> J05R2     J 48922602            1
> J05R3     J 28696513            1
> J48R2     J 49544429            1
> J48R3     J 42298884            1
> J96R1     J 51538248            1
> J96R2     J 66995708            1
> J96R3     J 38325422            1
>>
>> z<-calcNormFactors(z)
>> Group<-factor(substring(colnames(z),1,1))
>> Group
> [1] E E E E E E E E E E E E E E J J J J J J J J J J J J J J
> Levels: E J
>>
>> Time<-factor(substring(colnames(z),2,3))
>> Time
> [1] 00 00 00 01 01 01 05 05 05 48 48 96 96 96 00 00 00 01 01 01 05 05 
> 05 48 48 96 96 96
> Levels: 00 01 05 48 96
>>
>> design<-model.matrix(~Group+Group:Time)
>> rownames(design)<-colnames(z)
>> design
>      (Intercept) GroupJ GroupE:Time01 GroupJ:Time01 GroupE:Time05 GroupJ:Time05 GroupE:Time48 GroupJ:Time48 GroupE:Time96 GroupJ:Time96
> E00R1           1      0             0             0             0             0             0             0             0             0
> E00R2           1      0             0             0             0             0             0             0             0             0
> E00R3           1      0             0             0             0             0             0             0             0             0
> E01R1           1      0             1             0             0             0             0             0             0             0
> E01R2           1      0             1             0             0             0             0             0             0             0
> E01R3           1      0             1             0             0             0             0             0             0             0
> E05R1           1      0             0             0             1             0             0             0             0             0
> E05R2           1      0             0             0             1             0             0             0             0             0
> E05R3           1      0             0             0             1             0             0             0             0             0
> E48R2           1      0             0             0             0             0             1             0             0             0
> E48R3           1      0             0             0             0             0             1             0             0             0
> E96R1           1      0             0             0             0             0             0             0             1             0
> E96R2           1      0             0             0             0             0             0             0             1             0
> E96R3           1      0             0             0             0             0             0             0             1             0
> J00R1           1      1             0             0             0             0             0             0             0             0
> J00R2           1      1             0             0             0             0             0             0             0             0
> J00R3           1      1             0             0             0             0             0             0             0             0
> J01R1           1      1             0             1             0             0             0             0             0             0
> J01R2           1      1             0             1             0             0             0             0             0             0
> J01R3           1      1             0             1             0             0             0             0             0             0
> J05R1           1      1             0             0             0             1             0             0             0             0
> J05R2           1      1             0             0             0             1             0             0             0             0
> J05R3           1      1             0             0             0             1             0             0             0             0
> J48R2           1      1             0             0             0             0             0             1             0             0
> J48R3           1      1             0             0             0             0             0             1             0             0
> J96R1           1      1             0             0             0             0             0             0             0             1
> J96R2           1      1             0             0             0             0             0             0             0             1
> J96R3           1      1             0             0             0             0             0             0             0             1
> attr(,"assign")
> [1] 0 1 2 2 2 2 2 2 2 2
> attr(,"contrasts")
> attr(,"contrasts")$Group
> [1] "contr.treatment"
>
> attr(,"contrasts")$Time
> [1] "contr.treatment"
>
>> z<-estimateGLMCommonDisp(z,design)
>>
>> # genewise
>> z<-estimateGLMTrendedDisp(z,design)
>> z<-estimateGLMTagwiseDisp(z,design)
>>
>> fit<-glmFit(z,design)
>> lrt<-glmLRT(fit,coef=2)
>> topTags(lrt)
> Coefficient:  GroupJ
>                   logFC   logCPM       LR        PValue           FDR
> comp394477_c0 -15.320108 5.602823 1599.742  0.000000e+00  0.000000e+00
> comp413538_c1 -11.668893 4.694669 2063.236  0.000000e+00  0.000000e+00
> comp431566_c0 -11.598820 4.405613 1505.514  0.000000e+00  0.000000e+00
> comp404433_c0   8.789010 4.710831 1614.360  0.000000e+00  0.000000e+00
> comp412635_c0   6.325157 4.519383 1638.631  0.000000e+00  0.000000e+00
> comp431005_c0   4.235436 6.494423 1572.862  0.000000e+00  0.000000e+00
> comp421285_c2  -4.743959 5.016100 1273.102 7.917312e-279 2.464999e-275
> comp373625_c1   6.892714 3.728574 1255.435 5.468549e-275 1.489769e-271
> comp419056_c0  13.351686 2.801532 1103.131 6.893184e-242 1.669223e-238
> comp401947_c0   7.983599 2.858410 1005.328 1.247665e-220 2.719160e-217
>>
>> # testing the Group DE at 01h
>> lrt<-glmLRT(fit,contrast=c(0,0,-1,1,0,0,0,0,0,0))
>> topTags(lrt)
> Coefficient:  -1*GroupE:Time01 1*GroupJ:Time01
>                  logFC   logCPM        LR       PValue          FDR
> comp401050_c0  3.493540 4.889617 169.62289 8.943982e-39 1.949251e-34
> comp427887_c0  2.494577 6.359053 151.68090 7.439934e-35 8.107296e-31
> comp429669_c1  3.461429 7.202470 139.75242 3.015465e-32 2.190635e-28
> comp425652_c1 -3.747270 5.578305 121.36647 3.176789e-28 1.467673e-24
> comp431260_c0 -1.510660 6.227693 121.25102 3.367149e-28 1.467673e-24
> comp428540_c0  2.458811 4.676311  95.33553 1.607026e-22 5.837254e-19
> comp437641_c0  1.729965 5.735428  89.12381 3.708599e-21 1.010451e-17
> comp415376_c0 -3.768543 1.566064  89.12355 3.709096e-21 1.010451e-17
> comp419207_c0 -2.096465 7.000344  88.31376 5.585358e-21 1.352525e-17
> comp428866_c0  1.503459 4.857282  85.41288 2.421386e-20 5.277168e-17
>>
>> # testing the Group DE at 05h
>> lrt<-glmLRT(fit,contrast=c(0,0,0,0,-1,1,0,0,0,0))
>> topTags(lrt)
> Coefficient:  -1*GroupE:Time05 1*GroupJ:Time05
>                  logFC   logCPM       LR       PValue          FDR
> comp436562_c0  3.559768 8.677505 245.0805 3.068810e-55 6.688165e-51
> comp419735_c0  7.941315 6.207378 200.3481 1.753344e-45 1.910619e-41
> comp401050_c0  3.464120 4.889617 166.9922 3.358441e-38 2.439795e-34
> comp427887_c0  2.392173 6.359053 139.7174 3.069165e-32 1.672235e-28
> comp429669_c1  3.377087 7.202470 133.3528 7.569234e-31 2.972487e-27
> comp415376_c0 -4.542037 1.566064 133.1979 8.183411e-31 2.972487e-27
> comp425652_c1 -3.682664 5.578305 117.3181 2.445272e-27 7.613181e-24
> comp430561_c0 -2.003697 5.119557 108.4644 2.126367e-25 5.531729e-22
> comp433200_c0  2.289307 4.954039 108.3223 2.284370e-25 5.531729e-22
> comp429162_c0  1.995171 3.402911 107.9965 2.692502e-25 5.868040e-22
>>
>> # testing the Group DE at 48h
>> lrt<-glmLRT(fit,contrast=c(0,0,0,0,0,0,-1,1,0,0))
>> topTags(lrt)
> Coefficient:  -1*GroupE:Time48 1*GroupJ:Time48
>                  logFC   logCPM       LR        PValue           FDR
> comp369941_c0 18.541426 9.936255 823.6359 3.919895e-181 8.543020e-177
> comp417702_c0 -3.622411 3.990389 207.6927  4.377967e-47  4.770670e-43
> comp417690_c0 -3.347656 5.239097 176.9797  2.212611e-40  1.607388e-36
> comp436283_c0 -3.851908 1.834438 164.1105  1.430948e-37  7.796518e-34
> comp417686_c0 -3.156963 3.568423 141.8250  1.062032e-32  4.629184e-29
> comp417690_c1 -3.248463 4.347931 129.4769  5.333285e-30  1.937227e-26
> comp423176_c0 -2.346243 5.066883 125.6650  3.640327e-29  1.133390e-25
> comp431260_c0 -1.695085 6.227693 121.4132  3.102817e-28  8.452849e-25
> comp419207_c0 -2.760753 7.000344 119.4983  8.146313e-28  1.972675e-24
> comp435553_c0  1.785259 5.622214 110.4265  7.902139e-26  1.722192e-22
>>
>> # testing the Group DE at 96h
>> lrt<-glmLRT(fit,contrast=c(0,0,0,0,0,0,0,0,-1,1))
>> topTags(lrt)
> Coefficient:  -1*GroupE:Time96 1*GroupJ:Time96
>                  logFC   logCPM       LR        PValue           FDR
> comp369941_c0 17.174042 9.936255 978.8875 6.972860e-215 1.519665e-210
> comp431260_c0 -2.005239 6.227693 211.6925  5.869844e-48  6.396369e-44
> comp417702_c0 -2.822779 3.990389 155.0349  1.375824e-35  9.994901e-32
> comp423176_c0 -2.224759 5.066883 141.4204  1.302007e-32  7.093983e-29
> comp419207_c0 -2.629442 7.000344 135.4855  2.585533e-31  1.126982e-27
> comp435199_c0 -1.941443 6.089785 134.1832  4.981873e-31  1.809582e-27
> comp428866_c0  1.871781 4.857282 132.4752  1.177677e-30  3.666612e-27
> comp437434_c0 -1.928556 5.025404 122.7183  1.607256e-28  4.378567e-25
> comp437540_c1 -1.719987 7.634490 122.0333  2.269992e-28  5.496911e-25
> comp424275_c1  2.829585 2.058209 113.6028  1.592057e-26  3.469729e-23
>
> SO, EXCEPT FOR THE FIRST TIME POINT (00) THE COMPARISONS ARE NOT 
> SIMILAR. I WOULD VERY MUCH RECEIVE COMMENTS ON THIS AND IN PARTICULAR
> INDICATION(S) ON WHERE I ERR.
>
> THANK YOU.
> JAHN
>
>
>
>
> -- output of sessionInfo():
>
>> sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=Norwegian (Bokm??l)_Norway.1252  LC_CTYPE=Norwegian (Bokm??l)_Norway.1252    LC_MONETARY=Norwegian (Bokm??l)_Norway.1252 LC_NUMERIC=C
> [5] LC_TIME=Norwegian (Bokm??l)_Norway.1252
>
> attached base packages:
> [1] splines   parallel  stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] edgeR_3.4.2        limma_3.18.6       DESeq_1.14.0       lattice_0.20-24    locfit_1.5-9.1     Biobase_2.22.0     BiocGenerics_0.8.0
>
> loaded via a namespace (and not attached):
> [1] annotate_1.40.0      AnnotationDbi_1.24.0 DBI_0.2-7            genefilter_1.44.0    geneplotter_1.40.0   grid_3.0.2           IRanges_1.20.6       lme4_1.0-5
> [9] MASS_7.3-29          Matrix_1.1-1.1       minqa_1.2.2          nlme_3.1-113         RColorBrewer_1.0-5   RSQLite_0.11.4       stats4_3.0.2         survival_2.37-4
> [17] tools_3.0.2          XML_3.98-1.1         xtable_1.7-1

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:6}}



More information about the Bioconductor mailing list