[BioC] edgeR warning message when running Trended Dispersion

Natasha Sahgal nsahgal at well.ox.ac.uk
Wed May 15 12:25:24 CEST 2013


Dear List,

Sorry for the repost; I realised I forgot to add some info, and I have also attached the BCV plot.

I am also trying edgeR on my data (bacterial samples; 3 groups, 2 replicates each). However, when I run the following code I get a warning message, and I would like to know its significance for downstream analysis.
----------
>Targets
  group pair
1  Cont    1
2  Cont    3
3  Trt1    1
4  Trt1    3
5  Trt2    1
6  Trt2    3

>y  = DGEList(counts=gene.counts, group=group)

>dim(y$counts) #5578    6

> keep = rowSums(cpm(y)>10) >= 3
> table(keep)
#FALSE  TRUE
# 1064  4514

> y.filt = y[keep, ]
> y.filt$samples$lib.size = colSums(y.filt$counts)
> y.filt = calcNormFactors(y.filt)

> y.filt$samples
       group lib.size norm.factors
Cont_1  Cont  1356517    0.9656755
Cont_3  Cont  1414900    1.1070829
Trt1_1  Trt1  1382278    1.0074343
Trt1_3  Trt1  1470642    1.0018683
Trt2_1  Trt2  1379381    0.8713614
Trt2_3  Trt2  1383889    1.0635623

> design = model.matrix(~pair+group)

> y.filt = estimateGLMCommonDisp(y.filt, design, verbose=TRUE)
#Disp = 0.03799 , BCV = 0.1949

>y.filt = estimateGLMTrendedDisp(y.filt,design)
#Warning message:
#In binGLMDispersion(y, design, min.n = min.n, offset = offset, method = method.bin,  :
#  With 4514 genes and setting the parameter minimum number (min.n) of genes per bin to 500,  there are only 5 bins. Using 5 bins here means that the minimum number of genes in each of the 5 bins is in fact 515. This number of bins and minimum number of genes per bin may not be sufficient for reliable estimation of a trend on the dispersions.


>y.filt = estimateGLMTagwiseDisp(y.filt,design)

> plotBCV(y.filt) # Plot attached

--------------
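For anyone following along without edgeR installed, the filtering step above (`rowSums(cpm(y)>10) >= 3`) can be sketched in base R. This is only an illustration on a made-up count matrix, not the data from this post: `toy.counts` and `manual.cpm` are hypothetical names, and the sketch assumes the plain column sums are used as library sizes (no normalisation factors applied).

```r
# Toy count matrix: 5 genes x 6 samples (made-up numbers, not the poster's data).
# The "bulk" row only exists to give each sample a library size near one million.
toy.counts <- matrix(
  c(rep(1e6, 6),                      # bulk: pads the library size
    100, 120,  90, 110, 105,  95,     # A: well expressed in all samples
      0,   1,   0,   2,   0,   1,     # B: almost absent
     50,  60,   0,   0,   0,   1,     # C: expressed in two samples only
     30,  25,  40,  35,   0,   0),    # D: expressed in four samples
  nrow = 5, byrow = TRUE,
  dimnames = list(c("bulk", "A", "B", "C", "D"), paste0("S", 1:6))
)

# Counts per million, using the column sums as library sizes
lib.size   <- colSums(toy.counts)
manual.cpm <- t(t(toy.counts) / lib.size) * 1e6

# Keep genes with CPM > 10 in at least 3 samples
keep <- rowSums(manual.cpm > 10) >= 3
table(keep)
```

In this toy example genes B and C are dropped, because they exceed 10 CPM in fewer than three samples.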
sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] gdata_2.12.0   WriteXLS_2.3.0 edgeR_2.6.10   limma_3.14.3

loaded via a namespace (and not attached):
[1] gtools_2.7.0
-------
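The design used above, `model.matrix(~pair+group)`, can be rebuilt in base R from the Targets table. This is a small sketch that assumes both `pair` and `group` are factors (the code above does not show how they were created); `resid.df` is a hypothetical name introduced here for illustration.

```r
# Sample annotation copied from the Targets table
group <- factor(c("Cont", "Cont", "Trt1", "Trt1", "Trt2", "Trt2"))
pair  <- factor(c(1, 3, 1, 3, 1, 3))   # assumption: pair is treated as a factor

# Additive model: pairing effect plus treatment effect
design <- model.matrix(~ pair + group)
colnames(design)

# Residual degrees of freedom left for estimating dispersion
resid.df <- nrow(design) - qr(design)$rank
resid.df
```

With six samples and four coefficients (intercept, `pair3`, `groupTrt1`, `groupTrt2`), only two residual degrees of freedom remain per gene, so the dispersion estimates rely heavily on information shared across genes.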

Any help, suggestions or advice would be much appreciated.

Many Thanks,
Natasha
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BCVplots.png
Type: image/png
Size: 18115 bytes
Desc: BCVplots.png
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20130515/1b4aef0e/attachment.png>

