[BioC] edgeR warning message when running Trended Dispersion

Natasha Sahgal nsahgal at well.ox.ac.uk
Fri May 17 15:29:02 CEST 2013


Dear Prof. Smyth,

Thank you for your response.

True, I had not realised my mistake of passing a design matrix to the classic method! I guess that in trying various combinations I managed to confuse myself.
It works now!


Many Thanks,
Natasha


-----Original Message-----
From: Gordon K Smyth [mailto:smyth at wehi.EDU.AU] 
Sent: 17 May 2013 03:32
To: Natasha Sahgal
Cc: Yunshun Chen; Bioconductor mailing list
Subject: RE: edgeR warning message when running Trended Dispersion

Dear Natasha,

You are trying to pass a design matrix to "classic" edgeR commands (estimateCommonDisp etc) that do not accept a design matrix as an argument and, not surprisingly, this results in an error.

If you want to specify a design matrix, regardless of the design, then you must use estimateGLMCommonDisp etc instead.  Please refer to the help pages for these functions to see what arguments can be passed.

BTW, in the current version of edgeR, there is a simpler interface available if you wish to use it which subsumes both the classic and GLM estimation routines.  You can use:

   y.filt <- estimateDisp(y.filt)

and this will work with or without a design matrix, and will compute the common, trended and tagwise dispersions all in one step.
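
As a rough illustration of the three routes described above, here is a minimal sketch on simulated counts. The data, the seed and the object names (counts, y.classic, y.glm, y.auto, y.auto.glm) are assumptions made up for this example; they are not taken from the data discussed in this thread.

```r
library(edgeR)

set.seed(1)
# Simulated counts (illustrative only): 1000 genes x 6 samples,
# 3 groups with 2 replicates each.
counts <- matrix(rnbinom(6000, mu = 50, size = 10), nrow = 1000)
group  <- factor(rep(c("Cont", "Trt1", "Trt2"), each = 2))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Classic route: no design matrix argument; the grouping is taken
# from y$samples$group.
y.classic <- estimateCommonDisp(y)
y.classic <- estimateTagwiseDisp(y.classic)

# GLM route: the design matrix is passed explicitly.
design <- model.matrix(~group)
y.glm <- estimateGLMCommonDisp(y, design)
y.glm <- estimateGLMTagwiseDisp(y.glm, design)

# Combined interface: common, trended and tagwise dispersions in one
# call, without or with a design matrix.
y.auto     <- estimateDisp(y)
y.auto.glm <- estimateDisp(y, design)
```

The point of the sketch is only the calling convention: the design matrix goes to the estimateGLM* functions or to estimateDisp, never to estimateCommonDisp or estimateTagwiseDisp.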

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Vic 3052, Australia.
http://www.statsci.org/smyth

On Thu, 16 May 2013, Natasha Sahgal wrote:

> Dear Prof. Smyth,
>
> Thank you for your reply.
>
> Yes, I eventually did upgrade and realised that the warning message 
> was no longer there.
>
> However, may I ask a related question? I decided to try the same 
> data below (but as an unpaired analysis) and got an error at the 
> estimateTagwiseDisp step (latest version of R).
>
> Code:
>> y2.filt = y[keep, ]
>> design2 = model.matrix(~group)
>
>> y2.filt = estimateCommonDisp(y2.filt, design2, verbose=T)
> #Disp = 0.60835 , BCV = 0.78
>
>> y2.filt = estimateTagwiseDisp(y2.filt,design2)
> #Error in prior.n/ntags * m0 : non-conformable arrays
>
> I do not understand the error above. I also tried the trended 
> dispersion below and got an error:
>> y3.filt = y2.filt
>
>> y3.filt = estimateTrendedDisp(y3.filt,design2)
> #Error in estimateTrendedDisp(y3.filt, design2) :
> #  object 'dispersion' not found
>
>
> Many Thanks,
> Natasha
>
> -----Original Message-----
> From: Gordon K Smyth [mailto:smyth at wehi.EDU.AU]
> Sent: 16 May 2013 00:01
> To: Natasha Sahgal
> Cc: Bioconductor mailing list
> Subject: edgeR warning message when running Trended Dispersion
>
> Dear Natasha,
>
> Please follow the posting guide
>
>  http://www.bioconductor.org/help/mailing-list/posting-guide/
>
> and "Ensure that you are using the latest Bioconductor release".
>
> Your software is two Bioconductor releases behind.
>
> Best wishes
> Gordon
>
>> Date: Tue, 14 May 2013 16:58:46 +0000
>> From: Natasha Sahgal <nsahgal at well.ox.ac.uk>
>> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
>> Subject: [BioC] edgeR warning message when running Trended Dispersion
>>
>> Dear List,
>>
>> I am also trying edgeR on my data (3 groups, 2 reps each).  Bacterial samples.
>>
>> design
>>  condition pair
>> 1        Cont    1
>> 2        Cont    3
>> 3        Trt1    1
>> 4        Trt1    3
>> 5        Trt2    1
>> 6        Trt2    3
>>
>> However, when I run the following code, I get a warning message and 
>> wanted to know its significance in downstream analysis.
>>
>> ----------
>> y  = DGEList(counts=gene.counts, group=group)
>> str(y)
>> y$samples
>>
>> dim(y$counts) #5578    6
>>
>> keep = rowSums(cpm(y)>10) >= 3
>> table(keep)
>> #FALSE  TRUE
>> # 1064  4514
>>
>> y.filt = y[keep, ]
>> y.filt$samples$lib.size = colSums(y.filt$counts)
>> y.filt = calcNormFactors(y.filt)
>>
>> ## Design Matrix
>> design = model.matrix(~pair+group)
>> colnames(design) = gsub("group","",colnames(design))
>> design
>>
>> ## Estimating Dispersion
>> y.filt = estimateGLMCommonDisp(y.filt, design, verbose=T)
>> #Disp = 0.03799 , BCV = 0.1949
>> y.filt = estimateGLMTrendedDisp(y.filt,design)
>> #Warning message:
>> #In binGLMDispersion(y, design, min.n = min.n, offset = offset, method = method.bin,  :
>> #  With 4514 genes and setting the parameter minimum number (min.n) of genes per bin to 500,  there are only 5 bins. Using 5 bins here means that the minimum number of genes in each of the 5 bins is in fact 515. This number of bins and minimum number of genes per bin may not be sufficient for reliable estimation of a trend on the dispersions.
>> y.filt = estimateGLMTagwiseDisp(y.filt,design)
>> --------------
>> sessionInfo()
>> R version 2.15.2 (2012-10-26)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>> [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>> [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>> [7] LC_PAPER=C                 LC_NAME=C
>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] splines   stats     graphics  grDevices utils     datasets  methods
>> [8] base
>>
>> other attached packages:
>> [1] gdata_2.12.0   WriteXLS_2.3.0 edgeR_2.6.10   limma_3.14.3
>>
>> loaded via a namespace (and not attached):
>> [1] gtools_2.7.0
>> -------
>>
>> Any help, suggestion and advice much appreciated.
>>
>> Many Thanks,
>> Natasha
