[BioC] Errors for estimateGLMTrendedDisp and estimateGLMCommonDisp functions in the edgeR package
Gordon K Smyth
smyth at wehi.EDU.AU
Mon Apr 9 23:52:27 CEST 2012
Dear Hui,
The problem is a failure of the parallelized iterative algorithm for glm
fitting implemented in mglmLS(). A few people are getting this error,
and it seems to be associated with data with large dispersion values, or
with data for which the model fit is poor.
I can't easily fix your problem but, if you can send me your data offline,
I will try to trouble-shoot or suggest a work-around.
Best wishes
Gordon
> Date: Sun, 8 Apr 2012 12:11:18 -0500
> From: "Yao,Hui" <hyao at mdanderson.org>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Subject: [BioC] Errors for estimateGLMTrendedDisp and
> estimateGLMCommonDisp functions in the edgeR package
>
> Dear edgeR authors and users,
> I am using the currently-released version of edgeR, edgeR_2.6.0. When I
> tried to estimate common dispersions and trended dispersions for my
> RNA-seq data, I found the errors. I am listing them as below. Any
> suggestions are more than welcome!
> ### cnts contains the count table, info contains the sample information, and libsize is the sizes of libraries
>> de <- DGEList(cnts,group=info$type,lib.size=libsize[,4])
>> dim(de)
> [1] 22205 46
>
> ### Normalization
>> de <- calcNormFactors(de)
>
> ### Filtering out the genes of which the expression levels are low for at least 45 out ### of 46 samples
>> sig.thr <- 1
>> idx <- !rowSums(cpm(de)<1)>= nrow(info)-sig.thr
>> de <- de[idx,]
>> dim(de)
> [1] 16710 46
>
> ### Set up a multi-factor model and design matrix is a full rank matrix
>> type <- info$type
>> batch <- info$batch
>> design <- model.matrix(~batch+type)
>> is.fullrank(design)
> [1] TRUE
>> design
> (Intercept) batchT typeN typeP typeX
> 1 1 0 0 0 1
> 2 1 0 0 0 1
> 3 1 0 0 0 1
> 4 1 0 0 0 1
> 5 1 0 0 0 1
> 6 1 0 0 0 1
> 7 1 0 0 0 1
> 8 1 0 0 1 0
> 9 1 0 0 0 0
> 10 1 0 0 1 0
> 11 1 0 0 1 0
> 12 1 0 1 0 0
> 13 1 0 1 0 0
> 14 1 0 1 0 0
> 15 1 1 0 0 1
> 16 1 1 0 0 1
> 17 1 1 0 0 1
> 18 1 1 0 0 1
> 19 1 1 0 0 1
> 20 1 1 0 0 1
> 21 1 1 0 0 0
> 22 1 1 0 0 0
> 23 1 1 0 0 0
> 24 1 1 0 0 1
> 25 1 1 0 0 0
> 26 1 1 0 0 0
> 27 1 1 0 0 0
> 28 1 1 0 0 0
> 29 1 1 0 0 0
> 30 1 1 0 0 0
> 31 1 1 0 1 0
> 32 1 1 0 1 0
> 33 1 1 0 1 0
> 34 1 1 0 1 0
> 35 1 1 0 1 0
> 36 1 1 0 1 0
> 37 1 1 0 1 0
> 38 1 1 0 1 0
> 39 1 1 0 1 0
> 40 1 1 0 1 0
> 41 1 1 0 1 0
> 42 1 1 0 1 0
> 43 1 1 0 1 0
> 44 1 1 0 1 0
> 45 1 1 0 1 0
> 46 1 1 0 1 0
> attr(,"assign")
> [1] 0 1 2 2 2
> attr(,"contrasts")
> attr(,"contrasts")$batch
> [1] "contr.treatment"
>
> attr(,"contrasts")$type
> [1] "contr.treatment"
>
> ### the ERROR for estimateGLMCommonDisp
>> estimateGLMCommonDisp(de,design)
> Error in beta[k, ] <- betaj[decr, ] :
> NAs are not allowed in subscripted assignments
>
>> traceback()
> 12: mglmLS(y, design = design, dispersion = dispersion, start = start,
> offset = offset, ...)
> 11: glmFit.default(y, design = design, dispersion = dispersion, offset = offset)
> 10: glmFit(y, design = design, dispersion = dispersion, offset = offset)
> 9: adjustedProfileLik(par^4, y, design, offset)
> 8: f(arg, ...)
> 7: function (arg)
> -f(arg, ...)(0.540181513475453)
> 6: optimize(f = fun, interval = interval^0.25, y = y, design = design,
> offset = offset, maximum = TRUE, tol = tol)
> 5: dispCoxReid(y, design = design, offset = offset, ...)
> 4: estimateGLMCommonDisp.default(y = y$counts, design = design,
> offset = offset, method = method, ...)
> 3: estimateGLMCommonDisp(y = y$counts, design = design, offset = offset,
> method = method, ...)
> 2: estimateGLMCommonDisp.DGEList(de, design)
> 1: estimateGLMCommonDisp(de, design)
>
> ### the ERROR for estimateGLMTrendedDisp
>> estimateGLMTrendedDisp(de,design)
> Error in if (any(decr)) { : missing value where TRUE/FALSE needed
> > traceback()
> 16: mglmLS(y, design = design, dispersion = dispersion, start = start,
> offset = offset, ...)
> 15: glmFit.default(y, design = design, dispersion = dispersion, offset = offset)
> 14: glmFit(y, design = design, dispersion = dispersion, offset = offset)
> 13: adjustedProfileLik(par^4, y, design, offset)
> 12: f(arg, ...)
> 11: function (arg)
> -f(arg, ...)(1.08036302695091)
> 10: optimize(f = fun, interval = interval^0.25, y = y, design = design,
> offset = offset, maximum = TRUE, tol = tol)
> 9: dispCoxReid(y, design = design, offset = offset, ...)
> 8: estimateGLMCommonDisp.default(y[bin, ], design, method = method,
> offset[bin, ], min.row.sum = 0, ...)
> 7: estimateGLMCommonDisp(y[bin, ], design, method = method, offset[bin,
> ], min.row.sum = 0, ...)
> 6: binGLMDispersion(y, design, min.n = min.n, offset = offset, method = method.bin,
> ...)
> 5: dispBinTrend(y, design, offset = offset, method.trend = "spline",
> ...)
> 4: estimateGLMTrendedDisp.default(y = y$counts, design = design,
> offset = offset, method = method, ...)
> 3: estimateGLMTrendedDisp(y = y$counts, design = design, offset = offset,
> method = method, ...)
> 2: estimateGLMTrendedDisp.DGEList(de, design)
> 1: estimateGLMTrendedDisp(de, design)
>
> ### sessionInfo
>> sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US LC_NUMERIC=C LC_TIME=en_US
> [4] LC_COLLATE=en_US LC_MONETARY=en_US LC_MESSAGES=en_US
> [7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] edgeR_2.6.0 limma_3.10.3
>
> Thanks in advance,
>
> Hui
>
> Hui Yao, Ph.D.
> Principal Statistical Analyst
> MD Anderson Cancer Center
>
______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}
More information about the Bioconductor
mailing list