[BioC] edgeR and pair-wise comparison error by estimateGLMCommonDisp

Gordon K Smyth smyth at wehi.EDU.AU
Tue Nov 8 00:10:36 CET 2011


Dear Sheng,

I assume that you are analysing your own data, not the Tuch case study, 
because this error doesn't occur on the Tuch data.

The error message is from the function mglmLS(), which is the low-level 
function in edgeR that does the glm fitting.  mglmLS() is more reliable 
than glm(), but it still fails occasionally, and I have been unable to 
track down the reason why so far.

The solution is probably to give the user the option of switching to 
another fitting method (using one of the other options provided by 
glmFit), but we have not yet implemented this option in 
estimateGLMCommonDisp().

Would you be willing to share the data set with us offline so that we can 
try to diagnose the problem?

Things to try in the meantime:

1. Try estimateGLMCommonDisp with method="deviance" instead.  Might or 
might not help.

2. Try subsetting your dataset d to remove genes that might be causing the 
problem.  Most likely culprits are genes with large goodness of fit 
statistics.

3. You could analyse the dataset instead using the zoom() function in the 
limma package, see the last case study in that package.  This gives 
similar results to edgeR.

Best wishes
Gordon

> Date: Sun,  6 Nov 2011 16:07:17 -0800 (PST)
> From: "Sheng [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, shelly1436 at gmail.com
> Subject: [BioC] edgeR and pair-wise comparison error by
> 	estimateGLMCommonDisp
>
>
> Hi All,
>
> I am trying to use edgeR to find differentially expressed genes from a pairwise comparison. So I am following the edgeR's manual chapter 13: Case study: Oral carcinomas vs matched normal tissue. However, when I am trying to "estimateCommonDisp" to estimate the Cox-Reid common dispersion, I got an error:
> "Error in beta[k, ] <- betaj[decr, ] :
>  NAs are not allowed in subscripted assignments"
>
> Does anyone have ideas about this? I have 6 paired data. Thanks a lot!
>
> My code is following:
>
> x = X[,-(1:2)]
> geneid = X[,1]
> rownames(x)=geneid
> colnames(x)=colnames(X)[-(1:2)]
>
> group=factor(rep(c("N", "C"), 6))
>
> # build edgeR object
> d = DGEList(x,group=group, genes=rownames(x))
>
> # filter out genes have fewer than 1 count per million
> cpm = cpm(d)
> d <- d[rowSums(cpm > 1) >= 3, ]
>
> # plot MDS
> # plotMDS(d, main="MDS Plot for Tuch Data")
>
> # design matrix
> patient = factor(rep(1:6, each=2))
> design <- model.matrix(~patient+d$samples$group)
> rownames(design) <- rownames(d$samples)
> colnames(design)[7] <- "Cancer"
>
> # calculate
> d <- estimateGLMCommonDisp(d, design)
>
>
> Cheers,
> Sheng
>
> -- output of sessionInfo():
>
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C/en_US.UTF-8/C/C/C/C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] limma_3.10.0 stringr_0.5  edgeR_2.4.0
>
> loaded via a namespace (and not attached):
> [1] plyr_1.6     tools_2.14.0
>
> --
> Sent via the guest posting facility at bioconductor.org.

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list