[BioC] Problems with edgeR package
Mark Robinson
mark.robinson at imls.uzh.ch
Thu May 30 21:01:29 CEST 2013
Dear Eduardo,
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> other attached packages:
> [1] edgeR_2.6.12 limma_3.12.3 DESeq_1.8.3 locfit_1.5-9
The version of edgeR you are using is >1 year old.
Before we go any further, I suggest you upgrade to R 3.0.x and the corresponding BioC/edgeR series (3.2.x) and try everything everything again. Report back if you have struggles.
Best regards, Mark
----------
Prof. Dr. Mark Robinson
Bioinformatics, Institute of Molecular Life Sciences
University of Zurich
http://tiny.cc/mrobin
On 30.05.2013, at 20:53, Eduardo Andrés León <eandres at cnio.es> wrote:
> Dear all, I'm trying to analyse an experiment from Small RNASeq (miRNA Seq).
>
> In this experiment I have 2 different types of mice (MT and WT) and a treatment variables (treated or cD and untreated or sD). So there will be 4 types of data : WTcD (treated WT), WTsD (untreated WT), MTcD (Treated Mutant) and MTsD (Untreated Mutant)
>
> So, my design is something likes this :
>
>> design
> (Intercept) MouseWT TreatmentUntreated
> WTsD_1_9.5 1 1 1
> WTsD_2_9.5 1 1 1
> WTsD_3_9.5 1 1 1
> WTcD_1_9.5 1 1 0
> WTcD_2_9.5 1 1 0
> WTcD_3_9.5 1 1 0
> MTsD_2_9.5 1 0 1
> MTsD_3_9.5 1 0 1
> MTcD_1_9.5 1 0 0
> MTcD_2_9.5 1 0 0
> MTcD_3_9.5 1 0 0
> attr(,"assign")
> [1] 0 1 2
> attr(,"contrasts")
> attr(,"contrasts")$Mouse
> [1] "contr.treatment"
>
> attr(,"contrasts")$Treatment
> [1] "contr.treatment"
>
> I have 3 replicates for every mouse, but the MTsD_1 seems to be an outlayer.
>
> So following the userguide by the edgeR packages (Section 4.4), I run the following commands :
>
> rawdata<-read.delim(file="ALL_MTcD_9.5",header=TRUE,row.names=1)
> y<-DGEList(counts=rawdata[,c(1,2,3,4,5,6,8,9,10,11,12)],genes=rawdata[,0])
> y<-calcNormFactors(y)
> plotMDS(y,xlab="Mouse factor",ylab="Treatment Factor",xlim=c(-0.4,0.4),ylim=c(-0.4,0.4))
>
> Mouse<-factor(c("WT","WT","WT","WT","WT","WT","MT","MT","MT","MT","MT"))
> Treatment<-factor(c("Untreated","Untreated","Untreated","Treated","Treated","Treated","Untreated","Untreated","Treated","Treated","Treated"))
>
> data.frame(Sample=colnames(y),Mouse,Treatment)
>
> design<-model.matrix(~Mouse+Treatment)
> rownames(design)<-colnames(y)
>
> #Overall dispersion of the dataset
> y<-estimateGLMCommonDisp(y,design,verbose=TRUE)
>
> #Estimation of the gene-wise dispersion
> y<-estimateGLMTrendedDisp(y,design)
> y<-estimateGLMTagwiseDisp(y,design)
>
> #Diff expresion
> fit<-glmFit(y,design)
> lrt<-glmLRT(fit)
> topTags(ltr)
>
>
> But it stops at the estimateGLMTrendedDisp steps .
>
>
> The output is the following :
>
>> library(limma)
>> library(edgeR)
>> rawdata<-read.delim(file="ALL_MTcD_9.5",header=TRUE,row.names=1)
>> y<-DGEList(counts=rawdata[,c(1,2,3,4,5,6,8,9,10,11,12)],genes=rawdata[,0])
> Calculating library sizes from column totals.
>> y<-calcNormFactors(y)
>> plotMDS(y,xlab="Mouse factor",ylab="Treatment Factor",xlim=c(-0.4,0.4),ylim=c(-0.4,0.4))
>>
>> Mouse<-factor(c("WT","WT","WT","WT","WT","WT","MT","MT","MT","MT","MT"))
>> Treatment<-factor(c("Untreated","Untreated","Untreated","Treated","Treated","Treated","Untreated","Untreated","Treated","Treated","Treated"))
>> data.frame(Sample=colnames(y),Mouse,Treatment)
> Sample Mouse Treatment
> 1 WTsD_1_9.5 WT Untreated
> 2 WTsD_2_9.5 WT Untreated
> 3 WTsD_3_9.5 WT Untreated
> 4 WTcD_1_9.5 WT Treated
> 5 WTcD_2_9.5 WT Treated
> 6 WTcD_3_9.5 WT Treated
> 7 MTsD_2_9.5 MT Untreated
> 8 MTsD_3_9.5 MT Untreated
> 9 MTcD_1_9.5 MT Treated
> 10 MTcD_2_9.5 MT Treated
> 11 MTcD_3_9.5 MT Treated
>> design<-model.matrix(~Mouse+Treatment)
>> rownames(design)<-colnames(y)
>>
>> #Overall dispersion of the dataset
>> y<-estimateGLMCommonDisp(y,design,verbose=TRUE)
> Disp = 0.02052 , BCV = 0.1432
>> #Estimation of the gene-wise dispersion
>> y<-estimateGLMTrendedDisp(y,design)
> Warning messages:
> 1: In binGLMDispersion(y, design, min.n = min.n, offset = offset, method = method.bin, :
> With 1273 genes and setting the parameter minimum number (min.n) of genes per bin to 500, there should technically be fewer than 2 bins. To make estimation of trended dispersions possible we set the number of bins to be 2.
>
> 2: In binGLMDispersion(y, design, min.n = min.n, offset = offset, method = method.bin, :
> With 1273 genes and setting the parameter minimum number (min.n) of genes per bin to 500, there are only 2 bins. Using 2 bins here means that the minimum number of genes in each of the 2 bins is in fact 302. This number of bins and minimum number of genes per bin may not be sufficient for reliable estimation of a trend on the dispersions.
>
>> y<-estimateGLMTagwiseDisp(y,design)
> Warning message:
> In movingAverageByCol(apl[o, ], width = 1000) :
> reducing moving average width to nrow(x)
>> #Diff expresion
>> fit<-glmFit(y,design)
>> lrt<-glmLRT(fit)
> Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
> dims [product 0] do not match the length of object [14]
>
> and ... I don't know what to do.
>
> Any help would be very appreciated
>
> My session info is :
>
>> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] splines stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] edgeR_2.6.12 limma_3.12.3 DESeq_1.8.3 locfit_1.5-9 AnnotationDbi_1.18.4 Biobase_2.16.0 BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] annotate_1.34.1 DBI_0.2-5 genefilter_1.38.0 geneplotter_1.34.0 grid_2.15.1 IRanges_1.14.4 lattice_0.20-15 RColorBrewer_1.0-5
> [9] RSQLite_0.11.2 stats4_2.15.1 survival_2.37-4 tools_2.15.1 XML_3.96-1.1 xtable_1.7-1
>
>
> Thanks in advanced
>
>
> ===================================================
> Eduardo Andrés León
> Tlfn: (+34) 91 732 80 00 / 91 224 69 00 (ext 5054/3063)
> e-mail: eandres at cnio.es Fax: (+34) 91 224 69 76
> Unidad de Bioinformática Bioinformatics Unit
> Centro Nacional de Investigaciones Oncológicas
> C.P.: 28029 Zip Code: 28029
> C/. Melchor Fernández Almagro, 3 Madrid (Spain)
> http://bioinfo.cnio.es http://bioinfo.cnio.es/people/eandres
> ===================================================
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list