[BioC] edgeR - paired samples with multifactorial design - errors
James W. MacDonald
jmacdon at uw.edu
Fri Apr 25 19:00:08 CEST 2014
Hi Preethy,
You need to re-read that section of the edgeR User's Guide, and in particular
look at the Patient column of the revised targets frame.
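
As a rough sketch of what that section describes (base R only, using the Phenotype vector posted below; the names pheno_by_patient, patient_within, bad_design and design are illustrative, not from the guide): patients are renumbered within each phenotype rather than 1 to 45 across the whole experiment. Because the two phenotype groups are unbalanced here, I am assuming the all-zero columns that result need to be dropped by hand.

```r
# Phenotype of each of the 45 patients, as posted in the thread
pheno_by_patient <- c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 2,
                      2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2,
                      2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 1, 2, 2)

# Two samples (Before, After) per patient
Phenotype <- factor(rep(pheno_by_patient, each = 2), levels = c("1", "2"))
Treat <- factor(rep(c("Before", "After"), 45), levels = c("Before", "After"))

# Design from the thread: pair runs 1..45 across both phenotypes
pair <- factor(rep(1:45, each = 2))
bad_design <- model.matrix(~ Phenotype + Phenotype:pair + Phenotype:Treat)
ncol(bad_design)  # more columns than the 90 samples, so no residual df

# Renumber patients within each phenotype (1, 2, ... per group)
patient_within <- ave(seq_along(pheno_by_patient), pheno_by_patient,
                      FUN = seq_along)
Patient <- factor(rep(patient_within, each = 2))

design <- model.matrix(~ Phenotype + Phenotype:Patient + Phenotype:Treat)

# The groups have different sizes, so some Phenotype1:PatientN columns are
# all zero (those patient numbers only exist in phenotype 2); drop them
design <- design[, colSums(design != 0) > 0]

ncol(design)      # should now be well below 90
qr(design)$rank   # equals ncol(design) when the matrix is full rank
```

The last two columns, Phenotype1:TreatAfter and Phenotype2:TreatAfter, are the treatment effects within each phenotype, so the comparison you describe would be a contrast between those two coefficients.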
Best,
Jim
On 4/25/2014 12:46 PM, Preethy Venkat Ram wrote:
> Hi Jim,
>
> Thanks for the reply. I've tried that with slightly modified
> code, but I'm sorry to say I'm getting an error again.
>
> Here:
>
> pair=factor(rep(c(1:45), each=2))
> Treat=factor(rep(c("Before", "After"),45), levels=c("Before", "After"))
> Phenotype=factor(rep(c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 2,
> 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1,
> 2, 1, 2, 2, 1, 2, 2), each=2),levels=c("1","2"))
> design=model.matrix(~Phenotype+Phenotype:pair+Phenotype:Treat)
> colnames(design)
> counts.DGEList<-DGEList(counts, group=Treat)
> y<-calcNormFactors(counts.DGEList)
> y<-estimateCommonDisp(y, design)
> y<-estimateGLMTrendedDisp(y, design)
>
>
> The error message here:
>
> > y<-estimateGLMTrendedDisp(y, design)
> Error in return(NA, ntags) : multi-argument returns are not permitted
> In addition: Warning message:
> In estimateGLMTrendedDisp.default(y = y$counts, design = design, :
> No residual df: cannot estimate dispersion
>
>
> One reason I can see is that the number of columns in my design matrix
> is 92, as there are more replicates/patients than in the section 3.5
> example, while there are only 90 columns in my count matrix.
>
> But do you have any idea how I can solve this?
>
> Thanks,
> Preethy
>
>
>
>
>
> On Fri, Apr 25, 2014 at 6:45 PM, James W. MacDonald <jmacdon at uw.edu
> <mailto:jmacdon at uw.edu>> wrote:
>
> Hi Preethy,
>
> This experiment is very similar to the example in part 3.5 of the
> edgeR User's guide, starting on page 31.
>
> Best,
>
> Jim
>
>
>
> On 4/25/2014 11:29 AM, Preethy Venkat Ram wrote:
>
> Hi Devon,
>
> Thanks for the replies both on biostars and here. Sorry for
> crossposting. Both
> mailing lists and discussion have been very helpful to me.
>
> But, I rarely see replies from BioC package maintainers at
> biostars.
>
> All the samples are paired and I have them all correct - I
> mean none of
> them are empty
> I tried different designs, but I end up with the same error
> message.
> What I want to do: get DE genes between
> (Phenotype1.Before-Phenotype1.After) &
> (Phenotype2.Before-Phenotype2.After)
>
> pdata here:
>
> pair=rep(c(1:45), each=2)
> Treat=rep(c("Before", "After"),45)
> Phenotype=rep(c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 2,
> 2, 2, 2, 2,
> 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2,
> 1, 2, 2, 1, 2,
> 2), each=2)
> pdata<-data.frame(pair,Treat,Phenotype)
>
>
> Preethy
>
>
>
> On Fri, Apr 25, 2014 at 4:53 PM, Devon Ryan <dpryan at dpryan.com
> <mailto:dpryan at dpryan.com>> wrote:
>
> Hi Preethy,
>
> You likely want:
>
> design=model.matrix(~pair+Treat:Phenotype, data=pdata)
>
> If that still yields the error, then you'll need to share
> "pdata" or
> "design". Also, please don't crosspost on both this list
> and biostars (
> https://www.biostars.org/p/98907/), it duplicates the
> community effort.
>
> Devon
>
>
> --
> Devon Ryan, Ph.D.
> Email: dpryan at dpryan.com <mailto:dpryan at dpryan.com>
> Laboratory for Molecular and Cellular Cognition
> German Centre for Neurodegenerative Diseases (DZNE)
> Ludwig-Erhard-Allee 2
> 53175 Bonn
> Germany
> <devon.ryan at dzne.de <mailto:devon.ryan at dzne.de>>
>
>
>
> On Fri, Apr 25, 2014 at 11:15 AM, Preethy [guest]
> <guest at bioconductor.org <mailto:guest at bioconductor.org>>wrote:
>
> Hi All,
>
> I have been trying to do DE analysis of my RNA-seq
> experiment using edgeR
> and am having some issues. The details of the
> experiment and the R code I
> tried are below:
>
> (a) Paired experimental design with 45 pairs
> (b) Treatment: "Before" and "After"
> (c) Phenotype: 1 & 2
> Aim: Look for DE genes between Phenotype 1 and 2 upon
> treatment taking
> into account the paired design
>
> The R code tried:
>
> library(edgeR)
> counts<-read.delim(file="counts.dat",header=T)
> pair=factor(pdata$pair)
> Treat=factor( pdata$treat)
> Phenotype=factor(pdata$pheno)
> group<-paste(Treat,Phenotype,sep=".")
> design=model.matrix(~pair+Treat:Phenotype, data=counts)
> counts.DGEList<-DGEList(counts, group=group)
> y<-calcNormFactors(counts.DGEList)
> y<-estimateCommonDisp(y, design)
> y<-estimateGLMTrendedDisp(y, design)
>
>
> Error message I get:
>
> Error in glmFit.default(y, design = design, dispersion
> = dispersion,
> offset = offset, :
> Design matrix not of full rank. The following
> coefficients not
> estimable:
> TreatBefore:Phenotype1 TreatBefore:Phenotype2
>
>
> Any idea how to solve this?
>
> Thanks,
> Preethy
>
> -- output of sessionInfo():
>
> R version 3.1.0 (2014-04-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=fi_FI.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
> [5] LC_MONETARY=fi_FI.UTF-8 LC_MESSAGES=en_GB.UTF-8
> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets
> methods base
>
> other attached packages:
> [1] edgeR_3.4.2 limma_3.18.13
>
>
> --
> Sent via the guest posting facility at
> bioconductor.org <http://bioconductor.org>.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> <mailto:Bioconductor at r-project.org>
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099