[BioC] edgeR - paired samples with multifactorial design - errors

Fri Apr 25 19:00:08 CEST 2014

Hi Preethy,

You need to re-read that section of edgeR, and in particular look at the 
patient column of the revised targets frame.

Best,

Jim

On 4/25/2014 12:46 PM, Preethy Venkat Ram wrote:
> Hi Jim,
>
> Thanks for the reply. I've tried that -  with a slighlty modified 
> code. I am sorry. But, I'm getting error again.
>
> Here:
>
> pair=factor(rep(c(1:45), each=2)).Treat=factor( rep(c("Before", 
> "After"),45), levels=c("Before", "After"))
> Phenotype=factor(rep(c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1,  2, 2, 
> 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1,  2, 2, 2, 2, 2, 2, 2, 2, 1, 
> 2, 1, 2, 2, 1, 2, 2), each=2),levels=c("1","2"))
> design=model.matrix(~Phenotype+Phenotype:pair+Phenotype:Treat)
> colnames(design)
> counts.DGEList<-DGEList(counts, group=Treat)
> y<-calcNormFactors(counts.DGEList)
> y<-estimateCommonDisp(y, design)
> y<-estimateGLMTrendedDisp(y, design)
>
>
> The error message here:
>
> > y<-estimateGLMTrendedDisp(y, design)
> Error in return(NA, ntags) : multi-argument returns are not permitted
> In addition: Warning message:
> In estimateGLMTrendedDisp.default(y = y$counts, design = design,  :
>   No residual df: cannot estimate dispersion
>
>
> One reason I can see is that the number of columns in my design matrix 
>  is "92"  as they are more replicates/patients than in the 3.5 
> example. And there are only "90" columns in my "count" matrix.
>
> But, do you have any idea how can I solve this ?
>
> Thanks,
> Preethy
>
>
>
>
>
> On Fri, Apr 25, 2014 at 6:45 PM, James W. MacDonald <jmacdon at uw.edu 
> <mailto:jmacdon at uw.edu>> wrote:
>
>     Hi Preethy,
>
>     This experiment is very similar to the example in part 3.5 of the
>     edgeR User's guide, starting on page 31.
>
>     Best,
>
>     Jim
>
>
>
>     On 4/25/2014 11:29 AM, Preethy Venkat Ram wrote:
>
>         Hi Devon,
>
>         Thanks for the replies both on biostars and here.  Sorry for
>         crossposting.  Both
>         mailing lists and discussion have been very helpful to me.
>
>         But, I rarely see replies from BioC package maintainers at
>         biostars.
>
>         All the samples are paired and I have them all correct - I
>         mean none of
>         them are empty
>         I tried different designs. But ending up with the same error
>         message.
>         What I want to do: Getting DE genes between
>         (Phenotype1.Before-Phenotype1.After) &
>         (Phenotype2.Before-Phenotype2.After)
>
>         pdata here:
>
>         pair=rep(c(1:45), each=2)
>         Treat=rep(c("Before", "After"),45)
>         Phenotype=rep(c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1,  2, 2,
>         2, 2, 2, 2,
>         2, 1, 1, 1, 1, 1, 1, 2, 2, 1,  2, 2, 2, 2, 2, 2, 2, 2, 1, 2,
>         1, 2, 2, 1, 2,
>         2), each=2)
>         pdata<-data.frame(pair,Treat,Phenotype)
>
>
>         Preethy
>
>
>
>         On Fri, Apr 25, 2014 at 4:53 PM, Devon Ryan <dpryan at dpryan.com
>         <mailto:dpryan at dpryan.com>> wrote:
>
>             Hi Preethy,
>
>             You likely want:
>
>             design=model.matrix(~pair+Treat:Phenotype, data=pdata)
>
>             If that still yields the error, then you'll need to share
>             "pdata" or
>             "design". Also, please don't crosspost on both this list
>             and biostars (
>             https://www.biostars.org/p/98907/), it duplicates the
>             community effort.
>
>             Devon
>
>
>             --
>             Devon Ryan, Ph.D.
>             Email: dpryan at dpryan.com <mailto:dpryan at dpryan.com>
>             Laboratory for Molecular and Cellular Cognition
>             German Centre for Neurodegenerative Diseases (DZNE)
>             Ludwig-Erhard-Allee 2
>             53175 Bonn
>             Germany
>             <devon.ryan at dzne.de <mailto:devon.ryan at dzne.de>>
>
>
>
>             On Fri, Apr 25, 2014 at 11:15 AM, Preethy [guest]
>             <guest at bioconductor.org <mailto:guest at bioconductor.org>>wrote:
>
>                 Hi All,
>
>                 I had been trying to do DE analysis of my RNAseq
>                 experiment using edgeR
>                 and am having some isssues. The details of the
>                 Experiment and the R code I
>                 tried below:
>
>                 (a) Paired experimental design with 45 pairs
>                 (b) Treatment: "Before" and "After"
>                 (c) Phenotype: 1 & 2
>                 Aim: Look for DE genes between Phenotype 1 and 2 upon
>                 treatment taking
>                 into account the paired design
>
>                 The R code tried:
>
>                 library(edgeR)
>                 counts<-read.delim(file="counts.dat",header=T)
>                 pair=factor(pdata$pair)
>                 Treat=factor( pdata$treat)
>                 Phenotype=factor(pdata$pheno)
>                 group<-paste(Treat,Phenotype,sep=".")
>                 design=model.matrix(~pair+Treat:Phenotype, data=counts)
>                 counts.DGEList<-DGEList(counts, group=group)
>                 y<-calcNormFactors(counts.DGEList)
>                 y<-estimateCommonDisp(y, design)
>                 y<-estimateGLMTrendedDisp(y, design)
>
>
>                 Error message I get:
>
>                 Error in glmFit.default(y, design = design, dispersion
>                 = dispersion,
>                 offset = offset,  :
>                    Design matrix not of full rank.  The following
>                 coefficients not
>                 estimable:
>                   TreatBefore:Phenotype1 TreatBefore:Phenotype2
>
>
>                 Any idea to solve this out?
>
>                 Thanks,
>                 Preethy
>
>                   -- output of sessionInfo():
>
>                 R version 3.1.0 (2014-04-10)
>                 Platform: x86_64-pc-linux-gnu (64-bit)
>
>                 locale:
>                   [1] LC_CTYPE=fi_FI.UTF-8       LC_NUMERIC=C
>                   [3] LC_TIME=en_GB.UTF-8  LC_COLLATE=en_GB.UTF-8
>                   [5] LC_MONETARY=fi_FI.UTF-8  LC_MESSAGES=en_GB.UTF-8
>                   [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>                   [9] LC_ADDRESS=C LC_TELEPHONE=C
>                 [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
>                 attached base packages:
>                 [1] stats     graphics  grDevices utils datasets
>                  methods   base
>
>                 other attached packages:
>                 [1] edgeR_3.4.2   limma_3.18.13
>
>
>                 --
>                 Sent via the guest posting facility at
>                 bioconductor.org <http://bioconductor.org>.
>
>                 _______________________________________________
>                 Bioconductor mailing list
>                 Bioconductor at r-project.org
>                 <mailto:Bioconductor at r-project.org>
>                 https://stat.ethz.ch/mailman/listinfo/bioconductor
>                 Search the archives:
>                 http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>                 [[alternative HTML version deleted]]
>
>
>         _______________________________________________
>         Bioconductor mailing list
>         Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>         Search the archives:
>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>     -- 
>     James W. MacDonald, M.S.
>     Biostatistician
>     University of Washington
>     Environmental and Occupational Health Sciences
>     4225 Roosevelt Way NE, # 100
>     Seattle WA 98105-6099
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099