[BioC] TCC::ERROR: Need the design matrix for GLM
Gordon K Smyth
smyth at wehi.EDU.AU
Fri Apr 18 02:36:23 CEST 2014
Dear Pankaj,
It seems as if you are just using the TCC package to call methods from the
edgeR package indirectly.
Why not use the edgeR package directly? That would probably be easier and
you would have a more direct understanding of the methods being used.
Your experiment is almost identical to the oral carcinoma case study in
the edgeR User's Guide.
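A minimal sketch of how the paired design could be set up, assuming the sample order in your post (two tumors, then two normals). The names patient and tissue are hypothetical placeholders, not taken from your code or from the User's Guide:

```r
# Sample annotation inferred from the post: two patients, each with a
# tumor (T) and a normal (N) sample. Names here are illustrative only.
patient <- factor(c(1, 2, 1, 2))
tissue  <- factor(c("T", "T", "N", "N"), levels = c("N", "T"))

# Additive model: patient is the blocking (pairing) factor,
# tissue is the effect of interest (tumor vs normal).
design <- model.matrix(~ patient + tissue)
rownames(design) <- c("T_BRPC13.1118", "T_BRPC_13.764",
                      "N_DU04_PBMC", "N_DU06_PBMC")
print(design)
```

The last column (tissueT) is the tumor vs normal effect adjusted for patient. A design matrix of this kind is what edgeR's GLM functions such as glmFit() and glmLRT() expect. With four samples and three coefficients, only one residual degree of freedom is left for estimating dispersion.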
Best wishes
Gordon
> Date: Tue, 15 Apr 2014 13:51:17 +0000
> From: Pankaj Agarwal <p.agarwal at duke.edu>
> To: "bioconductor at r-project.org" <bioconductor at r-project.org>
> Cc: "kadota at bi.a.u-tokyo.ac.jp" <kadota at bi.a.u-tokyo.ac.jp>
> Subject: [BioC] TCC::ERROR: Need the design matrix for GLM.
>
> Hi,
>
> I have RNA-seq data consisting of matched tumor/normal samples from two patients. To normalize the counts I am following the steps in the TCC vignette section "3.3 Normalization of two-group count data without replicates (paired)". The output from the commands is as follows:
>
>> data=read.delim("count_bt2_iGenomes_Ensembl.tsv")
>
>> head(data)
>                 A.sorted.bam B.sorted.bam C.sorted.bam D.sorted.bam
> ENSG00000000003         2400         1130           12           72
> ENSG00000000005            2            3            0            0
> ENSG00000000419         1819          575          938         1608
> ENSG00000000457         1317         1262          821         1469
> ENSG00000000460          799         1743          367          800
> ENSG00000000938          203           41        33303        16355
>
>> group <- c(1,1,2,2)
>> pair <- c(1,2,1,2)
>> c1 <- data.frame(group=group, pair=pair)
>> colnames(data) <- c("T_BRPC13.1118", "T_BRPC_13.764", "N_DU04_PBMC", "N_DU06_PBMC")
>> tcc <- new("TCC", data, c1)
>> tcc <- calcNormFactors(tcc, norm.method="tmm", test.method="edger", iteration=1, FDR=0.1, floorPDEG=0.05, paired=TRUE)
> TCC::INFO: Calculating normalization factors using DEGES
> TCC::INFO: (iDEGES pipeline : tmm - [ edger - tmm ] X 1 )
> Error in .testByEdger.3(design = design, coef = coef, contrast = contrast) :
> TCC::ERROR: Need the design matrix for GLM.
>
> Reading further about the steps needed to run edgeR without TCC, I saw something related to a design matrix and tried it, but got the same error:
>
>> design <- model.matrix(~ group + pair)
>> tcc <- new("TCC", data, c1)
>> tcc <- calcNormFactors(tcc, norm.method="tmm", test.method="edger", iteration=1, FDR=0.1, floorPDEG=0.05, paired=TRUE)
> TCC::INFO: Calculating normalization factors using DEGES
> TCC::INFO: (iDEGES pipeline : tmm - [ edger - tmm ] X 1 )
> Error in .testByEdger.3(design = design, coef = coef, contrast = contrast) :
> TCC::ERROR: Need the design matrix for GLM.
>
> I would appreciate help with understanding the cause of the error.
>
> The output from sessionInfo() and package description is as follows:
>
>> sessionInfo()
> R version 3.0.3 (2014-03-06)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>>
>> packageDescription("TCC")
> Package: TCC
> Type: Package
> Title: TCC: Differential expression analysis for tag count data with
> robust normalization strategies
> Version: 1.2.0
> Author: Jianqiang Sun, Tomoaki Nishiyama, Kentaro Shimizu, and Koji
> Kadota
> Maintainer: Jianqiang Sun <wukong at bi.a.u-tokyo.ac.jp>, Tomoaki
> Nishiyama <tomoakin at staff.kanazawa-u.ac.jp>
> Description: This package provides a series of functions for performing
> differential expression analysis from RNA-seq count data using
> robust normalization strategy (called DEGES). The basic idea of
> DEGES is that potential differentially expressed genes or
> transcripts (DEGs) among compared samples should be removed
> before data normalization to obtain a well-ranked gene list
> where true DEGs are top-ranked and non-DEGs are bottom ranked.
> This can be done by performing a multi-step normalization
> strategy (called DEGES for DEG elimination strategy). A major
> characteristic of TCC is to provide the robust normalization
> methods for several kinds of count data (two-group with or
> without replicates, multi-group/multi-factor, and so on) by
> virtue of the use of combinations of functions in other
> sophisticated packages (especially edgeR, DESeq, and baySeq).
> Depends: R (>= 2.15), methods, DESeq, edgeR, baySeq, ROC
> Imports: EBSeq, samr
> Suggests: RUnit, BiocGenerics
> Enhances: snow
> biocViews: HighThroughputSequencing, DifferentialExpression, RNAseq
> License: GPL-2
> Copyright: Authors listed above
> Packaged: 2013-10-15 05:31:33 UTC; biocbuild
> Built: R 3.0.3; ; 2014-03-31 20:00:49 UTC; unix
>
> -- File: /general/installs/R/R-3.0.3/lib64/R/library/TCC/Meta/package.rds
>
> Thank you,
>
> - Pankaj
> --------------------------------------
> Pankaj Agarwal, M.S
> Bioinformatician
> Bioinformatics Shared Resource
> Duke Cancer Institute
> Duke University
> 919-681-6573
> p.agarwal at duke.edu