[BioC] easyRNASeq with Ensembl GRCh37 help
Nicolas Delhomme
delhomme at embl.de
Tue Sep 17 11:05:33 CEST 2013
Hej Aki Hoji!
You can indeed ignore the warnings. The error is this:
> The number of conditions: 0 did not correspond to the number of samples: 1
To use the DESeq output, you need to specify the conditions; see the ?easyRNASeq help page and the easyRNASeq and DESeq vignettes (e.g. vignette("easyRNASeq")) for more details on the arguments and on how to use DESeq. Note that even if you provide a condition, easyRNASeq is bound to fail again, as DESeq can't work with a single sample.
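As a rough sketch of what specifying the conditions could look like: the conditions argument takes one label per sample, in a vector named after the files. Only 4673Bsorted.bam comes from your call; the three other file names and the "control"/"treated" labels below are made up for illustration, so adapt them to your own samples.

```r
# Hypothetical sample list: only "4673Bsorted.bam" is from the original call,
# the other three file names are placeholders.
bam_files <- c("4673Bsorted.bam", "4674Bsorted.bam",
               "4675Bsorted.bam", "4676Bsorted.bam")

# One condition label per sample (labels are assumptions), named after the files.
conditions <- c("control", "control", "treated", "treated")
names(conditions) <- bam_files

# This is the check that failed: the number of conditions must match
# the number of samples.
stopifnot(length(conditions) == length(bam_files))

# Only attempt the real call when the package and the input files are present.
if (requireNamespace("easyRNASeq", quietly = TRUE) &&
    all(file.exists(bam_files), file.exists("Ensemble.gtf"))) {
  library(easyRNASeq)
  testcount <- easyRNASeq(filesDirectory = getwd(),
                          organism = "Hsapiens",
                          chr.sizes = "auto",
                          readLength = 100L,
                          annotationMethod = "gtf",
                          annotationFile = "Ensemble.gtf",
                          count = "exons",
                          outputFormat = "DESeq",
                          filenames = bam_files,
                          conditions = conditions)
}
```

With more than one sample and a matching conditions vector, the "number of conditions: 0" error should no longer be triggered.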
Finally, note that easyRNASeq currently only returns DESeq and not DESeq2 output (i.e. a CountDataSet and not a SummarizedExperiment). DESeq2 support is planned for the next release, due in early October.
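In the meantime, one possible workaround is to build the DESeq2 object yourself from a plain count matrix. The sketch below is only an illustration: it uses a toy matrix with made-up counts and sample names, and it assumes you can get a real count matrix out of easyRNASeq (e.g. with outputFormat="matrix") and that DESeq2 is installed.

```r
# Toy count matrix standing in for real exon counts (all numbers are made up).
count_matrix <- matrix(c(10L, 0L, 25L,
                         12L, 3L, 30L,
                         50L, 1L, 8L,
                         45L, 2L, 9L),
                       nrow = 3,
                       dimnames = list(c("exon1", "exon2", "exon3"),
                                       c("s1", "s2", "s3", "s4")))

# Sample annotation: one row per column of the count matrix
# (condition labels are assumptions).
sample_table <- data.frame(
  condition = factor(c("control", "control", "treated", "treated")),
  row.names = colnames(count_matrix)
)

# Build the DESeq2 container only if DESeq2 is available.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = count_matrix,
                                        colData = sample_table,
                                        design = ~ condition)
}
```

The row names of the sample table have to match the column names of the count matrix, which is why both are derived from the same vector here.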
Best,
Nico
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------
On 16 Sep 2013, at 20:17, Aki Hoji wrote:
> Hi,
>
> I've been trying to generate an output file for DESeq2 with easyRNASeq. The input file is a BAM generated by Tophat2/Bowtie2 with Ensembl GRCh37.72, which was part of Illumina's iGenomes package. I followed the overview and examples of easyRNASeq on the BioC mailing list and ran the following:
>
> testcount<-easyRNASeq(filesDirectory=getwd(), organism="Hsapiens", chr.sizes="auto", readLength=100L, annotationMethod="gtf", annotationFile="Ensemble.gtf", count="exons", outputFormat="DESeq", filenames="4673Bsorted.bam")
>
> Then I got this error:
>
> Checking arguments...
> Fetching annotations...
> Read 2280612 records
> Error in easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = "auto", :
> The number of conditions: 0 did not correspond to the number of samples: 1
> In addition: Warning messages:
> 1: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = "auto", :
> You enforce UCSC chromosome conventions, however the provided chromosome size list is not compliant. Correcting it.
> 2: In .Method(..., deparse.level = deparse.level) :
> number of columns of result is not a multiple of vector length (arg 1)
> 3: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = "auto", :
> There are 966272 features/exons defined in your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
> 4: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = "auto", :
> You enforce UCSC chromosome conventions, however the provided annotation is not compliant. Correcting it.
>
> As far as I can tell, I am not really enforcing the UCSC chromosome convention, and chr.sizes could be set to auto since a BAM file is used. I am stuck at this point, and any help or pointers would be much appreciated.
>
> Thanks.
>
> AH
>
> > sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] easyRNASeq_1.6.0 ShortRead_1.18.0 latticeExtra_0.6-26
> [4] RColorBrewer_1.0-5 Rsamtools_1.12.4 DESeq_1.12.1
> [7] lattice_0.20-23 locfit_1.5-9.1 BSgenome_1.28.0
> [10] GenomicRanges_1.12.5 Biostrings_2.28.0 IRanges_1.18.3
> [13] edgeR_3.2.4 limma_3.16.7 biomaRt_2.16.0
> [16] Biobase_2.20.1 genomeIntervals_1.16.0 BiocGenerics_0.6.0
> [19] intervals_0.14.0 BiocInstaller_1.10.3
>
> loaded via a namespace (and not attached):
> [1] annotate_1.38.0 AnnotationDbi_1.22.6 bitops_1.0-6
> [4] DBI_0.2-7 genefilter_1.42.0 geneplotter_1.38.0
> [7] grid_3.0.1 hwriter_1.3 RCurl_1.95-4.1
> [10] RSQLite_0.11.4 splines_3.0.1 stats4_3.0.1
> [13] survival_2.37-4 tools_3.0.1 XML_3.95-0.2
> [16] xtable_1.7-1 zlibbioc_1.6.0
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor