[BioC] error message with easyRNASeq use case

Tue Nov 27 20:45:31 CET 2012

On 11/27/2012 2:30 PM, Richard Friedman wrote:
> Dear List,
>
> 	I am working through the easyRNASeq use case.
> (easyRNASeq: an overview Oct 16, 2012, section 7)
> I am working on a Mac, so I could not do the alignment
> part of the use case and instead started with BAM files
> produced by TopHat:
>
>
> ccrfml1:learning_easyRNAseq friedman$ ls
> 490224.bam			easyRNAseqworkingscripts.txt
> 490225.bam			learningRNASeq.docx
> easyRNASeqvignette2.txt
>
> Here is my session record:
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>   [1] easyRNASeq_1.4.2       ShortRead_1.16.2       latticeExtra_0.6-19    RColorBrewer_1.0-5
>   [5] lattice_0.20-10        Rsamtools_1.10.2       DESeq_1.10.1           locfit_1.5-8
>   [9] BSgenome_1.26.1        GenomicRanges_1.10.5   Biostrings_2.26.2      IRanges_1.16.4
> [13] edgeR_3.0.4            limma_3.12.1           biomaRt_2.14.0         Biobase_2.18.0
> [17] genomeIntervals_1.14.0 BiocGenerics_0.4.0     intervals_0.13.3
>
> loaded via a namespace (and not attached):
>   [1] annotate_1.34.1      AnnotationDbi_1.20.2 bitops_1.0-4.1       DBI_0.2-5            genefilter_1.38.0
>   [6] geneplotter_1.34.0   grid_2.15.2          hwriter_1.3          RCurl_1.91-1         RSQLite_0.11.1
> [11] splines_2.15.2       stats4_2.15.2        survival_2.36-14     tools_2.15.2         XML_3.9-4
> [16] xtable_1.7-0         zlibbioc_1.2.0
>
>
>> chr.sizes=seqlengths(Hsapiens)
>> chr.sizes
>                   chr1                  chr2                  chr3                  chr4
>              249250621             243199373             198022430             191154276
>                   chr5                  chr6                  chr7                  chr8
>              180915260             171115067             159138663             146364022
>                   chr9                 chr10                 chr11                 chr12
>              141213431             135534747             135006516             133851895
>                  chr13                 chr14                 chr15                 chr16
>              115169878             107349540             102531392              90354753
>                  chr17                 chr18                 chr19                 chr20
>               81195210              78077248              59128983              63025520
>                  chr21                 chr22                  chrX                  chrY
>               48129895              51304566             155270560              59373566
>                   chrM  chr1_gl000191_random  chr1_gl000192_random        chr4_ctg9_hap1
>                  16571                106433                547496                590426
>   chr4_gl000193_random  chr4_gl000194_random         chr6_apd_hap1         chr6_cox_hap2
>                 189789                191469               4622290               4795371
>          chr6_dbb_hap3        chr6_mann_hap4         chr6_mcf_hap5         chr6_qbl_hap6
>                4610396               4683263               4833398               4611984
>         chr6_ssto_hap7  chr7_gl000195_random  chr8_gl000196_random  chr8_gl000197_random
>                4928567                182896                 38914                 37175
>   chr9_gl000198_random  chr9_gl000199_random  chr9_gl000200_random  chr9_gl000201_random
>                  90085                169874                187035                 36148
> chr11_gl000202_random       chr17_ctg5_hap1 chr17_gl000203_random chr17_gl000204_random
>                  40103               1680828                 37498                 81310
> chr17_gl000205_random chr17_gl000206_random chr18_gl000207_random chr19_gl000208_random
>                 174588                 41001                  4262                 92689
> chr19_gl000209_random chr21_gl000210_random        chrUn_gl000211        chrUn_gl000212
>                 159169                 27682                166566                186858
>         chrUn_gl000213        chrUn_gl000214        chrUn_gl000215        chrUn_gl000216
>                 164239                137718                172545                172294
>         chrUn_gl000217        chrUn_gl000218        chrUn_gl000219        chrUn_gl000220
>                 172149                161147                179198                161802
>         chrUn_gl000221        chrUn_gl000222        chrUn_gl000223        chrUn_gl000224
>                 155397                186861                180455                179693
>         chrUn_gl000225        chrUn_gl000226        chrUn_gl000227        chrUn_gl000228
>                 211173                 15008                128374                129120
>         chrUn_gl000229        chrUn_gl000230        chrUn_gl000231        chrUn_gl000232
>                  19913                 43691                 27386                 40652
>         chrUn_gl000233        chrUn_gl000234        chrUn_gl000235        chrUn_gl000236
>                  45941                 40531                 34474                 41934
>         chrUn_gl000237        chrUn_gl000238        chrUn_gl000239        chrUn_gl000240
>                  45867                 39939                 33824                 41933
>         chrUn_gl000241        chrUn_gl000242        chrUn_gl000243        chrUn_gl000244
>                  42152                 43523                 43341                 39929
>         chrUn_gl000245        chrUn_gl000246        chrUn_gl000247        chrUn_gl000248
>                  36651                 38154                 36422                 39786
>         chrUn_gl000249
>                  38502
>
>>   bamfiles=dir(getwd(),pattern="*\\.bam$")
>> bamfiles
> [1] "490224.bam" "490225.bam"
>> rnaSeq<- easyRNASeq(filesDirectory=getwd(),
> +                      organism="Hsapiens",
> +                      chr.sizes=chr.sizes,
> +                      readLength=58L,
> +                      annotationMethod="biomaRt",
> +                      count="exons",
> +                      filenames=bamfiles[1],
> +                      outputFormat="RNAseq"
> +                      )
> Checking arguments...
> Error in easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = chr.sizes,  :
>    You must indicate the format of you source files, by setting argument 'format'
>
> COMMENT: I THOUGHT THAT BAM FILES WERE AUTOMATICALLY THE INPUT
> FILE FORMAT.
>
>> rnaSeq<- easyRNASeq(filesDirectory=getwd(),
> +                      organism="Hsapiens",
> +                      chr.sizes=chr.sizes,
> +                      readLength=58L,
> +                      annotationMethod="biomaRt",
> +                      count="exons",
> +                      format="bam",
> +                      filenames=bamfiles[1],
> +                      outputFormat="RNAseq"
> +                      )
> Checking arguments...
> Error in easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = chr.sizes,  :
>    Index files (bai) are required. They are missing for the files: /Documents/clients/Phyllis/learning_easyRNAseq/490224.bam
>
> QUESTION: HOW DO I OBTAIN OR PRODUCE THESE INPUT FILES?

You want indexBam() in Rsamtools, which writes a .bai index file
alongside each BAM file. See ?BamFile.
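
Something along these lines should get you going. It is only a minimal
sketch: missing_bai() is a made-up helper (not part of Rsamtools), and I am
assuming that easyRNASeq looks for an index named <file>.bam.bai next to
each BAM file, which is where indexBam() puts it by default.

```r
# Sketch only: `missing_bai` is a hypothetical helper, not an Rsamtools function.
# It returns the BAM files that have no "<file>.bam.bai" index beside them.
missing_bai <- function(bam) {
  bam[!file.exists(paste0(bam, ".bai"))]
}

# Full paths of the BAM files in the working directory.
bamfiles <- dir(getwd(), pattern = "\\.bam$", full.names = TRUE)
todo <- missing_bai(bamfiles)

if (length(todo) > 0 && requireNamespace("Rsamtools", quietly = TRUE)) {
  # indexBam() needs coordinate-sorted input; if it complains, run
  # Rsamtools::sortBam() on the file first and index the sorted copy.
  bai <- Rsamtools::indexBam(todo)
  print(bai)  # paths of the index files that were written
}
```

Once the index files exist, the easyRNASeq() call with format="bam" should
get past the "Index files (bai) are required" check.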

Best,

Jim

>
> Thanks and best wishes,
> Rich
> Richard A. Friedman, PhD
> Associate Research Scientist,
> Biomedical Informatics Shared Resource
> Herbert Irving Comprehensive Cancer Center (HICCC)
> Lecturer,
> Department of Biomedical Informatics (DBMI)
> Educational Coordinator,
> Center for Computational Biology and Bioinformatics (C2B2)/
> National Center for Multiscale Analysis of Genomic Networks (MAGNet)
> Room 824
> Irving Cancer Research Center
> Columbia University
> 1130 St. Nicholas Ave
> New York, NY 10032
> (212)851-4765 (voice)
> friedman at cancercenter.columbia.edu
> http://cancercenter.columbia.edu/~friedman/
>
> In memoriam, Ray Bradbury

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099