[BioC] summarizeOverlaps colData does NOT contain countBam() summary data
Martin Morgan
mtmorgan at fhcrc.org
Sat Feb 1 16:39:31 CET 2014
Thanks Malcolm. The documentation at this point is not accurate: a parameter, count.mapped.reads=TRUE, needs to be set before colData will hold the countBam() summary. It _is_ documented on ?BamFile and has been clarified in devel (where summarizeOverlaps is in the new package 'GenomicAlignments').

Martin
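For readers of the archive, a minimal sketch of the call described above. It assumes the devel layout, where summarizeOverlaps lives in 'GenomicAlignments', and it assumes the small example BAM files ship in that package's extdata directory. The three feature ranges are made up for illustration, and no output is shown because the column values depend on the installed versions.

```r
library(GenomicAlignments)  # assumption: devel, where summarizeOverlaps() now lives
library(Rsamtools)          # BamFileList()

## Assumption: the example BAM files are found under the package's extdata
fls <- list.files(system.file("extdata", package = "GenomicAlignments"),
                  recursive = TRUE, pattern = "\\.bam$", full.names = TRUE)
bfl <- BamFileList(fls)

## Illustrative features only; any GRanges or GRangesList will do
features <- GRanges("chr2L",
                    IRanges(c(1000, 3000, 7000), width = 500),
                    strand = "-")

## count.mapped.reads=TRUE asks for the countBam() summary in colData
se <- summarizeOverlaps(features, bfl, count.mapped.reads = TRUE)
colData(se)
```

With the argument left at its default, colData(se) comes back with zero columns, which matches the report quoted below.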
----- Malcolm Cook <MEC at stowers.org> wrote:
> Valerie and other Genomics,
>
> I read in ?summarizeOverlaps that
>
> 'colData' is a DataFrame with columns of 'object' (class of
> 'reads') and 'records' (length of 'reads'). When 'reads' is a
> BamFile or BamFileList the 'colData' holds the output of a call to
> 'countBam' with columns of 'records' (total records in file),
> 'nucleotides' and 'mapped'. The number in 'mapped' is the number
> of records returned when 'isUnmappedQuery=FALSE' in the
> 'ScanBamParam'.
>
> and also,
>
> ## When the reads are Bam files, the 'colData' contains summary
> ## information from a call to countBam().
>
> However, I find this NOT to be true. Viz (in a fresh R session)
>
> >library(GenomicRanges)
> >example(summarizeOverlaps)
> ....
> > colData(se)
> DataFrame with 2 rows and 0 columns
>
> # but yet:
>
> > do.call(rbind,lapply(fls,countBam))
>                   space start end width              file records nucleotides
> sm_treated1.bam      NA    NA  NA    NA   sm_treated1.bam    1800       80260
> sm_untreated1.bam    NA    NA  NA    NA sm_untreated1.bam    1800      135000
>
> Can you advise?
>
> Thanks!
>
> ~ Malcolm Cook
> Computational Biology / Shilatifard Lab - Stowers Institute for Medical Research - Kansas City
>
>
> PS
>
> > sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices datasets utils methods base
>
> other attached packages:
> [1] edgeR_3.4.2 limma_3.18.9 DESeq_1.14.0 lattice_0.20-24 locfit_1.5-9.1 TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.10.1 GenomicFeatures_1.14.2 AnnotationDbi_1.24.0 Biobase_2.22.0 pasillaBamSubset_0.0.8 BiocInstaller_1.12.0 Rsamtools_1.14.2 Biostrings_2.30.1
> [14] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.6 BiocGenerics_0.8.0
>
> loaded via a namespace (and not attached):
> [1] annotate_1.40.0 biomaRt_2.18.0 bitops_1.0-6 BSgenome_1.30.0 DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2 RColorBrewer_1.0-5 RCurl_1.95-4.1 RSQLite_0.11.4 rtracklayer_1.22.2 splines_3.0.2 stats4_3.0.2 survival_2.37-7 tools_3.0.2 XML_3.98-1.1 xtable_1.7-1 zlibbioc_1.8.0
> >