[BioC] summarizeOverlaps colData does NOT contain countBam() summary data

Cook, Malcolm MEC at stowers.org
Fri Jan 31 16:19:39 CET 2014


Valerie and other Genomics,

I read in ?summarizeOverlaps that 

     'colData' is a DataFrame with columns of 'object' (class of
     'reads') and 'records' (length of 'reads'). When 'reads' is a
     BamFile or BamFileList the 'colData' holds the output of a call to
     'countBam' with columns of 'records' (total records in file),
     'nucleotides' and 'mapped'. The number in 'mapped' is the number
     of records returned when 'isUnmappedQuery=FALSE' in the
     'ScanBamParam'.

and also,

     ## When the reads are Bam files, the 'colData' contains summary 
     ## information from a call to countBam().

However, I find this NOT to be true. Viz., in a fresh R session:

> library(GenomicRanges)
> example(summarizeOverlaps)
....
> colData(se)
DataFrame with 2 rows and 0 columns

# and yet calling countBam() directly does return the summary:

> do.call(rbind,lapply(fls,countBam))
                  space start end width              file records nucleotides
sm_treated1.bam      NA    NA  NA    NA   sm_treated1.bam    1800       80260
sm_untreated1.bam    NA    NA  NA    NA sm_untreated1.bam    1800      135000
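
As a stopgap, the counts can be attached by hand. This is a minimal sketch, not the documented behaviour: it assumes the 'fls' and 'se' objects left in the workspace by example(summarizeOverlaps), and it assumes (unverified) that the columns of 'se' are in the same order as 'fls'.

```r
library(GenomicRanges)   # summarizeOverlaps() and colData() in this release
library(Rsamtools)       # countBam()

# Recreate 'fls' and 'se' exactly as the help-page example does.
example(summarizeOverlaps, echo = FALSE)

# One countBam() row per BAM file, stacked into a single data.frame.
counts <- do.call(rbind, lapply(fls, countBam))

# Keep the informative columns and attach them as column metadata.
# ASSUMPTION: rows of 'counts' line up with the columns of 'se'.
colData(se) <- DataFrame(counts[, c("records", "nucleotides")],
                         row.names = colnames(se))
colData(se)
```

This only patches the object after the fact; it does not add the 'mapped' column that the help page describes.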

Can you advise?

Thanks!

~ Malcolm Cook 
Computational Biology / Shilatifard Lab - Stowers Institute for Medical Research - Kansas City


PS

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] edgeR_3.4.2                                limma_3.18.9                               DESeq_1.14.0                               lattice_0.20-24                            locfit_1.5-9.1                             TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.10.1 GenomicFeatures_1.14.2                     AnnotationDbi_1.24.0                       Biobase_2.22.0                             pasillaBamSubset_0.0.8                     BiocInstaller_1.12.0                       Rsamtools_1.14.2                           Biostrings_2.30.1                         
[14] GenomicRanges_1.14.4                       XVector_0.2.0                              IRanges_1.20.6                             BiocGenerics_0.8.0                        

loaded via a namespace (and not attached):
 [1] annotate_1.40.0    biomaRt_2.18.0     bitops_1.0-6       BSgenome_1.30.0    DBI_0.2-7          genefilter_1.44.0  geneplotter_1.40.0 grid_3.0.2         RColorBrewer_1.0-5 RCurl_1.95-4.1     RSQLite_0.11.4     rtracklayer_1.22.2 splines_3.0.2      stats4_3.0.2       survival_2.37-7    tools_3.0.2        XML_3.98-1.1       xtable_1.7-1       zlibbioc_1.8.0    


