[BioC] applyPileups invalid times argument error

Martin Morgan mtmorgan at fhcrc.org
Wed Mar 27 18:08:04 CET 2013


On 03/27/2013 09:07 AM, Mark Dunning wrote:
> Hi all,
>
> I am trying to get per-base counts over a series of ranges using the
> pileupAsGRanges function. I'm only interesting in certain regions for
> each bam file, so have written a script to extract the pileup from
> each bam file for the regions of interest to that bam (the gr2 object
> below, which changes according to bam file).
>
> However, I'm having issues creating pileup from one of my ranges
> objects. To my eye, it looks to be a valid GRanges object
>
>> gr2
> GRanges with 12 ranges and 4 metadata columns:
>          seqnames                 ranges strand |                     Sample
>             <Rle>              <IRanges>  <Rle> |                   <factor>
>     F1_W    chr18 [ 48604746,  48604747]      * |            EN-438-11_Blood
>     F2_W     chr2 [125320840, 125320841]      * |            EN-438-11_Blood
>     F3_W    chr17 [  7577114,   7577115]      * | EN-454-12_EXP0628_TF_Blood
>     F4_W     chr1 [ 27099124,  27099125]      * | EN-454-12_EXP0628_TF_Blood
>     F5_W     chr9 [120474984, 120474985]      * | EN-454-12_EXP0628_TF_Blood
>     F6_W     chr1 [248039414, 248039415]      * | EN-454-12_EXP0628_TF_Blood
>     F7_W    chr19 [ 11130352,  11130353]      * |            EN-460-12_Blood
>     F8_W    chr19 [ 11141514,  11141515]      * |            EN-460-12_Blood
>     F9_W     chr8 [ 89058912,  89058913]      * |            EN-460-12_Blood
>    F10_W    chr20 [ 23016541,  23016542]      * |  EN-64-10_EXP0628_TF_Blood
>    F11_W    chr10 [118225601, 118225602]      * |  EN-64-10_EXP0628_TF_Blood
>    F12_W    chr17 [  7578188,   7578189]      * |  EN-81-10_EXP0628_TF_Blood
>              Barcode Ref_base Alt_base
>          <character> <factor> <factor>
>     F1_W      FLD283        G        T
>     F2_W      FLD283        T        G
>     F3_W      FLD283        C        T
>     F4_W      FLD283        G        A
>     F5_W      FLD283        C        A
>     F6_W      FLD283        G        A
>     F7_W      FLD283        T        C
>     F8_W      FLD283        A        G
>     F9_W      FLD283        C        T
>    F10_W      FLD283        A        G
>    F11_W      FLD283        A        T
>    F12_W      FLD283        C        A
>
>
>> pileupAsGRanges(bam=bams[85], region=gr2,maxDepth=.Machine$integer.max,minBaseQuality=30, minMapQuality=30)
> Error: applyPileups: invalid 'times' argument

I looked at the code for pileupAsGRanges and saw that it provides a pileupFun 
that includes things like rep(0, N).  when parsing the pileups. My guess is that 
the error comes from evaluating

 > rep(0, integer())
Error in rep(0, integer()) : invalid 'times' argument

where integer() is, e.g., the width of a zero-length GRanges instance

 > width(GRanges())
integer(0)

I'm not sure where your zero length GRanges might be coming from, but perhaps 
related to the warnings below or to some other aspect of your data, such as 
seqlevels not represented by any ranges

 > xx = GRanges(c("A", "B"), IRanges(1, 5))[1]
 > xx  ## "B" has no ranges, so...
GRanges with 1 range and 0 metadata columns:
       seqnames    ranges strand
          <Rle> <IRanges>  <Rle>
   [1]        A    [1, 5]      *
   ---
   seqlengths:
     A  B
    NA NA
 > seqlevels(xx) = "A"  ## re-level
 > xx
GRanges with 1 range and 0 metadata columns:
       seqnames    ranges strand
          <Rle> <IRanges>  <Rle>
   [1]        A    [1, 5]      *
   ---
   seqlengths:
     A
    NA


> In addition: Warning messages:
> 1: In start(regions):end(regions) :
>    numerical expression has 12 elements: only the first used
> 2: In start(regions):end(regions) :
>    numerical expression has 12 elements: only the first used
>
>
> However, the pileup seems to work for each element in the GRanges
> object individually if I call them one-by-one without the error or the
> warning.
>
> e.g.
>
>> pileupAsGRanges(bam=bams[85], region=gr2[1,],maxDepth=.Machine$integer.max,minBaseQuality=30, minMapQuality=30)
> GRanges with 2 ranges and 7 metadata columns:
>        seqnames             ranges strand |         A         C         G
>           <Rle>          <IRanges>  <Rle> | <integer> <integer> <integer>
>    [1]    chr17 [7577539, 7577539]      + |         0         0        27
>    [2]    chr17 [7577540, 7577540]      + |         0         0        27
>                T         N     depth
>        <integer> <integer> <numeric>
>    [1]         0         0        27
>    [2]         0         0        27
>                                                            bam
>                                                    <character>
>    [1] bamfiles/SLX-6363.FLD0187.1093.s_1.bwa.homo_sapiens.bam
>    [2] bamfiles/SLX-6363.FLD0187.1093.s_1.bwa.homo_sapiens.bam
>    ---
>    seqlengths:
>     chr17
>        NA
>
> I'm struggling to see why the pileup using the whole gr2 object as an
> input isn't working.
>
> Thanks a lot,
>
> Mark
>
>> sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>   [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>   [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>   [7] LC_PAPER=C                 LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] gdata_2.12.0         biovizBase_1.6.2     Rsamtools_1.10.2
> [4] Biostrings_2.26.3    GenomicRanges_1.10.7 IRanges_1.16.6
> [7] BiocGenerics_0.4.0
>
> loaded via a namespace (and not attached):
>   [1] AnnotationDbi_1.20.7   Biobase_2.18.0         biomaRt_2.14.0
>   [4] bitops_1.0-5           BSgenome_1.26.1        cluster_1.14.2
>   [7] colorspace_1.2-1       DBI_0.2-5              dichromat_2.0-0
> [10] GenomicFeatures_1.10.2 grid_2.15.2            gtools_2.7.0
> [13] Hmisc_3.10-1           labeling_0.1           lattice_0.20-10
> [16] munsell_0.4            parallel_2.15.2        plyr_1.8
> [19] RColorBrewer_1.0-5     RCurl_1.95-4.1         RSQLite_0.11.2
> [22] rtracklayer_1.18.2     scales_0.2.3           stats4_2.15.2
> [25] stringr_0.6.2          tools_2.15.2           XML_3.95-0.2
> [28] zlibbioc_1.4.0
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



More information about the Bioconductor mailing list