[BioC] Excessive memory requirements of PING or bug?

Dan Tenenbaum dtenenba at fhcrc.org
Fri May 18 20:56:53 CEST 2012


I'm cc'ing one of the PING maintainers who can perhaps shed more light on this.
Dan
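
For scale, the size in the error message quoted below can be checked with a little arithmetic. The following is only a rough sketch in R, not part of PING; it assumes that the "Gb" printed by R stands for 2^30 bytes, and the variable names are made up for illustration.

```r
# Size reported in the quoted error message, and the memory available
# on the reporter's machine (both taken from the message below).
reported_gb  <- 68719476735.9
available_gb <- 48

# How many times larger than the available memory is the request?
ratio <- reported_gb / available_gb

# ASSUMPTION: "Gb" here means 2^30 bytes, so adding 30 to the base-2
# logarithm of the Gb value gives the base-2 logarithm of the byte count.
log2_gb    <- log2(reported_gb)
log2_bytes <- log2_gb + 30

cat(sprintf("request / available: %.3g\n", ratio))
cat(sprintf("log2(size in Gb):    %.6f\n", log2_gb))
cat(sprintf("log2(size in bytes): %.6f\n", log2_bytes))
```

Under that assumption the request is more than a billion times the 48 Gb available and corresponds to roughly 2^66 bytes, more than a 64-bit machine can address. That would be consistent with a miscomputed allocation size (for example an overflowed or negative count) rather than a genuine memory need, but the numbers alone cannot confirm this.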


On Thu, May 17, 2012 at 2:55 PM, Lars Hennig <Lars.Hennig at slu.se> wrote:
> Dear PING maintainers,
>
> Running PING with the example from the vignette works fine, but segmentReads fails with a "cannot allocate memory block of size 68719476735.9 Gb" error on my own ChIP-seq data (16 million paired-end reads mapped with bowtie). This is an Arabidopsis sample (genome size = 130 Mb).
> A subset of 100,000 of our own reads runs smoothly, but 2.5 million reads crash with a similarly large memory request. Loading snowfall or not makes no difference.
>
> Is there a way to get PING to process more than a few hundred thousand reads with "normal" memory (I have 48 GB available)? If PING really needs this much memory, this could be mentioned in the documentation.
>
> Thank you very much,
>
> Lars
>
> Script:
>
> library(ShortRead)
>
> reads <- readAligned("reads_sorted.bam", type="BAM")
> reads <- reads[!is.na(position(reads))]
> reads <- reads[chromosome(reads) %in% c("Chr4")]
>
> #reads <- reads[1:100000]
>
> library(PING)
> library(snowfall)
> sfInit(parallel=TRUE,cpus=4)
> sfLibrary(PING)
>
>
> reads <- as(reads,"RangesList")
> reads <- as(reads,"RangedData")
> reads <- as(reads,"GenomeData")
>
> seg <- segmentReads(reads, minReads=5, maxLregion=1200, minLregion=80, jitter=TRUE)
>
>
>
>
> > traceback()
> 2: .Call("segReadsAll", data, dataC, start, end, as.integer(jitter),
>       paraSW, as.integer(maxStep), as.integer(minLregion), PACKAGE = "PING")
> 1: segmentReads(reads_gd, minReads = 5, maxLregion = 1200, minLregion = 80,
>       jitter = T)
>
>
> > sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C                 LC_NAME=C
> [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] snowfall_1.84       snow_0.3-9          PING_1.0.0
> [4] chipseq_1.6.0       ShortRead_1.14.3    latticeExtra_0.6-19
> [7] RColorBrewer_1.0-5  Rsamtools_1.8.4     lattice_0.20-6
> [10] BSgenome_1.24.0     Biostrings_2.24.1   GenomicRanges_1.8.6
> [13] IRanges_1.14.3      BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.16.0      biomaRt_2.12.0      bitops_1.0-4.1
> [4] GenomeGraphs_1.16.0 grid_2.15.0         hwriter_1.3
> [7] RCurl_1.91-1        stats4_2.15.0       tools_2.15.0
> [10] XML_3.9-4           zlibbioc_1.2.0
>
>
> Dr. Lars Hennig
> Professor of Genetics
> Swedish University of Agricultural Sciences
> Uppsala BioCenter
> Department of Plant Biology and Forest Genetics
> PO-Box 7080
> SE-75007 Uppsala, Sweden
> Lars.Hennig at vbsg.slu.se
> Tel. +46 18 67 3326
> Fax  +46 18 67 3389
>
> Visiting address:
> Uppsala BioCenter
> Almas Allé 5
> SE-75651 Uppsala, Sweden
> Room A-489


