[BioC] Problem running summarizeOverlaps()
Martin Morgan
mtmorgan at fhcrc.org
Tue May 20 18:05:51 CEST 2014
On 05/20/2014 05:34 AM, Jessica Perry Hekman wrote:
> On 05/19/2014 09:32 PM, Martin Morgan wrote:
>> On 05/19/2014 06:55 PM, Jessica Perry Hekman wrote:
>>> I am working from
>>>
>>> http://bioconductor.org/packages/release/bioc/vignettes/gage/inst/doc/RNA-seqWorkflow.pdf
>>>
>
>>> gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union",
>>> ignore.strand=TRUE, single.end=TRUE, param=param)
>
>> Hi Jessica --
>>
>> I think that summarizeOverlaps is trying to evaluate your counting
>> algorithm on several different cores, but an error occurs. Try
>> running the commands above, and then immediately before
>> summarizeOverlaps evaluate
>>
>> options(mc.cores=1)
>> gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union",
>> ignore.strand=TRUE, single.end=TRUE, param=param)
>>
>> Hopefully this will at least make the error apparent, even if it might
>> still be cryptic.
>>
>> Please be sure to include the output of the command 'sessionInfo()'
>> after you have a problem; here's mine
>
> Ah yes! Very helpful! The error message after I added mc.cores=1 to my script is:
>
> Error: C stack usage is too close to the limit
>
> ...which is indeed much less cryptic. I am still not sure how to fix the
> problem, though!

I haven't seen this error before in the context of summarizeOverlaps, so it's a
bit puzzling. I'd first check that the files returned by

    fls <- list.files("../../bam/", pattern="fox-readgroups.bam$", full.names=TRUE)

all point to valid bam files, and that the bam files have indexes.
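
A rough sketch of that check, using base R only. The helper name checkBamFiles is made up for illustration (it is not part of Rsamtools), and it assumes each index sits next to its bam file as either 'x.bam.bai' or 'x.bai':

```r
# Hypothetical helper: for each bam path, report whether the file exists,
# is non-empty, and has an index file alongside it.
checkBamFiles <- function(fls) {
    size <- file.info(fls)$size
    # Two common index naming conventions (an assumption about your layout)
    bai1 <- sprintf("%s.bai", fls)
    bai2 <- sub("\\.bam$", ".bai", fls)
    data.frame(
        file = basename(fls),
        exists = file.exists(fls),
        nonEmpty = !is.na(size) & size > 0,
        hasIndex = file.exists(bai1) | file.exists(bai2),
        stringsAsFactors = FALSE
    )
}

fls <- list.files("../../bam/", pattern="fox-readgroups.bam$", full.names=TRUE)
checkBamFiles(fls)
```

Any row with a FALSE is worth a closer look. This only tests that the files are present; it does not confirm that they are well-formed bam files. If an index is missing, indexBam() in Rsamtools can create one for a coordinate-sorted bam file.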

You might then try adding a 'yieldSize' argument to the following line, starting
small (e.g., 100000). If summarizeOverlaps works at that size, move toward the
default (1000000); if it fails, try something smaller still.

    bamfls <- BamFileList(fls, yieldSize=100000)

Can you provide a little information about your system? It sounds like it's your
own machine, not a server. How much memory does it have?
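
Because the error is about the C stack, it might also be worth reporting what R itself sees. A minimal sketch using only base R; which of these values turn out to matter here is a guess on my part:

```r
# Size and estimated current usage of the C stack for this R process
# (in bytes; 'size' is NA where R cannot determine it).
Cstack_info()

# Operating system, kernel release and machine type.
Sys.info()[c("sysname", "release", "machine")]
```

Pasting that output alongside sessionInfo() would be enough to start with.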

Probably you'd get a different outcome with a more recent R / Bioconductor, but
I'm not sure whether the error would go away! My sense is that the problem with
installing through a package manager is that it (or you) ends up installing
non-default packages into a single system directory, and as a consequence that
directory contains a mix of different Bioconductor releases. A 'better practice'
is probably to

  a) remove any existing system-wide R installation and packages

  b) install R with only base packages as su, or (as I do) install R as a
     regular user (not su) in version-specific directories in your own user
     file system, e.g., ~mtmorgan/bin/R-3-1-branch/

  c) install any additional packages, via biocLite or otherwise, as a regular
     user, following R's prompt to create a version-specific directory in your
     own user hierarchy.
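
To see whether an existing installation already shows this kind of mix, something like the following might help. It is only a sketch with base R functions, and reading the 'Built' field this way is my assumption about what a mixed library looks like:

```r
# Where does this R session look for packages? With the layout above, the
# first entry would be a version-specific directory under your home directory.
.libPaths()

# Count installed packages by library directory and by the R version
# (major.minor) they were built under. Several versions within one
# directory suggest packages from different releases have been mixed.
pkgs <- installed.packages()
builtUnder <- sub("^([0-9]+\\.[0-9]+).*", "\\1", pkgs[, "Built"])
mixTable <- table(library = pkgs[, "LibPath"], builtUnder = builtUnder)
mixTable
```

A library directory with counts in more than one column is a candidate for cleaning out and reinstalling.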

Obviously this can be a rat's nest of problems, and should only be done
immediately before a big deadline or when you are feeling too productive and
need to scale back ;)

Martin

>
> sessionInfo() output:
>
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] leeBamViews_0.99.24
> [2] BSgenome_1.30.0
> [3] Rsamtools_1.14.3
> [4] Biostrings_2.30.1
> [5] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1
> [6] GenomicFeatures_1.14.5
> [7] AnnotationDbi_1.24.0
> [8] Biobase_2.22.0
> [9] GenomicRanges_1.14.4
> [10] XVector_0.2.0
> [11] IRanges_1.20.7
> [12] BiocGenerics_0.8.0
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.18.0 bitops_1.0-6 DBI_0.2-7 RCurl_1.95-4.1
> [5] RSQLite_0.11.4 rtracklayer_1.22.7 stats4_3.0.2 tools_3.0.2
> [9] XML_3.98-1.1 zlibbioc_1.8.0
>
> ...and I should have remembered that I am using an older version of R. What I am
> running is the latest version that my package manager has on offer. Last time I
> installed a more recent version separately from yum, it was a huge annoyance to
> keep the two separate versions on the system. Do you think updating R and
> Bioconductor (which appears to depend on the most recent R in order to upgrade)
> will help?
>
> Thanks very much,
> Jessica
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793