[Bioc-devel] [VariantAnnotation] segfault thrown when scan header for VCF files out of GATK pipeline
Martin Morgan
mtmorgan at fhcrc.org
Tue Feb 19 20:58:10 CET 2013
On 02/19/2013 11:44 AM, Tengfei Yin wrote:
> Hi,
>
> I am working with VCF files from a GATK pipeline (usually around
> 400 MB per file), but I have run into problems importing them into R
> with the VariantAnnotation package. This happens with both the release
> and devel versions of VariantAnnotation; I only provide sessionInfo for
> the devel branch. I may not have permission to share the data, so I am
> posting here first to see whether there is an obvious answer I don't know
> yet. If data and a reproducible example are needed, I can work on that.
>
> I don't know whether this is an issue in R, samtools, or GATK, and I have
> no problem importing VCF files produced by a bcftools-only pipeline.
>
> If you need any details, such as the pipeline commands or the versions of
> other software, please let me know. Thanks a lot.
Probably a problem in the Rsamtools C code that depends only on the header of
the VCF file, maybe triggered by garbage collection. You could debug further
yourself by setting, in ~/.R/Makevars,

    CFLAGS = -g -O0

then installing Rsamtools from source

    biocLite("Rsamtools", type="source")

and finally running a minimal test script under either valgrind

    R -d valgrind -f test.R

or gdb

    R -d gdb -f test.R
    (gdb) run

Wait for the segfault to occur, then

    (gdb) bt

to get a backtrace. Feel free to share the output of either with me off-list,
or, if possible, share just the header data from the VCF.
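
One possible way to pull out just the header for sharing is sketched below. It assumes an uncompressed VCF in which the header is the leading run of lines starting with '#'. The function name vcf_header and the output name header.vcf are illustrative, not part of any package.

```shell
# Print only the header of a VCF file: the leading lines that start with '#'
# (the '##' meta-information lines plus the '#CHROM' column line).
# awk exits at the first record line, so a large file is not read in full.
vcf_header() {
    awk '/^#/ { print; next } { exit }' "$1"
}

# Hypothetical usage: Adams.vcf is the file from the report,
# header.vcf is an illustrative output name.
# vcf_header Adams.vcf > header.vcf
```

If the problem really depends only on the header, calling scanVcfHeader() on the extracted file should reproduce the segfault, and that small file may be easier to share than the full data.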
Martin
>
> Tengfei
>
>> scanVcfHeader("~/GATK-64/GATK_AUTOMATION/VCF/Adams.vcf")
> Adams.vcf
>> hdr = scanVcfHeader("~//GATK-64/GATK_AUTOMATION/VCF/Adams.vcf")
>
> *** caught segfault ***
> address (nil), cause 'memory not mapped'
>
> Traceback:
> 1: .Call(.scan_bcf_header, .extptr(file))
> 2: scanBcfHeader(bf)
> 3: scanBcfHeader(bf)
> 4: (function (file, mode) { bf <- open(BcfFile(file, character(0),
> ...)) on.exit(close(bf)) scanBcfHeader(bf)})(dots[[1L]][[1L]])
> 5: mapply(FUN = f, ..., SIMPLIFY = FALSE)
> 6: .Method(..., f = f)
> 7: eval(expr, envir, enclos)
> 8: eval(.dotsCall, env)
> 9: eval(.dotsCall, env)
> 10: standardGeneric("Map")
> 11: Map(function(file, mode) { bf <- open(BcfFile(file, character(0),
> ...)) on.exit(close(bf)) scanBcfHeader(bf)}, file, ...)
> 12: scanBcfHeader(file, ...)
> 13: scanBcfHeader(file, ...)
> 14: scanVcfHeader("~/GATK-64/GATK_AUTOMATION/VCF/Adams.vcf")
> 15: scanVcfHeader("~/GATK-64/GATK_AUTOMATION/VCF/Adams.vcf")
>
>
> My sessioninfo
> R Under development (unstable) (2013-02-17 r61981)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] VariantAnnotation_1.5.38 Rsamtools_1.11.16 Biostrings_2.27.11
> [4] GenomicRanges_1.11.29 IRanges_1.17.32 BiocGenerics_0.5.6
>
> loaded via a namespace (and not attached):
> [1] AnnotationDbi_1.21.10 Biobase_2.19.2 biomaRt_2.15.0
> [4] bitops_1.0-5 BSgenome_1.27.1 DBI_0.2-5
> [7] GenomicFeatures_1.11.11 RCurl_1.95-3 RSQLite_0.11.2
> [10] rtracklayer_1.19.9 stats4_3.0.0 tools_3.0.0
> [13] XML_3.95-0.1 zlibbioc_1.5.0
>
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793