[BioC] need additional sanity checks in TabixFile or readVcf

Martin Morgan mtmorgan at fhcrc.org
Tue Oct 1 16:32:35 CEST 2013


On 10/01/2013 07:25 AM, Jeremy Leipzig wrote:
> I think the Bioconductor workflows for variant studies are great, but
> TabixFile or readVcf might need some additional sanity checks.
>
> If a careless user (not me of course - I'm asking for a friend) mistakenly
> assumes a VCF is the "TabixFile" and the vcf.bgz.tbi is the Tabix index,
> they will get this cryptic error when attempting to load certain ranges
> from a VCF file.
>
>> tab<-TabixFile(vcfFile,paste(vcfFile,"bgz.tbi",sep="."))
>> vcf<-readVcf(tab,genome="b37",param=some_param)
> Error in lapply(names(vcf[[1]]), function(elt) { :
>    error in evaluating the argument 'X' in selecting a method for
> function 'lapply': Error in vcf[[1]] : subscript out of bounds
>
>
> This will lead to a lot of futile debugging with the assumption that either
> the ranges or the vcf itself are corrupt, since loading the vcf without
> ranges will not rely on Tabix.
>
> The Tabix setup is especially prone to error since compressing the VCF just
> seems like an intermediate step and is often performed in the shell instead
> of within R. Something that tests whether a Tabix index is really
> associated with a "TabixFile" would be helpful.

Hi Jeremy -- I'm not 100% sure what your 'friend' did. Could you get him / her
to illustrate with a reproducible example, starting with, say,

   system.file(package="VariantAnnotation", "extdata", "ex2.vcf")

? Thanks, Martin
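
For reference, here is a rough sketch of the kind of check being asked for.
It is only an illustration, not anything that exists in Rsamtools or
VariantAnnotation: isBgzf() and checkTabixPair() are hypothetical helper
names. The sketch assumes that a bgzip-compressed file starts with the BGZF
block header (gzip magic bytes with the extra-field flag set, followed by a
'BC' subfield), and that the 'BC' subfield is the first extra field, as
bgzip writes it. A tabix index is itself BGZF-compressed, so the same test
is applied to the index file. The Bioconductor part runs only if
VariantAnnotation is installed.

```r
## Hypothetical helper: TRUE if 'path' exists and starts with a BGZF
## block header. raw = TRUE stops file() from transparently
## decompressing the stream.
isBgzf <- function(path) {
    if (!file.exists(path)) return(FALSE)
    con <- file(path, open = "rb", raw = TRUE)
    on.exit(close(con))
    hdr <- readBin(con, "raw", n = 14L)
    length(hdr) >= 14L &&
        identical(hdr[1:4], as.raw(c(0x1f, 0x8b, 0x08, 0x04))) &&
        identical(hdr[13:14], charToRaw("BC"))
}

## Hypothetical helper: fail early with a readable message instead of
## the 'subscript out of bounds' error reported above.
checkTabixPair <- function(file, index = paste(file, "tbi", sep = ".")) {
    if (!isBgzf(file))
        stop("'", file, "' is not bgzip-compressed; ",
             "run bgzip() and indexTabix() on it first")
    if (!isBgzf(index))
        stop("'", index, "' is missing or not BGZF-compressed, ",
             "so it cannot be a tabix index")
    invisible(TRUE)
}

if (requireNamespace("VariantAnnotation", quietly = TRUE)) {
    library(VariantAnnotation)
    library(Rsamtools)
    fl <- system.file(package = "VariantAnnotation", "extdata", "ex2.vcf")

    ## The mistake described above: a plain-text VCF used as the TabixFile
    print(tryCatch(checkTabixPair(fl), error = conditionMessage))

    ## The intended setup: compress, index, then open
    bgz <- bgzip(fl, tempfile(fileext = ".vcf.bgz"))
    idx <- indexTabix(bgz, format = "vcf")
    checkTabixPair(bgz, idx)
    tab <- TabixFile(bgz, idx)
}
```

The check only looks at the first few bytes of each file, so it is cheap
enough to run before every ranged readVcf() call. It does not verify that
the index was built from that particular file.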

>
> Thanks,
> Jeremy


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


