[BioC] need additional sanity checks in TabixFile or readVcf
Valerie Obenchain
vobencha at fhcrc.org
Wed Oct 2 20:22:37 CEST 2013
Thanks for the example.
In 1.7.52 (devel) an error is now thrown if the supplied file is
not compressed. Ideally we'd confirm the file is compressed when
creating the TabixFile object, but as far as I know there isn't a
fast, easy way to check that. The next best thing is to capture the error
when scanTabix() tries to read an uncompressed file. Here is the error
you will see:
> vcf <- readVcf(tab, "hg19",myparam)
Error: scanVcf: scanTabix: read line failed, corrupt or invalid file?
path: /home/vobencha/R/library/VariantAnnotation/extdata/ex2.vcf
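
If you want to guard against this on your side before calling
TabixFile(), one possible lightweight check is to look at the first
few bytes of the file. This is only a sketch: looksBgzipped() is a
hypothetical helper, not part of Rsamtools or VariantAnnotation, and
it assumes the BGZF "BC" extra subfield comes first in the gzip
header, which is what bgzip normally writes but is not guaranteed.

```r
## Hypothetical helper (not an Rsamtools/VariantAnnotation function).
## Returns TRUE if the file starts with what looks like a BGZF block
## header: gzip magic (1f 8b), deflate (08), FEXTRA flag (04), and the
## "BC" subfield id at bytes 13-14. Assumes "BC" is the first subfield.
looksBgzipped <- function(path) {
    ## readBin() on a file name opens it in binary mode, so no
    ## decompression is attempted
    hdr <- readBin(path, what = "raw", n = 14L)
    if (length(hdr) < 14L)
        return(FALSE)
    gzipStart <- identical(hdr[1:4], as.raw(c(0x1f, 0x8b, 0x08, 0x04)))
    bgzfTag <- identical(hdr[13:14], charToRaw("BC"))
    gzipStart && bgzfTag
}
```

For example, stopifnot(looksBgzipped(path)) before constructing the
TabixFile would catch a plain-text VCF early. It only inspects the
header, so it says nothing about whether the index matches the file.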
Valerie
On 10/01/2013 10:46 AM, Jeremy Leipzig wrote:
> Valerie,
> Thanks. You will get the same cryptic error if you use the wrong file as
> the TabixFile.
>
> vcfFile <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
> from <- vcfFile
> to <- tempfile()
> compressVcf <- bgzip(from, to)
> idx <- indexTabix(compressVcf, "vcf")
> rng <- GRanges(seqnames="20", ranges=IRanges(start=c(14000,
> 1230000),end=c(15000,1240000)))
> myparam<-ScanVcfParam(which=rng)
>
> # incorrect - the idx file does not match up with the vcf here; it
> # matches up with the compressed one
> tab <- TabixFile(vcfFile,idx)
> vcf <- readVcf(tab, "hg19",myparam)
> Error in lapply(names(vcf[[1]]), function(elt) { :
> error in evaluating the argument 'X' in selecting a method for
> function 'lapply': Error in vcf[[1]] : subscript out of bounds
>
> #correct
> tab <- TabixFile(compressVcf,idx)
> vcf <- readVcf(tab, "hg19",myparam)
>
> Regards,
> Jeremy
>