[BioC] VariantAnnotation: Specifying 'seqinfo' at import with 'readVcf'
Valerie Obenchain
vobencha at fhcrc.org
Tue Sep 24 18:31:00 CEST 2013
Hi Julian,
On 09/24/2013 02:29 AM, Julian Gehring wrote:
> Hi,
>
> Is there a direct way to specify the 'seqinfo' of a genome for the
> import of a VCF file using 'readVcf'?
I think the question is how to read in a subset of chromosomes/positions
from a VCF file without an accompanying tabix index. You can't.
readVcf() requires an index when subsets are defined by
chromosome/position. However you can read in subsets defined by INFO
and/or GENO fields without an index.
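For the index-free case, a minimal sketch along these lines should work. It
assumes the chr22.vcf.gz example file shipped with VariantAnnotation and picks
the LDAF and GT fields purely for illustration; substitute your own file and
field names.

```r
library(VariantAnnotation)

# Example VCF shipped with the package (an assumption for illustration).
fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")

# Restrict the import to one INFO field and one GENO field.
# No 'which' is given, so no chromosome/position subsetting is requested.
param <- ScanVcfParam(info = "LDAF", geno = "GT")
vcf <- readVcf(fl, "hg19", param)

colnames(info(vcf))  # INFO columns that were imported
names(geno(vcf))     # genotype matrices that were imported
```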
Approaches:
(1) create index with ?indexTabix and specify 'which' in ScanVcfParam
(2) use ?filterVcf to write out a new file of records of interest
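Approach (1) might look roughly like this. The example file, the temporary
output path, and the ranges on chromosome 22 are assumptions for illustration
only:

```r
library(VariantAnnotation)
library(Rsamtools)

# Example VCF shipped with the package (an assumption for illustration).
fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")

# Compress with bgzip into a temporary file and build a tabix index for it.
compressed <- bgzip(fl, tempfile())
idx <- indexTabix(compressed, format = "vcf")
tab <- TabixFile(compressed, idx)

# Ranges of interest; 'which' is what requires the index.
rngs <- GRanges("22", IRanges(c(50301422, 50989541), c(50312106, 51001424)))
param <- ScanVcfParam(which = rngs)

# Only records overlapping 'rngs' are read.
vcf <- readVcf(tab, "hg19", param)
vcf
```

The same pattern applies with a GRanges built from a Seqinfo object, as in
the readVcf() call quoted below, once the index exists.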
> I'm aware that one can change it
> with the 'seqinfo' method afterwards, but for large VCF files this can
> take a significant amount of time.
What operation is taking a long time? Subsetting the VCF object by
chromosome?
Valerie
>
> An alternative would be to sneak it in by the 'which' arguments, such as:
>
> readVcf(file, genome, ScanVcfParam(which = as(seq_info, "GRanges")))
>
> but this requires the file to be indexed beforehand.
>
> Best wishes
> Julian
>