[BioC] seqlevels in VCF objects
Murat Tasan
mmuurr at gmail.com
Tue Jan 15 04:49:40 CET 2013
ah, great, thanks!
somehow i was intuitively expecting that i'd have to set seqlevels on
the 'info' sub-object, rather than the VCF object itself.
thanks much!
-m
On Mon, Jan 14, 2013 at 9:25 PM, Valerie Obenchain <vobencha at fhcrc.org> wrote:
> On 01/14/13 15:40, Murat Tasan wrote:
>>
>> hi all - i've encountered an annoying problem, and i'd like to avoid
>> read/writing the many GBs required for the blunt-force solution...
>>
>> the 1000 Genomes project provides a collection of VCF files providing
>> the genotypes for all found variants.
>> after reading in the VCF files (via vcf<- readVcf(...)), i have an
>> VCF object, but the info(vcf) object reveals the chromosome names
>> (i.e. 'seqlevels') are "1", "2", ..., "X", "Y".
>> Bioconductor's TxDb.Hsapiens.UCSC.hg19.knownGene object, however, uses
>> the UCSC standard prefix for chromosome names: "chr1", "chr2", etc.
>>
>> in trying to subset(...) or predictCoding(...) the VCF data against
>> the genome objects (including BSgenome.Hsapiens.UCSC.hg19) this causes
>> an obvious failure.
>>
>> i tried re-setting the seqlevels of the VCF 'info' object like so
>> (thinking the seqnames factor just indexes back on the seqlevels as a
>> key):
>>
>> seqlevels(vcf at info)<- sprintf("chr%s", seqlevels(vcf at info))
>>
>> but this doesn't seem to have any effect.
>>
>> any idea on how to make this bulk change of seqnames for data in VCF
>> objects?
>
>
>
> There are examples of how to do this in the vignette and on the ?VCF and
> ?predictCoding man pages.
> In ?VCF there is a section in the examples dedicated to "Renaming seqlevels
> and subsetting".
> Renaming all seqlevels in a VCF can be done with renameSeqlevels() or
> seqlevels() and subsetting a VCF on
> seqlevels can be done with keepSeqlevels().
>
>> fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
>> vcf <- readVcf(fl, "hg19")
>> seqlevels(vcf)
> [1] "20"
>> seqlevels(vcf) <- "chr20"
>> seqlevels(vcf)
> [1] "chr20"
>
>
> Valerie
>
>
>>
>> cheers,
>>
>> -m
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
More information about the Bioconductor
mailing list