[Bioc-devel] VariantAnnotation: Same locus, multiple samples

Michael Lawrence lawrence.michael at gene.com
Fri Dec 5 15:35:07 CET 2014


The two data structures do not encode the same information. Coercion to VCF
forms a rectangular matrix: (position + alt) by sample. VCF has no standard
way to encode that a given cell in that matrix is absent, so coercion back
to VRanges simply maps each cell to an element. One could imagine using the
"." missing data marker for every geno field, but that would be making too
many assumptions: I'm not sure an all-missing cell means the same thing as
an element not existing in a VRanges.
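
To illustrate the shape mismatch, here is a minimal base-R sketch. It is only
an analogy and does not use or reproduce the VariantAnnotation coercion code:
the names long, wide, back and present are made up for this example, and
treating an NA cell as "absent" is exactly the assumption questioned above.

```r
# Long format: one record per observed (variant, sample) pair,
# analogous to the elements of the VRanges.
long <- data.frame(
  variant = c("1:10_C/A", "1:10_C/G"),
  sample  = c("S1", "S2"),
  stringsAsFactors = FALSE
)

# Wide format: a variant-by-sample matrix, analogous to a geno field
# in a VCF. Cells that have no record simply stay NA.
wide <- matrix(NA, nrow = 2, ncol = 2,
               dimnames = list(unique(long$variant), unique(long$sample)))
wide[cbind(long$variant, long$sample)] <- TRUE

# Going back to long format enumerates every cell of the matrix.
# expand.grid varies its first factor fastest, which matches the
# column-major order of as.vector(wide).
back <- expand.grid(variant = rownames(wide), sample = colnames(wide),
                    stringsAsFactors = FALSE)

# ASSUMPTION: an NA cell means "no such element".
back$present <- !is.na(as.vector(wide))

nrow(long)                   ## 2
nrow(back)                   ## 4
nrow(back[back$present, ])   ## 2
```

The last line recovers the original two records only because every NA cell
is assumed to mean "absent", which the wide layout itself cannot guarantee.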

On Fri, Dec 5, 2014 at 1:18 AM, Julian Gehring <julian.gehring at embl.de>
wrote:

> Hi,
>
> Assume that we have two variants from two samples at the same locus,
> stored in a 'VRanges' or 'VCF' object:
>
>   library(VariantAnnotation)
>
>   vr = VRanges("1", IRanges(c(10, 10), width = 1),
>     ref = c("C", "C"), alt = c("A", "G"),
>     sampleNames = c("S1", "S2"))
>   vcf = as(vr, "VCF")
>
> If we convert the VCF back to a VRanges, we now get each variant for
> each sample:
>
>   vr2 = as(vcf, "VRanges")
>
>   length(vr) ## 2
>   length(vr2) ## 4
>
> It seems that the VCF object does not store the information of the
> 'sampleNames' in the first conversion.
>
> Best wishes
> Julian
