[BioC] Best practices to find intersection among variants

Michael Lawrence lawrence.michael at gene.com
Tue Aug 26 03:01:33 CEST 2014


Just to clarify, you'll need to get a VRanges via

vr <- as(vcf, "VRanges")

Michael


On Mon, Aug 25, 2014 at 5:13 PM, Julian Gehring <julian.gehring at embl.de>
wrote:

> Hi Marco,
>
> Essentially, you want to find a intersection with both the locus and the
> ref/alt alleles matching.  The 'VariantAnnotation' has the 'match' function
> which does this (matching the alt allele).  Here is some of my code for
> getting the common variants from two 'VRanges' objects:
>
>   intersectVRanges <- function(x, y) {
>       m = match(x, y)
>       s1 = x[!is.na(m)]
>       s2 = y[m[!is.na(m)]]
>       res = list(s1, s2)
>       return(res)
>   }
>
> Best
> Julian
>
>
> On 25.08.2014 16:35, Blanchette, Marco wrote:
>
>> I am very very new to working with variants, maybe this question is very
>> basic but I need to get kickstarted a bit�
>>
>> Just ran an analysis to find the common variation in a set of lab strains
>> used in house in the haploid genomes of S. pombe. I used GATK best
>> practices and I am at the stage where I have filtered variants and I would
>> like to find the common ones. My first intuition for this was to turn to R
>> (tired to running Java command line�) and I fumbled on VariantAnnotation
>> which uses my favorite object, GRanges, under the hood. So I should be good.
>>
>> However, I can�t seem to figure out how to create intersect and I am a
>> bit nervous as I want to find the common variants, not just the location (I
>> can see that I could extract the Granges and do overlap operation on them,
>> but then, I run into the danger of losing the variant information�).
>>
>> Could anyone provide a simple workflow to get me started? I could provide
>> a basic starting code but it would only be restricted to loading the
>> VariantAnnotation package and reading two vcf files� so I figured that I
>> would not add it�
>>
>> Thanks a bunch and sorry for the very vcf newby question.
>>
>>
>> --  Marco Blanchette, Ph.D.
>> Genomic Scientist
>> Stowers Institute for Medical Research
>> 1000 East 50th Street
>> Kansas City MO 64110
>> www.stowers.org
>>
>> Tel: 816-926-4071
>> Cell: 816-726-8419
>> Fax: 816-926-2018
>>
>>         [[alternative HTML version deleted]]
>>
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.
>> science.biology.informatics.conductor
>>
>>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.
> science.biology.informatics.conductor
>

	[[alternative HTML version deleted]]



More information about the Bioconductor mailing list