[Bioc-devel] VariantAnnotation::isDelins() ??

Valerie Obenchain vobencha at fredhutch.org
Thu Feb 12 00:15:10 CET 2015


Thanks Robert! It's been added to 1.13.29.

Valerie

On 02/11/2015 02:36 AM, Robert Castelo wrote:
> sure, i'm attaching the patch created from a fresh checkout of the trunk
> this morning. in principle, all required bits are there and it builds
> and checks without errors and warnings.
>
> cheers,
>
> robert.
>
> On 02/10/2015 07:37 PM, Valerie Obenchain wrote:
>> Hi Robert,
>>
>> This sounds like a good addition. I'll put it on the TODO. If you need
>> this immediately I'd be happy to accept a patch (with unit tests).
>>
>> Valerie
>>
>>
>>
>> On 02/10/2015 06:29 AM, Robert Castelo wrote:
>>> hi,
>>>
>>> in the VariantAnnotation package, the help of the functions for
>>> identifying variant types such as SNVs, insertions,
>>> deletions, transitions, and structural rearrangements gives the
>>> following definitions:
>>>
>>>
>>> • isSNV: Reference and alternate alleles are both a single
>>> nucleotide long.
>>>
>>> • isInsertion: Reference allele is a single nucleotide and the
>>> alternate allele is greater (longer) than a single nucleotide
>>> and the first nucleotide of the alternate allele matches the
>>> reference.
>>>
>>> • isDeletion: Alternate allele is a single nucleotide and the
>>> reference allele is greater (longer) than a single nucleotide
>>> and the first nucleotide of the reference allele matches the
>>> alternate.
>>>
>>> • isIndel: The variant is either a deletion or insertion as
>>> determined by ‘isDeletion’ and ‘isInsertion’.
>>>
>>> • isSubstitution: Reference and alternate alleles are the same
>>> length (1 or more nucleotides long).
>>>
>>> • isTransition: Reference and alternate alleles are both a
>>> single nucleotide long. The reference-alternate pair
>>> interchange is of either two-ring purines (A <-> G) or
>>> one-ring pyrimidines (C <-> T).
>>>
>>>
>>> however, unless I'm missing something here, these definitions do not
>>> cover indels where an insertion involves more than one reference
>>> nucleotide or a deletion involves more than one alternate nucleotide.
>>> this could be an example of what i'm trying to say:
>>>
>>> library(VariantAnnotation)
>>>
>>> vr <- VRanges(seqnames = rep("chr1", times=5),
>>>               ranges = IRanges(seq(1, 10, by=20),
>>>                                seq(1, 10, by=20)+c(1, 1, 2, 2, 3)),
>>>               ref = c("T", "A", "A", "AC", "AC"),
>>>               alt = c("C", "T", "AC", "AT", "ACC"),
>>>               refDepth = c(5, 10, 5, 10, 5),
>>>               altDepth = c(7, 6, 7, 6, 7),
>>>               totalDepth = c(12, 17, 12, 17, 12),
>>>               sampleNames = letters[1:5])
>>>
>>> isSNV(vr)
>>> ## [1] TRUE TRUE FALSE FALSE FALSE
>>> isIndel(vr)
>>> ## [1] FALSE FALSE TRUE FALSE FALSE
>>> isSubstitution(vr)
>>> ## [1] TRUE TRUE FALSE TRUE FALSE
>>>
>>> note that the last variant does not evaluate as true for any of the
>>> three possibilities. after looking for variant definitions, i have found
>>> that the Human Genome Variation Society (HGVS) describes this as a
>>> deletion followed by an insertion and calls it "indel" or "delins" (it's
>>> unclear to me whether they use that interchangeably), see the link here:
>>>
>>> http://www.hgvs.org/mutnomen/recs-DNA.html#indel
>>>
>>> the only other site I could quickly find with Google, where some
>>> specific definition is given is the site of the software SnpEff, which
>>> calls it "MIXED", a "Multiple-nucleotide and an InDel":
>>>
>>> http://snpeff.sourceforge.net/SnpEff_manual.html
>>>
>>> I would suggest that VariantAnnotation should try to identify this type
>>> of variant. following the HGVS recommendations, could we maybe have a
>>> function for it called isDelins() ??
>>>
>>>
>>>
>>> cheers,
>>>
>>> robert.
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>
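
For readers following the thread, the definitions quoted above can be sketched in base R on plain character vectors, without VRanges objects. This is only an illustration under stated assumptions: `classify_variant` is a hypothetical helper, not a VariantAnnotation function, and treating "delins" as whatever is neither an insertion, a deletion, nor a substitution is an assumption made here, not necessarily how isDelins() was implemented in 1.13.29.

```r
# Hypothetical helper (not part of VariantAnnotation): classify ref/alt
# allele pairs using the definitions quoted from the help page.
classify_variant <- function(ref, alt) {
  nref <- nchar(ref)
  nalt <- nchar(alt)
  # Insertions and deletions require the first nucleotides to match.
  first_match <- substr(ref, 1, 1) == substr(alt, 1, 1)

  is_snv <- nref == 1 & nalt == 1
  is_ins <- nref == 1 & nalt > 1 & first_match
  is_del <- nalt == 1 & nref > 1 & first_match
  is_sub <- nref == nalt  # same length, 1 or more nucleotides

  # Assumption: a delins is anything left over, i.e. alleles of different
  # length that do not fit the simple insertion or deletion definitions.
  is_delins <- !is_ins & !is_del & !is_sub

  data.frame(ref = ref, alt = alt,
             snv = is_snv,
             indel = is_ins | is_del,
             substitution = is_sub,
             delins = is_delins)
}

# Same alleles as in the VRanges example from the original message.
ref <- c("T", "A", "A", "AC", "AC")
alt <- c("C", "T", "AC", "AT", "ACC")
res <- classify_variant(ref, alt)
print(res)
```

With these alleles, only the last pair (AC to ACC) falls into the leftover category, which is the case the original message reports as not covered by isSNV(), isIndel() or isSubstitution().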


