[Bioc-devel] VariantAnnotation::isDelins() ??
Robert Castelo
robert.castelo at upf.edu
Tue Feb 10 15:29:46 CET 2015
hi,
in the VariantAnnotation package, the help of the functions for
identifying variant types such as SNVs, insertions,
deletions, transitions, and structural rearrangements gives the
following definitions:
• isSNV: Reference and alternate alleles are both a single
nucleotide long.
• isInsertion: Reference allele is a single nucleotide and the
alternate allele is greater (longer) than a single nucleotide
and the first nucleotide of the alternate allele matches the
reference.
• isDeletion: Alternate allele is a single nucleotide and the
reference allele is greater (longer) than a single nucleotide
and the first nucleotide of the reference allele matches the
alternate.
• isIndel: The variant is either a deletion or insertion as
determined by ‘isDeletion’ and ‘isInsertion’.
• isSubstitution: Reference and alternate alleles are the same
length (1 or more nucleotides long).
• isTransition: Reference and alternate alleles are both a
single nucleotide long. The reference-alternate pair
interchange is of either two-ring purines (A <-> G) or
one-ring pyrimidines (C <-> T).
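to make these definitions concrete, here is a minimal base-R sketch that applies them to plain character vectors of reference and alternate alleles. the is_* helper names are hypothetical and written only from the help-page wording above; they are not VariantAnnotation's implementation and they do not operate on VRanges objects.

```r
# Hypothetical helpers mirroring the help-page wording (not the package's code).
is_snv <- function(ref, alt) {
  nchar(ref) == 1 & nchar(alt) == 1
}
is_insertion <- function(ref, alt) {
  # single-nucleotide ref, longer alt, alt starts with the ref nucleotide
  nchar(ref) == 1 & nchar(alt) > 1 & substr(alt, 1, 1) == ref
}
is_deletion <- function(ref, alt) {
  # single-nucleotide alt, longer ref, ref starts with the alt nucleotide
  nchar(alt) == 1 & nchar(ref) > 1 & substr(ref, 1, 1) == alt
}
is_indel <- function(ref, alt) {
  is_insertion(ref, alt) | is_deletion(ref, alt)
}
is_substitution <- function(ref, alt) {
  nchar(ref) == nchar(alt)
}
is_transition <- function(ref, alt) {
  # purine <-> purine (A/G) or pyrimidine <-> pyrimidine (C/T)
  is_snv(ref, alt) & paste0(ref, alt) %in% c("AG", "GA", "CT", "TC")
}

# made-up alleles: a transition, an insertion and a deletion
ref <- c("T", "A", "AC")
alt <- c("C", "AC", "A")
is_snv(ref, alt)
## [1]  TRUE FALSE FALSE
is_indel(ref, alt)
## [1] FALSE  TRUE  TRUE
is_transition(ref, alt)
## [1]  TRUE FALSE FALSE
```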
however, unless I'm missing something here, these definitions do not
cover insertions whose reference allele, or deletions whose alternate
allele, is longer than a single nucleotide. this could be an example of
what i'm trying to say:
library(VariantAnnotation)
## five example variants: two SNVs, an insertion, a 2-nt substitution and AC>ACC
vr <- VRanges(seqnames = rep("chr1", times=5),
              ranges = IRanges(seq(1, 100, by=20),
                               seq(1, 100, by=20)+c(1, 1, 2, 2, 3)),
              ref = c("T", "A", "A", "AC", "AC"),
              alt = c("C", "T", "AC", "AT", "ACC"),
              refDepth = c(5, 10, 5, 10, 5),
              altDepth = c(7, 6, 7, 6, 7),
              totalDepth = c(12, 17, 12, 17, 12),
              sampleNames = letters[1:5])
isSNV(vr)
## [1] TRUE TRUE FALSE FALSE FALSE
isIndel(vr)
## [1] FALSE FALSE TRUE FALSE FALSE
isSubstitution(vr)
## [1] TRUE TRUE FALSE TRUE FALSE
note that the last variant does not evaluate as true for any of the
three possibilities. after looking for variant definitions, i have found
that the Human Genome Variation Society (HGVS) describes this as a
deletion followed by an insertion and calls it "indel" or "delins" (it's
unclear to me whether they use that interchangeably), see the link here:
http://www.hgvs.org/mutnomen/recs-DNA.html#indel
the only other site I could quickly find with Google that gives a
specific definition is that of the SnpEff software, which
calls it "MIXED", a "Multiple-nucleotide and an InDel":
http://snpeff.sourceforge.net/SnpEff_manual.html
I would suggest that VariantAnnotation should try to identify this type
of variant. following the HGVS recommendations, could we maybe have a
function for it called isDelins() ??
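as a rough illustration of what such a function might check, here is a minimal base-R sketch on plain character vectors. is_indel() and is_delins() are hypothetical helpers, not part of VariantAnnotation, and the rule used (alleles of different length that are not a simple insertion or deletion as defined in the help page) is only one possible reading of the HGVS description:

```r
# Hypothetical helpers on plain character vectors (not the package's API).
is_indel <- function(ref, alt) {
  ins <- nchar(ref) == 1 & nchar(alt) > 1 & substr(alt, 1, 1) == ref
  del <- nchar(alt) == 1 & nchar(ref) > 1 & substr(ref, 1, 1) == alt
  ins | del
}

# Assumed rule: alleles differ in length, yet the variant is not a simple indel.
is_delins <- function(ref, alt) {
  nchar(ref) != nchar(alt) & !is_indel(ref, alt)
}

# same alleles as in the VRanges example
ref <- c("T", "A", "A", "AC", "AC")
alt <- c("C", "T", "AC", "AT", "ACC")
is_delins(ref, alt)
## [1] FALSE FALSE FALSE FALSE  TRUE
```

under this assumed rule only the last variant (AC > ACC) is flagged, which is the one that none of the existing predicates picks up.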
cheers,
robert.