[BioC] dbSNP strand information
Alex Gutteridge
alexg at ruggedtextile.com
Tue Feb 28 23:23:20 CET 2012
I notice the GRanges returned by the dbSNP packages have strand '*'.
Does anyone know how safe I am in assuming that the variant alleles also
given by the package actually correspond to the '+' strand? This seems
to be the case for the 20 or so I have checked manually, but maybe I
have just been lucky.
I ask this in the context of trying to use predictCoding in the
VariantAnnotation package to find coding SNPs. For SNPs in genes on the
'-' strand I have found that I have to complement the alleles given by
dbSNP to get the correct result. I just want to make sure that treating
the alleles as '+' strand is reasonable in the vast majority (say >99%)
of cases.
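
To make the complementing step concrete, here is a minimal sketch of what I
mean, written in plain Python rather than R so it does not depend on any
Bioconductor API. It assumes the alleles are given as single IUPAC
nucleotide codes reported on the '+' strand (which is exactly the
assumption I am asking about). The function name alleles_for_gene_strand
is hypothetical and not part of any package.

```python
# Complement table for IUPAC nucleotide codes.
# A<->T, C<->G, R<->Y, K<->M, B<->V, D<->H; S, W and N are self-complementary.
_COMPLEMENT = str.maketrans("ACGTRYKMBVDHSWN", "TGCAYRMKVBHDSWN")


def alleles_for_gene_strand(iupac_code, gene_strand):
    """Return the allele code as read on the gene's strand.

    Assumption: iupac_code is reported relative to the '+' strand.
    A SNP is a single position, so only complementing is needed
    (no reversal) when the gene lies on the '-' strand.
    """
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    code = iupac_code.upper()
    if gene_strand == "-":
        return code.translate(_COMPLEMENT)
    return code


# Example: an A/G SNP (code R) in a gene on the '-' strand reads as C/T (code Y).
print(alleles_for_gene_strand("R", "-"))
```

If the '+' strand assumption is wrong for a given SNP, this step would
silently produce the wrong alleles, which is why the failure rate matters.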
I realise from my reading of the SNPlocs.Hsapiens.dbSNP.20110815 manual
that some SNPs will be incorrect anyway (it mentions ~0.1% of SNPs not
mapping to the reference at all). That level of failure is acceptable,
but anything higher would be a worry.
--
Alex Gutteridge