[BioC] Strand information for dbSNP packages
Alex Gutteridge
alexg at ruggedtextile.com
Tue Feb 28 12:03:11 CET 2012
I notice the GRanges returned by the dbSNP packages have strand '*'.
Does anyone know how safe I am in assuming that the variant alleles also
given by the package actually correspond to the '+' strand?
I ask this in the context of trying to use predictCoding in the
VariantAnnotation package to find coding SNPs. For SNPs in genes on the
'-' strand I have found that I have to complement the alleles given by
dbSNP to get the correct result. I just want to make sure that assuming
the alleles are from the '+' strand is a reasonable assumption in the
vast majority (>99%) of cases.
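To make the complementing step concrete, here is a minimal sketch of the kind of thing I mean. It uses only base R. The helper name complement_alleles and the example vectors are made up for illustration, and it assumes the alleles are stored as single-letter IUPAC ambiguity codes; it is not taken from either package.

```r
# Hypothetical helper: complement IUPAC allele codes. For single-base SNPs
# no reversal is needed. S, W and N are their own complements, so they are
# left unchanged.
complement_alleles <- function(alleles) {
  chartr("ACGTRYKMBDHV", "TGCAYRMKVHDB", alleles)
}

# Made-up example data: allele codes and the strand of the overlapping gene.
alleles     <- c("R", "Y", "S", "K")
gene_strand <- c("+", "-", "-", "-")

# Complement only where the gene is on the '-' strand, on the assumption
# that the dbSNP alleles are reported relative to the '+' strand.
fixed <- ifelse(gene_strand == "-", complement_alleles(alleles), alleles)
```

If the '+' strand assumption is wrong for a given SNP, this step would silently produce the wrong alleles, which is why I want to know how often that happens.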
I realise from my reading of the SNPlocs.Hsapiens.dbSNP.20110815 manual
that some SNPs will be incorrect anyway (it mentions ~0.1% of SNPs not
mapping to the reference at all). That level of failure is acceptable,
but anything higher would be a worry.
--
Alex Gutteridge