[BioC] Strand information for dbSNP packages

Hervé Pagès hpages at fhcrc.org
Wed Feb 29 02:35:50 CET 2012


Hi Alex,

On 02/28/2012 03:50 PM, Valerie Obenchain wrote:
> Hi Alex,
>
> On 02/28/2012 03:03 AM, Alex Gutteridge wrote:
>> I notice the GRanges returned by the dbSNP packages have strand '*'.
>> Does anyone know how safe I am in assuming that the variant alleles
>> also given by the package actually correspond to the '+' strand?

Yes, the alleles always correspond to the + strand. I should
clarify this in the man page. dbSNP reports a strand and a set of
alleles for each SNP, and the alleles it gives are relative to the
reported strand. However, when the SNPlocs.Hsapiens.dbSNP.20110815
package is built, the alleles for SNPs on the minus strand are
complemented so that they correspond to the '+' strand. So all SNPs are
treated as being on the + strand and everything is reported with
respect to that strand.
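
To make the complementing step concrete, here is a minimal sketch,
written in Python purely for illustration. It is not the code used to
build the package. It assumes the alleles are encoded as IUPAC
ambiguity letters (e.g. R for A/G), and the function name
to_plus_strand and its arguments are hypothetical:

```python
# Illustrative sketch only, not the package's build code.
# Assumption: alleles are encoded as IUPAC nucleotide ambiguity letters.

# Each IUPAC letter maps to the letter for the complemented base set,
# e.g. R (A/G) -> Y (T/C), M (A/C) -> K (T/G); W, S and N map to themselves.
_IUPAC_COMPLEMENT = str.maketrans("ACGTMRWSYKVHDBN", "TGCAKYWSRMBDHVN")


def to_plus_strand(alleles: str, reported_strand: str) -> str:
    """Return the alleles expressed relative to the '+' strand.

    `alleles` is a string of IUPAC letters as reported for the SNP;
    `reported_strand` is the strand ('+' or '-') they were reported on.
    """
    if reported_strand == "+":
        return alleles
    if reported_strand == "-":
        return alleles.translate(_IUPAC_COMPLEMENT)
    raise ValueError("reported_strand must be '+' or '-'")
```

A single-base SNP needs only the complement, not the reverse
complement, because there is no sequence order to reverse.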

Hope this helps and sorry for the confusion.

Cheers,
H.

>
> The dbSNP packages don't contain any strand information so it isn't safe
> to assume one strand or the other.
>>
>> I ask this in the context of trying to use predictCoding in the
>> VariantAnnotation package to find coding SNPs. For SNPs in genes on
>> the '-' strand I have found that I have to complement the alleles
>> given by dbSNP to get the correct result. I just want to make sure
>> that assuming the alleles are from the '+' strand is a reasonable
>> assumption in the vast majority (>99%) of cases.
>
> This does not seem right, I'll look into it.
>
> Valerie
>>
>> I realise from my reading of the SNPlocs.Hsapiens.dbSNP.20110815
>> manual that some SNPs will be incorrect anyway (it mentions ~0.1% of
>> SNPs not mapping to the reference at all). That level of failure is
>> acceptable, but anything higher would be a worry.
>>
>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


