[BioC] New SNPlocs data package for Human (dbSNP BUILD 130)

Hervé Pagès hpages at fhcrc.org
Sat Jun 6 00:16:21 CEST 2009


Hi SNPlocs users,

I've added SNPlocs.Hsapiens.dbSNP.20090506 to the BioC repo (in BioC release
only, source tarball only, but that's just for now). It contains the SNP
locations and alleles for Homo sapiens extracted from dbSNP BUILD 130 (the
latest dbSNP build).

From within R-2.9:

   > library(BSgenome)
   > available.SNPs()
   [1] "SNPlocs.Hsapiens.dbSNP.20071016" "SNPlocs.Hsapiens.dbSNP.20080617"
   [3] "SNPlocs.Hsapiens.dbSNP.20090506"

Install with:

   source("http://bioconductor.org/biocLite.R")
   biocLite("SNPlocs.Hsapiens.dbSNP.20090506")

Then:

   > library(SNPlocs.Hsapiens.dbSNP.20090506)
   > ?SNPlocs.Hsapiens.dbSNP.20090506  # now there is a man page!
   > getSNPcount()
     chr1   chr2   chr3   chr4   chr5   chr6   chr7   chr8   chr9  chr10  chr11
   920233 933616 789121 798603 706109 760249 655873 612367 496064 583240 577300
    chr12  chr13  chr14  chr15  chr16  chr17  chr18  chr19  chr20  chr21  chr22
   558759 427010 365742 331501 354239 316396 322866 268235 323041 160580 187392
     chrX   chrY
   391414   6539

Overall, that's 10% more SNPs than in the previous build (BUILD 129).
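
For anyone who wants the genome-wide total, the per-chromosome counts can
simply be summed. The sketch below copies the numbers printed above into a
plain named vector (called snp_count here, a name chosen only for this
example) so that it runs without the package installed. With the package
loaded, sum(getSNPcount()) should give the same figure, assuming
getSNPcount() returns a named integer vector as its printed output suggests.

```r
# Per-chromosome SNP counts, copied from the getSNPcount() output above.
snp_count <- c(
  chr1 = 920233, chr2 = 933616, chr3 = 789121, chr4 = 798603,
  chr5 = 706109, chr6 = 760249, chr7 = 655873, chr8 = 612367,
  chr9 = 496064, chr10 = 583240, chr11 = 577300, chr12 = 558759,
  chr13 = 427010, chr14 = 365742, chr15 = 331501, chr16 = 354239,
  chr17 = 316396, chr18 = 322866, chr19 = 268235, chr20 = 323041,
  chr21 = 160580, chr22 = 187392, chrX = 391414, chrY = 6539
)

# Total number of SNP locations across all chromosomes.
total <- sum(snp_count)

# Chromosomes with the most and the fewest SNPs.
most  <- names(which.max(snp_count))
least <- names(which.min(snp_count))

print(total)
print(c(most = most, least = least))
```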

Note that, as with the previous builds, there are still distinct RefSNP
IDs that are mapped to the same location:

   > chr1_snps <- getSNPlocs("chr1")
   > sum(duplicated(chr1_snps$loc))
   [1] 950

That's twice as many as with BUILD 129!

   > which(duplicated(chr1_snps$loc))[1:10]
    [1]  3142  3365  7835  8161  8327 10638 12113 14060 14640 15538
   > chr1_snps[chr1_snps$loc == chr1_snps$loc[3142], ]
        RefSNP_id alleles_as_ambig     loc
   3141   3766175                D 1476802
   3142  59009700                W 1476802

Please let me know if you find any problem with this new package.

Cheers,
H.

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


