[BioC] New SNPlocs data package for Human (dbSNP BUILD 131)
Hervé Pagès
hpages at fhcrc.org
Thu Apr 29 21:30:46 CEST 2010
Hi Lin,
The new package is ready and available thru biocLite():
> library(BSgenome)
> available.SNPs()
BioC_mirror = http://www.bioconductor.org
Change using chooseBioCmirror().
[1] "SNPlocs.Hsapiens.dbSNP.20071016" "SNPlocs.Hsapiens.dbSNP.20080617"
[3] "SNPlocs.Hsapiens.dbSNP.20090506" "SNPlocs.Hsapiens.dbSNP.20100427"
Only the source package is available for now. Binaries for Windows and
Mac will follow soon, but you can still install the source package on
these OS by using the 'type="source"' option in biocLite().
dbSNP is growing fast and this new package (BUILD 131) is about 50%
bigger than the previous one (BUILD 130):
                | SNPlocs.Hsapiens.dbSNP.20090506 | SNPlocs.Hsapiens.dbSNP.20100427
dbSNP build     | BUILD 130                       | BUILD 131
Ref. genome     | NCBI Build 36.1                 | GRCh37
Compatible with | UCSC hg18                       | UCSC hg19 (except for chrM)
Size            | 81MB                            | 123MB
Nb of SNPs      | 11846489                        | 17434679
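To make the "about 50% bigger" figure concrete, here is a small base-R sketch that computes the relative growth from the numbers quoted above. The variable and function names are illustrative only.

```r
# Figures quoted in the comparison above.
size_mb <- c(build130 = 81, build131 = 123)
nb_snps <- c(build130 = 11846489, build131 = 17434679)

# Relative growth from BUILD 130 to BUILD 131, in percent.
growth_pct <- function(x) unname(100 * (x["build131"] / x["build130"] - 1))

size_growth <- growth_pct(size_mb)
snp_growth  <- growth_pct(nb_snps)
cat(sprintf("size: +%.1f%%, SNPs: +%.1f%%\n", size_growth, snp_growth))
```

The package size grows by roughly 52% and the SNP count by roughly 47%.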
See the full description here
http://bioconductor.org/packages/2.6/data/annotation/html/SNPlocs.Hsapiens.dbSNP.20100427.html
for why SNPs on the mitochondrion chromosome are not compatible with
hg19 chrM.
I tried to improve the man page of this new package over the previous
SNPlocs pkgs. In particular, it now briefly explains how SNPs from dbSNP
were filtered (it is impossible to keep everything, since dbSNP contains
much more than just SNPs), and the examples section has more examples.
Also a new feature is that you can now extract SNPs from 1 or more
chromosomes at the same time and get the result in a GRanges object.
Just use as.GRanges=TRUE when calling getSNPlocs().
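A minimal sketch of the as.GRanges=TRUE usage, assuming the new package is installed. Taking the sequence names from names(getSNPcount()) is an assumption made here to avoid hard-coding them; the package man page is the reference for the exact names and arguments.

```r
pkg <- "SNPlocs.Hsapiens.dbSNP.20100427"

# Only run when the package is installed, so the sketch is safe to paste.
if (suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))) {
  # Assumption: getSNPcount() returns a vector named by sequence name.
  seqs <- tail(names(getSNPcount()), 2)

  # One GRanges object holding the SNPs of both sequences.
  snps <- getSNPlocs(seqs, as.GRanges = TRUE)
  print(snps)
}
```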
Please make sure you update BSgenome to >= 1.16.1 (available in
about 12 hours thru biocLite) before you try to inject those SNPs
in BSgenome.Hsapiens.UCSC.hg19.
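The version check and the injection step could look roughly like the sketch below. The helper bsgenome_is_recent() is hypothetical and only wraps compareVersion(). The injectSNPs() call assumes both the hg19 genome package and the new SNPlocs package are installed.

```r
# Hypothetical helper: TRUE if 'installed' is at least 'required'.
bsgenome_is_recent <- function(installed, required = "1.16.1") {
  compareVersion(installed, required) >= 0
}

genome_pkg <- "BSgenome.Hsapiens.UCSC.hg19"
snp_pkg    <- "SNPlocs.Hsapiens.dbSNP.20100427"

# Only run when both data packages are installed.
if (suppressWarnings(require(genome_pkg, character.only = TRUE, quietly = TRUE)) &&
    suppressWarnings(require(snp_pkg, character.only = TRUE, quietly = TRUE))) {
  # Refuse to continue with a BSgenome older than 1.16.1.
  stopifnot(bsgenome_is_recent(packageDescription("BSgenome")$Version))

  # Genome object with the SNPs from the new package injected.
  SnpHsapiens <- injectSNPs(Hsapiens, snp_pkg)
  print(SnpHsapiens)
}
```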
Let me know if you have any questions. Feedback from SNPlocs users
will be greatly appreciated.
Cheers,
H.
Lin Tang wrote:
> Hi, Dear Hervé
>
> The new dbSNP build is available and the human genome sequence is now hg19. Would you please update the R package for the SNP locations?
>
> Thank you very much!
>
> Regards,
> Lin
>
> Lin Tang, Ph.D.
> Scientist II, Informatics | Sequenom Inc.
> T: 1 858 202 9106 | F: 1 858 202 9084
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319