[BioC] Retrieving SNP rs IDs using biomaRt getBM()
Hervé Pagès
hpages at fhcrc.org
Wed Nov 21 22:04:30 CET 2012
Hi Sonia,
If you have human SNPs, an alternative is to use a SNPlocs package:
library(SNPlocs.Hsapiens.dbSNP.20120608)
ch19_snps <- getSNPlocs("ch19", as.GRanges=TRUE)
mypos <- c(45412079, 45415640)
idx <- match(mypos, start(ch19_snps))
rsids <- mcols(ch19_snps)$RefSNP_id[idx]
This scales well to a large number of positions (e.g. hundreds of
thousands), but you need to work one chromosome at a time. Note that
match() returns NA for any position where no SNP is found.
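If your positions are spread over several chromosomes, the lookup could be wrapped in a loop along the lines below. This is only a sketch: lookup_rsids(), snp_table and query are hypothetical names, and snp_table holds made-up coordinates and IDs standing in for what getSNPlocs() returns, so that the example runs with base R alone. Keep in mind that match() reports only the first hit when several SNPs start at the same position.

```r
# Toy stand-in for the SNP locations (hypothetical data, NOT real dbSNP records).
snp_table <- data.frame(
  chrom = c("ch1", "ch1", "ch2", "ch2"),
  pos   = c(100L, 250L, 100L, 300L),
  id    = c("11", "22", "33", "44"),
  stringsAsFactors = FALSE
)

# Query positions, possibly on several chromosomes.
query <- data.frame(
  chrom = c("ch2", "ch1", "ch1"),
  pos   = c(300L, 250L, 999L),
  stringsAsFactors = FALSE
)

# Look up IDs one chromosome at a time, keeping the order of the query.
lookup_rsids <- function(query, snp_table) {
  ids <- rep(NA_character_, nrow(query))
  for (chrom in unique(query$chrom)) {
    q_idx <- which(query$chrom == chrom)
    # With a SNPlocs package, this subset would come from getSNPlocs() instead.
    snps <- snp_table[snp_table$chrom == chrom, ]
    # match() gives NA when no SNP starts at the queried position.
    ids[q_idx] <- snps$id[match(query$pos[q_idx], snps$pos)]
  }
  ids
}

ids <- lookup_rsids(query, snp_table)
```

Loading one chromosome per iteration keeps memory use bounded by the largest chromosome, at the cost of one load per chromosome present in the query.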
Note that the rs IDs are stored without the "rs" prefix in the GRanges
object returned by getSNPlocs():
> rsids
[1] "7412" "445925"
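If you need the usual "rs" prefix, something like the following should do. It is a small sketch using base R only; rs_names is a hypothetical variable name, and the extra NA is added here just to show what happens for a position with no SNP.

```r
# IDs as returned above, plus an NA standing for a position with no SNP.
rsids <- c("7412", "445925", NA)

# ifelse() keeps NA as NA instead of producing the string "rsNA".
rs_names <- ifelse(is.na(rsids), NA_character_, paste0("rs", rsids))
```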
Cheers,
H.
On 11/21/2012 10:14 AM, Sonia Shah [guest] wrote:
>
> I have a list of chromosomal positions for which I would like to retrieve SNP rs IDs (if present at these locations). I used the following code to try and get the rs IDs at 2 locations.
>
> getBM(
>   attributes=c("refsnp_id","chr_name","chrom_start"),
>   filters=c("chr_name","chrom_start","chrom_end"),
>   values=list(c(19,19), c(45412079,45415640), c(45412079,45415640)),
>   mart)
>
> I get back the rs IDs for these 2 locations, but I also get all the SNPs that lie between these 2 positions (82 SNPs in total are returned by this query).
>
> How do I query the database to return only the rs ids at the 2 specified chromosomal positions?
>
> Many thanks
> Sonia
>
> -- output of sessionInfo():
>
> R version 2.11.1 (2010-05-31)
> x86_64-redhat-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.iso885915 LC_NUMERIC=C
> [3] LC_TIME=en_US.iso885915 LC_COLLATE=en_US.iso885915
> [5] LC_MONETARY=C LC_MESSAGES=en_US.iso885915
> [7] LC_PAPER=en_US.iso885915 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] biomaRt_2.4.0
>
> loaded via a namespace (and not attached):
> [1] RCurl_1.91-1 tools_2.11.1 XML_3.9-4
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319