[BioC] Retrieving SNP rs IDs using biomaRt getBM()

Sonia Shah ucacshs at live.ucl.ac.uk
Thu Nov 22 10:15:24 CET 2012


Thanks for the replies.

I am looking for something to query thousands of locations, so I will give
SNPlocs a try.

Cheers,
Sonia


On 21/11/2012 21:04, Hervé Pagès wrote:
> Hi Sonia,
>
> If you have human SNPs, an alternative is to use a SNPlocs package:
>
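>   library(SNPlocs.Hsapiens.dbSNP.20120608)
>   # Load all chromosome 19 SNPs as a GRanges object
>   ch19_snps <- getSNPlocs("ch19", as.GRanges=TRUE)
>   mypos <- c(45412079, 45415640)
>   # match() gives the first SNP starting at each position, NA if none
>   idx <- match(mypos, start(ch19_snps))
>   rsids <- mcols(ch19_snps)$RefSNP_id[idx]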
>
> This would scale well if you had a lot of positions (e.g. hundreds of
> thousands) but you need to work 1 chromosome at a time.
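>
> As a rough sketch of the one-chromosome-at-a-time approach (not tested
> against the package here): rsids_by_chr() and load_snps() are made-up
> names for illustration, and load_snps() is only assumed to return a
> data frame with columns loc and RefSNP_id.
>
>   # Look up one rs ID per (chr, pos) pair, loading each chromosome once.
>   rsids_by_chr <- function(chr, pos, load_snps) {
>     rsids <- rep(NA_character_, length(pos))
>     for (ch in unique(chr)) {
>       snps <- load_snps(ch)        # SNPs for this chromosome only
>       i <- which(chr == ch)
>       # match() gives NA where no SNP starts at the position
>       rsids[i] <- snps$RefSNP_id[match(pos[i], snps$loc)]
>     }
>     rsids
>   }
>
>   # Hypothetical adapter around the getSNPlocs() call shown above
>   load_snps <- function(ch) {
>     gr <- getSNPlocs(ch, as.GRanges=TRUE)
>     data.frame(loc=start(gr), RefSNP_id=mcols(gr)$RefSNP_id,
>                stringsAsFactors=FALSE)
>   }
>
> With chr <- c("ch19", "ch19") and the two positions above, the call
> would be rsids_by_chr(chr, mypos, load_snps); positions with no SNP
> stay NA in the result.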
>
> Note that the rs IDs are stored without the "rs" prefix in the GRanges
> object returned by getSNPlocs():
>
>   > rsids
>   [1] "7412"   "445925"
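>
> If the usual "rs"-prefixed form is needed, a small helper along these
> lines should do (a sketch; add_rs_prefix() is a made-up name, and NA
> is kept for positions where match() found no SNP):
>
>   add_rs_prefix <- function(ids) {
>     # paste() would turn NA into the string "rsNA", so guard it
>     ifelse(is.na(ids), NA_character_, paste("rs", ids, sep=""))
>   }
>
>   rsids <- add_rs_prefix(c("7412", "445925"))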
>
> Cheers,
> H.
>
>
> On 11/21/2012 10:14 AM, Sonia Shah [guest] wrote:
>>
>> I have a list of chromosomal positions for which I would like to 
>> retrieve SNP rs IDs (if present at these locations). I used the 
>> following code to try and get the rs IDs at 2 locations.
>>
>> getBM(
>>   attributes=c("refsnp_id","chr_name","chrom_start"),
>>   filters=c("chr_name","chrom_start","chrom_end"),
>>   values=list(c(19,19), c(45412079,45415640), c(45412079,45415640)),
>>   mart=mart)
>>
>> I get back the rs IDs for these 2 locations but also get a list of
>> SNPs that lie between these 2 positions (a total of 82 SNPs are
>> returned with this query).
>>
>> How do I query the database to return only the rs IDs at the 2
>> specified chromosomal positions?
>>
>> Many thanks
>> Sonia
>>
>>   -- output of sessionInfo():
>>
>> R version 2.11.1 (2010-05-31)
>> x86_64-redhat-linux-gnu
>>
>> locale:
>>   [1] LC_CTYPE=en_US.iso885915       LC_NUMERIC=C
>>   [3] LC_TIME=en_US.iso885915        LC_COLLATE=en_US.iso885915
>>   [5] LC_MONETARY=C                  LC_MESSAGES=en_US.iso885915
>>   [7] LC_PAPER=en_US.iso885915       LC_NAME=C
>>   [9] LC_ADDRESS=C                   LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods base
>>
>> other attached packages:
>> [1] biomaRt_2.4.0
>>
>> loaded via a namespace (and not attached):
>> [1] RCurl_1.91-1 tools_2.11.1 XML_3.9-4
>>
>> -- 
>> Sent via the guest posting facility at bioconductor.org.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: 
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>

-- 
Sonia Shah
UCL Genetics Institute
Room 212, Darwin Building
Gower Street
WC1E 6BT

external: +44 (0) 20 7679 2212
           +44 (0) 20 7679 4392

internal: 32212/34392


