[BioC] where to get chr_rpts file for dbSNP human 36.3 assembly

shirley zhang shirley0818 at gmail.com
Thu Nov 3 20:11:35 CET 2011


Thanks Sean. Your information is very helpful.

Shirley

On Thu, Nov 3, 2011 at 2:16 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
> On Thu, Nov 3, 2011 at 2:05 PM, shirley zhang <shirley0818 at gmail.com> wrote:
>> Dear Herve and Sean,
>>
>> Thanks for your reply.  May I ask for one more piece of help?
>>
>> Do you know where I can get the list of SNPs (rs# ) mapped to more
>> than 1 location on the reference genome NCBI Build 36.3?
>
> Hi, Shirley.  This level of detail might need to go to NCBI for an
> answer if you REALLY need to use NCBI annotations directly.  That
> said, UCSC does some reannotation before releasing dbSNP on their
> site.  There is a table that describes dbSNP exceptions, including
> Multiple Locations.  You can download that file here:
>
> http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/snp130Exceptions.txt.gz
>
> The table format is described here:
>
> http://genome.ucsc.edu/cgi-bin/hgTables?hgta_doSchemaDb=hg18&hgta_doSchemaTable=snp130Exceptions
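
As a rough illustration of how the downloaded exceptions file could be
filtered for SNPs placed at more than one location, here is a minimal
Python sketch. It rests on assumptions that should be checked against
the schema page above: that each row is tab-separated with the columns
chrom, chromStart, chromEnd, name, exception (no leading bin column),
and that the exception label for multiply mapped SNPs is
"MultipleAlignments". The function names multi_location_snps and
read_exceptions are hypothetical helpers, not part of any package.

```python
import gzip
from collections import defaultdict


def multi_location_snps(lines, label="MultipleAlignments"):
    """Map each rs ID flagged with `label` to its (chrom, start, end) rows.

    ASSUMPTION: rows are tab-separated as
    chrom, chromStart, chromEnd, name, exception.
    """
    hits = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            # Skip rows that do not match the assumed five-column layout.
            continue
        chrom, start, end, name, exception = fields[:5]
        if exception == label:
            hits[name].append((chrom, int(start), int(end)))
    return dict(hits)


def read_exceptions(path):
    """Read a gzipped exceptions file (e.g. snp130Exceptions.txt.gz)."""
    with gzip.open(path, "rt") as handle:
        return multi_location_snps(handle)
```

Calling read_exceptions("snp130Exceptions.txt.gz") on a local copy of
the file would return a dictionary keyed by rs ID; its keys are the
candidate list of SNPs with multiple placements, under the assumptions
stated above.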
>
> Sean
>
>
>> Thanks,
>> Shirley
>>
>> On Tue, Nov 1, 2011 at 9:20 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>>> 2011/11/1 shirley zhang <shirley0818 at gmail.com>:
>>>> Dear Herve,
>>>>
>>>> Also, I just checked that there is no liftOver function in the
>>>> rtracklayer package. Is it a different function name?  Thanks, Shirley
>>>>
>>>>> sessionInfo()
>>>> R version 2.11.1 (2010-05-31)
>>>
>>> Hi, Shirley.
>>>
>>> You'll definitely need to update your R.  R 2.14.0 was just released.
>>> With the new version of R, you'll get new versions of packages.  The
>>> most recent couple of versions of rtracklayer include liftOver().
>>>
>>> Sean
>>>
>>>
>>>> x86_64-unknown-linux-gnu
>>>>
>>>> locale:
>>>>  [1] LC_CTYPE=en_US.iso885915       LC_NUMERIC=C
>>>>  [3] LC_TIME=en_US.iso885915        LC_COLLATE=en_US.iso885915
>>>>  [5] LC_MONETARY=C                  LC_MESSAGES=en_US.iso885915
>>>>  [7] LC_PAPER=en_US.iso885915       LC_NAME=C
>>>>  [9] LC_ADDRESS=C                   LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> other attached packages:
>>>> [1] rtracklayer_1.8.1 RCurl_1.4-3       bitops_1.0-4.1
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] Biobase_2.8.0       Biostrings_2.16.9   BSgenome_1.16.5
>>>> [4] GenomicRanges_1.0.9 IRanges_1.6.15      XML_3.1-1
>>>>
>>>>
>>>> 2011/11/1 shirley zhang <shirley0818 at gmail.com>:
>>>>> Dear Herve,
>>>>>
>>>>> Thanks for your quick response.
>>>>>
>>>>> I need to get the chr position (hg18, build36.3)  for a huge list of
>>>>> SNPs with rs#. As you suggested before, I first tried the library
>>>>> "SNPlocs.Hsapiens.dbSNP.20090506", and got the chr position for 90% of
>>>>> my SNPs. For the remaining 10% of SNPs, I would like to get the chr
>>>>> position from the NCBI dbSNP website ( build 130, reference 36.3). I
>>>>> understand that I could use the batch query. However, I have to do
>>>>> this kind of mapping routinely for different sets of SNPs, so I am
>>>>> thinking of downloading the chr_rpts files for the dbSNP human 36.3
>>>>> assembly to our server and then using them to do the mapping.
>>>>>
>>>>> I don't know whether what I've tried, or what I plan to do, is the
>>>>> right approach. Could you give me any comments or suggestions?
>>>>>
>>>>> Thanks a lot!
>>>>> Shirley
>>>>>
>>>>> 2011/11/1 Hervé Pagès <hpages at fhcrc.org>:
>>>>>> Hi Shirley,
>>>>>>
>>>>>> On 11-11-01 01:51 PM, shirley zhang wrote:
>>>>>>>
>>>>>>> Dear list,
>>>>>>>
>>>>>>> For the NCBI dbSNP database, I can get the chr_rpts files for
>>>>>>> the most recent 37.3 assembly from the following FTP site,
>>>>>>>
>>>>>>> ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/chr_rpts/
>>>>>>>
>>>>>>> My question is how/where I can get these chr_rpts files based on the
>>>>>>> 36.3 assembly?
>>>>>>
>>>>>> Please don't cross post. This sounds like a question for the dbSNP
>>>>>> folks.
>>>>>>
>>>>>> FWIW, right now it doesn't seem like those files have been updated yet:
>>>>>> they are still from August 15 (i.e. dbSNP build 134, based on reference
>>>>>> genome GRCh37.p2). AFAIK the last build based on the 36.3 assembly was
>>>>>> dbSNP build 130.
>>>>>>
>>>>>> Not sure what you want to do with those files, but if you only need
>>>>>> to access the genome coordinates and alleles of your SNPs, you might
>>>>>> want to have a look at the SNPlocs.* packages.
>>>>>>
>>>>>> Alternatively, you could always use a tool like UCSC liftOver (also
>>>>>> available in Bioconductor, in the rtracklayer package) to remap things
>>>>>> between different genome assemblies.
>>>>>>
>>>>>> Cheers,
>>>>>> H.
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Shirley
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioconductor mailing list
>>>>>>> Bioconductor at r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>> Search the archives:
>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Hervé Pagès
>>>>>>
>>>>>> Program in Computational Biology
>>>>>> Division of Public Health Sciences
>>>>>> Fred Hutchinson Cancer Research Center
>>>>>> 1100 Fairview Ave. N, M1-B514
>>>>>> P.O. Box 19024
>>>>>> Seattle, WA 98109-1024
>>>>>>
>>>>>> E-mail: hpages at fhcrc.org
>>>>>> Phone:  (206) 667-5791
>>>>>> Fax:    (206) 667-1319
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Xiaoling (Shirley) Zhang
>>>>>
>>>>> M.D., Ph.D. (Bioinformatics)
>>>>> Boston University, Boston, MA
>>>>> Tel: (857) 233-9862
>>>>> Email: zhangxl at bu.edu
>>>>>
>>>>
>>>>
>>>
>>
>>
>



-- 
Xiaoling (Shirley) Zhang

M.D., Ph.D. (Bioinformatics)
Boston University, Boston, MA
Tel: (857) 233-9862
Email: zhangxl at bu.edu
