[BioC] biomaRt: connection stopping

J.delasHeras at ed.ac.uk J.delasHeras at ed.ac.uk
Wed Sep 13 19:40:11 CEST 2006


Quoting "James W. MacDonald" <jmacdon at med.umich.edu>:

> J.delasHeras at ed.ac.uk wrote:
>> Hi,
>>
>> I suspect this is something to do purely with my connection, but I 
>> thought I'd ask, just in case:
>>
>> I have a list of refseq ids (NM_xxxxx), 18028 of them.
>> I wanted to get the gene symbols for those genes, so I used biomaRt 
>> on the whole list. What I got was a single-column data frame longer 
>> than 18028 rows, as some of these refseq ids return multiple results. 
>> There doesn't seem to be an easy way to regroup them together, so I 
>> do the following instead:
>
> Using the RCurl interface for a big query like that isn't ideal. You
> would be better off installing RMySQL and using the MySQL interface
> (note: you can get RMySQL using biocLite(), thanks to the fine folks in
> Seattle). Also, you can have getBM() put things in a list, so any
> duplicated gene symbols will be grouped together.
>
> A <- getBM("hgnc_symbol", "refseq_dna", RS, mart = mart, output =
> "list", mysql = TRUE)
>
> Should do the trick.
>
> HTH,
>
> Jim

Ah, so simple... :-)

Thanks a lot, Jim. I totally overlooked the different output styles.
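
For the archive, here is a rough base R sketch of the regrouping I was after, in case it helps anyone who ends up with a long data frame anyway. It assumes the result has two columns, the refseq id and the symbol, so that each row can be traced back to its query id (with only the symbol column there is nothing to group on). The ids and symbols are placeholders I made up, and the names res, by_id and collapsed are mine, not anything from biomaRt.

```r
# Placeholder long-format result: one row per (refseq id, symbol) pair.
# These ids and symbols are invented for illustration only.
res <- data.frame(
  refseq_dna  = c("NM_id1", "NM_id2", "NM_id2", "NM_id3"),
  hgnc_symbol = c("SYM1", "SYM2A", "SYM2B", "SYM3"),
  stringsAsFactors = FALSE
)

# Group the symbols by the id they were returned for.
# The result is a named list with one element per distinct id.
by_id <- split(res$hgnc_symbol, res$refseq_dna)

# Optionally collapse each group to a single string,
# giving exactly one entry per id.
collapsed <- vapply(by_id, paste, character(1), collapse = ";")

print(by_id)
print(collapsed)
```

The named list is roughly the shape I wanted in the first place: one entry per id, with all of its symbols kept together.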

As for the MySQL interface... you're probably right. We have *a* 
bioinformatician here and he was trying to convince me not long ago 
that I should take a look at the wonders of working with MySQL...

Jose

-- 
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK


