[BioC] biomaRt: connection stopping
J.delasHeras at ed.ac.uk
Wed Sep 13 19:40:11 CEST 2006
Quoting "James W. MacDonald" <jmacdon at med.umich.edu>:
> J.delasHeras at ed.ac.uk wrote:
>> Hi,
>>
>> I suspect this is something to do purely with my connection, but I
>> thought I'd ask, just in case:
>>
>> I have a list of refseq ids (NM_xxxxx), 18028 of them.
>> I wanted to get the gene symbols for those genes, so I used biomaRt
>> on the whole list. What I got was a single column data frame longer
>> than 18028, as I get multiple results with some of these refseq ids.
>> There doesn't seem to be an easy way to regroup them together, so I
>> do the following instead:
>
> Using the RCurl interface for a big query like that isn't ideal. You
> would be better off installing RMySQL and using the MySQL interface
> (note: you can get RMySQL using biocLite(), thanks to the fine folks in
> Seattle). Also, you can have getBM() put things in a list, so any
> duplicated gene symbols will be grouped together.
>
> A <- getBM("hgnc_symbol", "refseq_dna", RS, mart = mart,
>            output = "list", mysql = TRUE)
>
> Should do the trick.
>
> HTH,
>
> Jim
Ah, so simple... :-)
Thanks a lot, Jim. I totally overlooked the different output styles.
As for the MySQL interface... you're probably right. We have *a*
bioinformatician here and he was trying to convince me not long ago
that I should take a look at the wonders of working with MySQL...
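
For anyone who ends up with a plain data frame and wants to regroup it by hand, here is a rough base-R sketch of the idea. It assumes the result carries the query id in one column next to the symbol, which was not the case for the single-column result described above. The ids, symbols and the names res, by_id, collapsed and symbols are made up for illustration; nothing here calls biomaRt.

```r
# Hypothetical stand-in for a data-frame style result: one row per
# (refseq id, symbol) pair, so an id with several hits appears several times.
res <- data.frame(
  refseq_dna  = c("NM_000001", "NM_000002", "NM_000002", "NM_000003"),
  hgnc_symbol = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  stringsAsFactors = FALSE
)

# split() groups the symbols by the id they came from, giving a named list
# with one element per distinct id.
by_id <- split(res$hgnc_symbol, res$refseq_dna)

# Collapse each group into a single string so there is one entry per id.
collapsed <- vapply(by_id, paste, character(1), collapse = ";")

# Re-order to match the original query vector (made-up ids here);
# ids that returned nothing come back as NA.
RS <- c("NM_000003", "NM_000001", "NM_000002", "NM_999999")
symbols <- unname(collapsed[RS])
```

The result has the same length and order as RS, so it can be bound back onto the original list of ids directly.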
Jose
--
Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK