[BioC] biomaRt: connection stopping
James W. MacDonald
jmacdon at med.umich.edu
Wed Sep 13 17:44:00 CEST 2006
J.delasHeras at ed.ac.uk wrote:
> Hi,
>
> I suspect this is something to do purely with my connection, but I
> thought I'd ask, just in case:
>
> I have a list of refseq ids (NM_xxxxx), 18028 of them.
> I wanted to get the gene symbols for those genes, so I used biomaRt on
> the whole list. What I got was a single column data frame longer than
> 18028, as I get multiple results with some of these refseq ids. There
> doesn't seem to be an easy way to regroup them together, so I do the
> following instead:
Using the RCurl interface for a big query like that isn't ideal. You
would be better off installing RMySQL and using the MySQL interface
(note: you can get RMySQL using biocLite(), thanks to the fine folks in
Seattle). Also, you can have getBM() return a list, so multiple gene
symbols for the same refseq id will be grouped together.
A <- getBM("hgnc_symbol", "refseq_dna", RS, mart = mart, output =
"list", mysql = TRUE)
Should do the trick.
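If the list output isn't available in your version, the same regrouping can be done by hand on a data frame result. A minimal sketch, assuming the query was run with both refseq_dna and hgnc_symbol as attributes, so that each row pairs an id with a symbol. The data frame `res` and the ids below are made-up stand-in values, not real biomaRt output:

```r
# Stand-in for a getBM() result with two attribute columns (made-up values).
res <- data.frame(
  refseq_dna  = c("NM_000001", "NM_000001", "NM_000002", "NM_000003"),
  hgnc_symbol = c("GENEA", "GENEB", "GENEC", "GENED"),
  stringsAsFactors = FALSE
)

# The ids that were queried; here NM_000004 returned nothing.
RS <- c("NM_000001", "NM_000002", "NM_000003", "NM_000004")

# Group symbols by id. Using a factor with levels = RS keeps one list
# element per queried id, in query order, even for ids with no hit.
grouped <- split(res$hgnc_symbol, factor(res$refseq_dna, levels = RS))

# Number of symbols found per id; ids with no hit have length 0.
lengths(grouped)
```

The result has the same length and order as RS, so it lines up with the original vector of ids.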
HTH,
Jim
>
> #create an empty list of the right length
> A<-vector(mode="list", length=18028)
> #now loop filling elements of the list from the biomaRt queries
> for (i in 1:18028){
> K<-i
> A[[i]]<-getBM(attributes=c("hgnc_symbol"),mart=mart,filters="refseq_dna",values=c(RS[i]))
> }
> print(K)
>
> RS is a vector containing the 18028 refseq ids.
> the K value is only so that I know where it breaks... because that's
> what happens... after a while, it breaks with an error message:
>
> Error in postForm(paste(mart@host, "?", sep = ""), query = xmlQuery) :
> couldn't connect to host
>
> This doesn't happen if I send the whole query in ONE go, in a vector...
> but if I do it element by element it breaks after 3-4000 queries.
> Any ideas to do this in a simpler/better way? Or at least one that
> doesn't have me coming back to re-start the loop at the position of the
> last break?
>
> thanks!
>
> Jose
>
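If you do end up looping, sending the ids in chunks and retrying a failed chunk should save you from restarting by hand. A rough sketch, not tested against the BioMart server: `query_fun` is a hypothetical placeholder for whatever call you make per chunk (for example a wrapper around getBM()), and the chunk size, number of tries, and wait time are arbitrary defaults.

```r
# Run query_fun over ids in chunks, retrying a chunk when it throws an error.
# query_fun is a hypothetical placeholder: it takes a character vector of ids
# and returns the result for that chunk.
query_in_chunks <- function(ids, query_fun, chunk_size = 500,
                            max_tries = 3, wait = 5) {
  # Split ids into consecutive chunks of at most chunk_size elements.
  chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))
  lapply(chunks, function(chunk) {
    for (attempt in seq_len(max_tries)) {
      # Capture an error as a value instead of aborting the whole run.
      result <- tryCatch(query_fun(chunk), error = function(e) e)
      if (!inherits(result, "error")) return(result)
      if (attempt < max_tries) Sys.sleep(wait)  # pause before retrying
    }
    stop("chunk failed after ", max_tries, " tries: ",
         conditionMessage(result))
  })
}

# Possible use with the call from the loop above (assumes mart and RS exist):
# A <- query_in_chunks(RS, function(x)
#   getBM(attributes = c("hgnc_symbol"), mart = mart,
#         filters = "refseq_dna", values = x))
```

The result is a list with one element per chunk, so a transient "couldn't connect to host" costs one retry of that chunk rather than the whole run.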
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623