[BioC] biomaRt: connection stopping
J.delasHeras at ed.ac.uk
Wed Sep 13 17:20:50 CEST 2006
Hi,
I suspect this is something to do purely with my connection, but I
thought I'd ask, just in case:
I have a list of refseq ids (NM_xxxxx), 18028 of them.
I wanted to get the gene symbols for those genes, so I used biomaRt on
the whole list. What I got was a single column data frame longer than
18028, as I get multiple results with some of these refseq ids. There
doesn't seem to be an easy way to regroup them together, so I do the
following instead:
# create an empty list of the right length
A <- vector(mode = "list", length = 18028)
# now loop, filling elements of the list from the biomaRt queries
for (i in 1:18028) {
  K <- i  # remember the last index attempted
  A[[i]] <- getBM(attributes = c("hgnc_symbol"), mart = mart,
                  filters = "refseq_dna", values = c(RS[i]))
}
print(K)
RS is a vector containing the 18028 refseq ids.
The K value is only there so that I know where it breaks, because that's
what happens: after a while, the loop stops with an error message:
Error in postForm(paste(mart@host, "?", sep = ""), query = xmlQuery) :
couldn't connect to host
This doesn't happen if I send the whole query in ONE go, as a vector,
but if I do it element by element it breaks after 3000-4000 queries.
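
Since the single batch query does go through, one possible way to regroup its rows is to request the RefSeq id as an attribute next to the symbol and then split on it. What follows is only a sketch under assumptions: it assumes "refseq_dna" can also be requested as an attribute in the mart being used (not checked here), and it uses a small made-up data frame, res, in place of a real getBM() result so that it runs without a connection. The ids and symbols in it are placeholders, not real annotations.

```r
# Hypothetical stand-in for the data frame a batch query might return, e.g.
#   res <- getBM(attributes = c("refseq_dna", "hgnc_symbol"),
#                filters = "refseq_dna", values = RS, mart = mart)
# (assumes "refseq_dna" is available as an attribute; ids and symbols are made up)
RS <- c("NM_000001", "NM_000002", "NM_000003")
res <- data.frame(
  refseq_dna  = c("NM_000001", "NM_000001", "NM_000003"),
  hgnc_symbol = c("SYM_A", "SYM_B", "SYM_C"),
  stringsAsFactors = FALSE
)

# Group symbols by RefSeq id. Using factor(..., levels = RS) keeps the
# input order and gives ids with no hit an empty character vector.
A <- split(res$hgnc_symbol, factor(res$refseq_dna, levels = RS))
```

The result is a list with one element per entry of RS, so A[["NM_000001"]] holds both symbols returned for that id, and ids that returned nothing are still present as empty elements.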
Any ideas on how to do this in a simpler/better way? Or at least in a way
that doesn't have me coming back to restart the loop at the position of
the last break?
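
One way to avoid restarting by hand, sketched here under assumptions, is to send the ids in chunks and wrap each request in a retry helper built on tryCatch(). The names with_retry and flaky_query are hypothetical and not part of biomaRt; flaky_query only simulates an intermittent "couldn't connect to host" failure so the example runs offline. In a real loop the wrapped call would be a getBM() request for one chunk of RS.

```r
# Call f(); on error wait `wait` seconds and try again, up to `max_tries` times
with_retry <- function(f, max_tries = 5, wait = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    message("attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait)
  }
  stop("giving up after ", max_tries, " attempts")
}

# Hypothetical stand-in for a query that fails now and then
# (a real loop might call getBM(...) on one chunk of RS here instead)
calls <- 0
flaky_query <- function(ids) {
  calls <<- calls + 1
  if (calls %% 3 == 1) stop("couldn't connect to host")
  toupper(ids)
}

ids <- c("a", "b", "c", "d", "e", "f", "g")
chunks <- split(ids, ceiling(seq_along(ids) / 3))  # chunks of up to 3 ids
out <- lapply(chunks, function(ch) {
  with_retry(function() flaky_query(ch), wait = 0)
})
```

Chunking keeps the number of requests low, and a failed chunk is simply retried rather than stopping the whole run. The chunk size and number of retries are arbitrary choices here and would need tuning against the real server.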
thanks!
Jose
--
Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK