[BioC] too many biomaRt connections

Elizabeth Purdom epurdom at stat.Berkeley.EDU
Wed Apr 9 21:47:27 CEST 2008


I do generally collect into one big query (for speed if nothing else). 
Except that I'll have one set of genes and then do another analysis and 
have another set, etc. so I wind up querying several times. If it were 
just that, the number of such (overall) queries is minimal. My problem 
is probably that I'm using a package (also by Steffen) and it calls 
biomart individually -- actually a couple of times per gene -- and I run 
this on several hundred genes a day sometimes. So clearly I need to 
start being more careful about this. So I guess my question was more (in 
addition to confirmation that this is the problem): can anyone tell me what 
the limits are, or where to find them, so I can watch how many times I run 
these graph commands?
Thanks,
Elizabeth

Kasper Daniel Hansen wrote:
> Steffen will know more about this, but it is well known that when you 
> access the Mart servers you should collect all your queries into 
> one big query, so don't do something like
>   for ( g in genes)
>     getInfo(g)
> but instead do something like
>   getInfo(genes)
> 
> So try to collect everything into a few big queries, instead of doing 
> "thousands of queries" in a day.
> 
> You also have the option of downloading the entire database and accessing 
> it locally. That way there is no limit, but it requires some work.
> 
> It is not uncommon for these large databases to have some usage limits.
> 
> Kasper
> 
> On Apr 9, 2008, at 12:09 PM, Elizabeth Purdom wrote:
> 
>> Hi,
>> I am using biomaRt to get information regarding genes. I use it pretty
>> frequently and recently have gotten the error:
>>
>> Too many connections at
>> /ebi/www/biomart/www/biomart-perl-06/lib/BioMart/Configuration/DBLocation.pm line 98
>>
>> I assume that I've hit some sort of wall in terms of how often I have
>> queried the database?
>>
>> I don't really use biomart except through R; at what point do you get
>> booted off and what can I do to regain access? I often run queries on a
>> few hundred genes and don't think twice about rerunning such a query or
>> running several such queries in a day plus I use functions that call on
>> biomaRt repeatedly that I also apply to around 100 genes. So I could
>> easily send a thousand queries in a day. I can be more careful, but it
>> would be useful to know what the limits are. And does it matter how many
>> times you call a 'mart<-useMart(...)' command? (lately, I've been
>> calling it frequently rather than using the one I've already opened,
>> largely through programming laziness).
>>
>> By the way, it took me quite some time to track down the error, because
>> I was using getGene which just gave me the confusing error:
>> "Error: ncol(result) == length(attributes) is not TRUE"
>> I think this must be because something like try(...) is used within
>> getBM() and so the output is the error message which is then transferred
>> down the line and at some point causes a problem when the function tries
>> to bundle it into a data.frame, etc.
>>
>> Thanks,
>> Elizabeth
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: 
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
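
For reference, here is a minimal sketch of the batching advice in this thread, written in base R so it can be tried without touching the server. The helper name batched_lookup and the batch_size default are hypothetical and not part of biomaRt; the commented usage assumes an Ensembl gene mart and its usual attribute and filter names, which may differ for other marts.

```r
# Hypothetical helper (not part of biomaRt): run `query_fun` once per batch
# of IDs rather than once per ID, and stack the per-batch results.
batched_lookup <- function(ids, query_fun, batch_size = 500) {
  ids <- unique(as.character(ids))      # never ask for the same ID twice
  if (length(ids) == 0) return(NULL)
  # Split the IDs into consecutive groups of at most `batch_size`.
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  results <- lapply(batches, query_fun) # one call per batch
  do.call(rbind, unname(results))
}

# Intended use with biomaRt (needs network access; the dataset, attribute,
# and filter names below are assumptions for an Ensembl gene mart):
#
#   library(biomaRt)
#   mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")  # open once, reuse
#   info <- batched_lookup(genes, function(ids)
#     getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"),
#           filters = "ensembl_gene_id", values = ids, mart = mart))
```

With a few hundred genes this comes to a single call to getBM(), and the mart object is created once and reused instead of calling useMart() before every query. It does not help when a package issues its own per-gene queries internally; in that case the only options are spacing the calls out or using a local copy of the database.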
