[BioC] biomaRt:getBM error when query is large
Shi, Tao
shidaxia at yahoo.com
Sat Aug 2 00:20:23 CEST 2008
Thanks, Steffen.
That was exactly what I did. I was doing 10000 at a time, just to be safe.
...Tao
----- Original Message ----
From: "steffen at stat.Berkeley.EDU" <steffen at stat.Berkeley.EDU>
To: "Shi, Tao" <shidaxia at yahoo.com>
Cc: bioconductor at stat.math.ethz.ch
Sent: Friday, August 1, 2008 3:09:40 PM
Subject: Re: [BioC] biomaRt:getBM error when query is large
Hi Tao,
I haven't hit a limit yet but you might have. 430,000 IDs is quite large.
Try splitting your query into a few batches of e.g. 100,000 or 50,000 IDs
(you should not need to go below this length).
I would also put
Sys.sleep(1)
between queries so you won't get into trouble by sending a subsequent
query to the server too fast after an earlier one.
I bet:
# Batch ranges must not overlap, otherwise boundary IDs are queried twice.
tmp1 <- getBM(c("ensembl_gene_stable_id", "refsnp_id",
"allele", "chr_name", "chrom_start", "chrom_strand"), filters = "refsnp",
values = rs[1:100000], mart = mart)
Sys.sleep(1)
tmp2 <- getBM(c("ensembl_gene_stable_id", "refsnp_id",
"allele", "chr_name", "chrom_start", "chrom_strand"), filters = "refsnp",
values = rs[100001:200000], mart = mart)
Sys.sleep(1)
tmp3 <- getBM(c("ensembl_gene_stable_id", "refsnp_id",
"allele", "chr_name", "chrom_start", "chrom_strand"), filters = "refsnp",
values = rs[200001:300000], mart = mart)
Sys.sleep(1)
# The last batch runs to the end of rs (~430,000 IDs).
tmp4 <- getBM(c("ensembl_gene_stable_id", "refsnp_id",
"allele", "chr_name", "chrom_start", "chrom_strand"), filters = "refsnp",
values = rs[300001:length(rs)], mart = mart)
all <- rbind(tmp1, tmp2, tmp3, tmp4)
Should do it.
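The same batching could also be written as a loop, so the batch size is easy to change. This is only a rough sketch: `query_in_batches` is a hypothetical helper, not part of biomaRt, and the default batch size and pause are simply the values suggested above. It takes the query as a function argument, so it can be tried without contacting the server.

```r
# Hypothetical helper (not a biomaRt function): call `query_fun` on
# consecutive, non-overlapping slices of `values` and row-bind the results.
query_in_batches <- function(values, query_fun, batch_size = 50000, pause = 1) {
  if (length(values) == 0) return(NULL)
  starts <- seq(1, length(values), by = batch_size)
  results <- vector("list", length(starts))
  for (i in seq_along(starts)) {
    last <- min(starts[i] + batch_size - 1, length(values))
    results[[i]] <- query_fun(values[starts[i]:last])
    # Pause between requests, but not after the final one.
    if (i < length(starts)) Sys.sleep(pause)
  }
  do.call(rbind, results)
}

# Possible use with biomaRt, assuming `rs` and `mart` exist as in this thread:
# snps <- query_in_batches(rs, function(ids)
#   getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
#           "chr_name", "chrom_start", "chrom_strand"),
#         filters = "refsnp", values = ids, mart = mart))
```

Passing the query as a function keeps the splitting logic separate from the getBM call, so a smaller `batch_size` such as 10000 can be tried if the server still returns an empty reply.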
Cheers,
Steffen
> Hi list,
>
> See the sample codes below, where "rs" is a char vector containing ~430000
> rs IDs. However, when I ran the query 10000 at a time, it worked. Is
> there a query limit for biomaRt?
>
> Thanks,
>
> ...Tao
>
>
>
>> tmp <- getBM(c("ensembl_gene_stable_id", "refsnp_id", "allele",
>> "chr_name", "chrom_start", "chrom_strand"),
> + filters = "refsnp", values = rs, mart = mart)
> Error in postForm(paste(martHost(mart), "?", sep = ""), query = xmlQuery)
> :
> Empty reply from server
>
>> sessionInfo()
> R version 2.7.0 (2008-04-22)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> base
>
> other attached packages:
> [1] biomaRt_1.14.0 RCurl_0.9-3 GO.db_2.2.0
> AnnotationDbi_1.2.2 RSQLite_0.6-9 DBI_0.2-4 Biobase_2.0.1
>
> loaded via a namespace (and not attached):
> [1] XML_1.95-2