[BioC] Query ERROR: caught BioMart::Exception

Fri May 14 15:20:14 CEST 2010

Hi Maura,

mauede at alice.it wrote:
> Last week I extracted the 3'UTR sequences for a list of genes identified through their Entrez_num.
> The biologist we are working with is not happy yet ... and asked me to provide the following:
> 
> Regarding the 3'UTR database I need for now the following:
>  
> 1) UTR sequence
> 2) name of the gene
> 3) where the 3'UTR is located in the gene sequence. In other words, the coordinates where the 3'UTR starts and ends.
> Thanks
> 
> I thought it would be an easy task but I ran into the following error:
>> genes_map <- getBM(attributes=c("entrezgene","hgnc_symbol","ensembl_gene_id","ensembl_transcript_id","3_utr_start","3_utr_end"),
> + filters = "entrezgene", values=genes.ds[1:3,1], mart=hmart)
> Error in getBM(attributes = c("entrezgene", "hgnc_symbol", "ensembl_gene_id",  : 
>   Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages are not allowed

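This has to do with the way the BioMart web server/database is set up. 
BioMart groups attributes into pages, and a single query can only use 
attributes from one page. Here the gene annotation attributes and the 
3' UTR coordinates are on different pages, so you can't get them in one 
query. But you can run the queries sequentially and merge at the end.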
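
To see in advance which attributes can share a query, one option is to group the requested names by page. The sketch below assumes a table shaped like the one recent biomaRt versions return from listAttributes(mart), with `name` and `page` columns. The helper `pages_for` is hypothetical, and the page labels in the stand-in table are illustrative assumptions rather than values copied from the Ensembl server.

```r
# Sketch: group requested attributes by the BioMart page they belong to.
# `attr_table` is assumed to look like the result of
# biomaRt::listAttributes(mart), i.e. to have `name` and `page` columns.
pages_for <- function(attr_table, wanted) {
  hit <- attr_table[attr_table$name %in% wanted, c("name", "page")]
  split(hit$name, hit$page)
}

# Hand-made stand-in for listAttributes(mart); the page labels are
# assumptions made for illustration only.
attr_table <- data.frame(
  name = c("entrezgene", "hgnc_symbol", "ensembl_gene_id",
           "ensembl_transcript_id", "ensembl_transcript_id",
           "3_utr_start", "3_utr_end"),
  page = c("feature_page", "feature_page", "feature_page",
           "feature_page", "structure", "structure", "structure"),
  stringsAsFactors = FALSE
)

wanted <- c("entrezgene", "hgnc_symbol", "ensembl_gene_id",
            "ensembl_transcript_id", "3_utr_start", "3_utr_end")

# Attributes listed under the same page can go into one getBM() call.
by_page <- pages_for(attr_table, wanted)
print(by_page)
```

With a live connection, passing listAttributes(mart) instead of the stand-in table should give the real grouping. An attribute that appears under more than one page (here the transcript ID) is a natural key for merging the separate results.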

 > library(biomaRt)
 > mart <- useMart("ensembl","hsapiens_gene_ensembl")
Checking attributes ... ok
Checking filters ... ok
 > suppressMessages(library(org.Hs.eg.db))
 > egs <- head(Lkeys(org.Hs.egSYMBOL)) ## just getting some IDs
 > egs
[1] "1"         "10"        "100"       "1000"      "10000"     "100008586"
 > a <- getBM(c("entrezgene","hgnc_symbol","ensembl_gene_id", 
"ensembl_transcript_id"), "entrezgene", egs, mart)
 > head(a)
   entrezgene hgnc_symbol ensembl_gene_id ensembl_transcript_id
1          1        A1BG ENSG00000121410       ENST00000263100
2         10        NAT2 ENSG00000156006       ENST00000286479
3        100         ADA ENSG00000196839       ENST00000372874
4       1000        CDH2 ENSG00000170558       ENST00000269141
5      10000        AKT3 ENSG00000117020       ENST00000366539
6      10000        AKT3 ENSG00000117020       ENST00000366540
 > b <- getBM(c("ensembl_transcript_id", "3_utr_start","3_utr_end"), 
"ensembl_transcript_id", as.character(a[,4]), mart)
 > head(b)
   ensembl_transcript_id 3_utr_start 3_utr_end
1       ENST00000263100          NA        NA
2       ENST00000263100    58858172  58858387
3       ENST00000263826          NA        NA
4       ENST00000263826   243666484 243668550
5       ENST00000269141          NA        NA
6       ENST00000269141    25530930  25532116
 > d <- merge(a,b)
 > head(d)
   ensembl_transcript_id entrezgene hgnc_symbol ensembl_gene_id 3_utr_start
1       ENST00000263100          1        A1BG ENSG00000121410          NA
2       ENST00000263100          1        A1BG ENSG00000121410    58858172
3       ENST00000263826      10000        AKT3 ENSG00000117020          NA
4       ENST00000263826      10000        AKT3 ENSG00000117020   243666484
5       ENST00000269141       1000        CDH2 ENSG00000170558    25530930
6       ENST00000269141       1000        CDH2 ENSG00000170558          NA
   3_utr_end
1        NA
2  58858387
3        NA
4 243668550
5  25532116
6        NA
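
The merged table still carries rows where both UTR coordinates are NA. These probably come from the transcript structure query returning several rows per transcript, only some of which overlap the 3' UTR, but that is an assumption. A minimal sketch of dropping those rows follows, using small data frames rebuilt by hand from the output above so it runs without a BioMart connection:

```r
# Rebuild small versions of `a` and `b` from the values shown above.
a <- data.frame(
  entrezgene = c(1, 1000, 10000),
  hgnc_symbol = c("A1BG", "CDH2", "AKT3"),
  ensembl_gene_id = c("ENSG00000121410", "ENSG00000170558",
                      "ENSG00000117020"),
  ensembl_transcript_id = c("ENST00000263100", "ENST00000269141",
                            "ENST00000263826"),
  stringsAsFactors = FALSE
)

# check.names = FALSE keeps column names that start with a digit.
b <- data.frame(
  ensembl_transcript_id = rep(c("ENST00000263100", "ENST00000263826",
                                "ENST00000269141"), each = 2),
  `3_utr_start` = c(NA, 58858172, NA, 243666484, NA, 25530930),
  `3_utr_end` = c(NA, 58858387, NA, 243668550, NA, 25532116),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Merge on the shared transcript ID, then keep only rows that have
# both a UTR start and a UTR end.
d <- merge(a, b, by = "ensembl_transcript_id")
utr <- d[!is.na(d[["3_utr_start"]]) & !is.na(d[["3_utr_end"]]), ]
rownames(utr) <- NULL
print(utr)
```

A transcript can keep more than one row after this filter if its 3' UTR is split across several segments, so the result is not guaranteed to be one row per transcript.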

Best,

Jim

> 
> I cannot see what I am doing wrong.
> Your help is very welcome.
> 
> Thank you in advance.
> Maura

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826