[BioC] Query ERROR: caught BioMart::Exception
James W. MacDonald
jmacdon at med.umich.edu
Fri May 14 15:20:14 CEST 2010
Hi Maura,
mauede at alice.it wrote:
> Last week I extracted the 3UTR sequences for a list of genes identified through their Entrez_num.
> The biologist we are working with is not happy yet ... and asked me to provide the following:
>
> Regarding the 3'UTR database I need for now the following:
>
> 1) UTR sequence
> 2) name of the gene
> 3) where the 3'UTR is located in the gene sequence. In other words, the coordinates of where the 3'UTR starts and ends.
> Thanks
>
> I thought it would be an easy task but I ran into the following error:
>> genes_map <- getBM(attributes=c("entrezgene","hgnc_symbol","ensembl_gene_id","ensembl_transcript_id","3_utr_start","3_utr_end"),
> + filters = "entrezgene", values=genes.ds[1:3,1], mart=hmart)
> Error in getBM(attributes = c("entrezgene", "hgnc_symbol", "ensembl_gene_id", :
> Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages are not allowed
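This has to do with the way the Biomart webserver/database is set up.
As the error message says, attributes are grouped into pages, and a
single query can only draw attributes from one page. Essentially, you
can't ask for the gene-level annotation (entrezgene, hgnc_symbol) and
the transcript structure (3_utr_start, 3_utr_end) at one time. But you
can run the two queries sequentially and merge at the end.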
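One way to see in advance whether a set of attributes can go into a single query is to look up which page each attribute lives on. The sketch below is offline and illustrative only: `attr_table` is a hand-built stand-in with `name` and `page` columns, the page labels ("feature_page", "structure") and the attribute-to-page assignments are assumptions, and `common_pages()` is a hypothetical helper, not part of biomaRt. Recent biomaRt versions appear to return a similar table from `listAttributes(mart)`, which could be passed in instead.

```r
# Hand-built stand-in for an attribute table (assumed layout: one row per
# attribute/page combination; the ID attributes appear on both pages).
attr_table <- data.frame(
  name = c("ensembl_gene_id", "ensembl_transcript_id", "hgnc_symbol", "entrezgene",
           "ensembl_gene_id", "ensembl_transcript_id", "3_utr_start", "3_utr_end"),
  page = c(rep("feature_page", 4), rep("structure", 4)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the pages that contain *all* requested
# attributes; an empty result means the query has to be split.
common_pages <- function(attr_table, wanted) {
  pages_per_attr <- lapply(wanted, function(x) {
    attr_table$page[attr_table$name == x]
  })
  Reduce(intersect, pages_per_attr)
}

all_wanted <- c("entrezgene", "hgnc_symbol", "ensembl_gene_id",
                "ensembl_transcript_id", "3_utr_start", "3_utr_end")
gene_part  <- c("entrezgene", "hgnc_symbol", "ensembl_gene_id",
                "ensembl_transcript_id")
utr_part   <- c("ensembl_transcript_id", "3_utr_start", "3_utr_end")

common_pages(attr_table, all_wanted)  # no shared page: split the query
common_pages(attr_table, gene_part)   # shares one page
common_pages(attr_table, utr_part)    # shares one page
```

Under these assumptions the full attribute list has no page in common, while each half does, and both halves carry ensembl_transcript_id as the key for the later merge.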
> mart <- useMart("ensembl","hsapiens_gene_ensembl")
Checking attributes ... ok
Checking filters ... ok
> suppressMessages(library(org.Hs.eg.db))
> egs <- head(Lkeys(org.Hs.egSYMBOL)) ## just getting some IDs
> egs
[1] "1" "10" "100" "1000" "10000" "100008586"
> a <- getBM(c("entrezgene","hgnc_symbol","ensembl_gene_id",
"ensembl_transcript_id"), "entrezgene", egs, mart)
> head(a)
entrezgene hgnc_symbol ensembl_gene_id ensembl_transcript_id
1 1 A1BG ENSG00000121410 ENST00000263100
2 10 NAT2 ENSG00000156006 ENST00000286479
3 100 ADA ENSG00000196839 ENST00000372874
4 1000 CDH2 ENSG00000170558 ENST00000269141
5 10000 AKT3 ENSG00000117020 ENST00000366539
6 10000 AKT3 ENSG00000117020 ENST00000366540
> b <- getBM(c("ensembl_transcript_id", "3_utr_start","3_utr_end"),
"ensembl_transcript_id", as.character(a[,4]), mart)
> head(b)
ensembl_transcript_id 3_utr_start 3_utr_end
1 ENST00000263100 NA NA
2 ENST00000263100 58858172 58858387
3 ENST00000263826 NA NA
4 ENST00000263826 243666484 243668550
5 ENST00000269141 NA NA
6 ENST00000269141 25530930 25532116
> d <- merge(a,b)
> head(d)
ensembl_transcript_id entrezgene hgnc_symbol ensembl_gene_id 3_utr_start
1 ENST00000263100 1 A1BG ENSG00000121410 NA
2 ENST00000263100 1 A1BG ENSG00000121410 58858172
3 ENST00000263826 10000 AKT3 ENSG00000117020 NA
4 ENST00000263826 10000 AKT3 ENSG00000117020 243666484
5 ENST00000269141 1000 CDH2 ENSG00000170558 25530930
6 ENST00000269141 1000 CDH2 ENSG00000170558 NA
3_utr_end
1 NA
2 58858387
3 NA
4 243668550
5 25532116
6 NA
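The merged table still contains the rows where both UTR coordinates are NA. If those are not wanted, they can be filtered out after the merge. The following is a minimal offline sketch, not a definitive recipe: `a` and `b` here are small hand-built stand-ins whose values are copied from the output above, and `utr` is just an illustrative name. Because the coordinate column names start with a digit, they are accessed with `[[ ]]` rather than `$`.

```r
# Stand-in for the gene-level query result (values taken from the output above).
a <- data.frame(
  entrezgene            = c("1", "1000", "10000"),
  hgnc_symbol           = c("A1BG", "CDH2", "AKT3"),
  ensembl_gene_id       = c("ENSG00000121410", "ENSG00000170558", "ENSG00000117020"),
  ensembl_transcript_id = c("ENST00000263100", "ENST00000269141", "ENST00000263826"),
  stringsAsFactors = FALSE
)

# Stand-in for the transcript-level query result; check.names = FALSE keeps
# the digit-leading column names exactly as they appear in the output above.
b <- data.frame(
  ensembl_transcript_id = rep(c("ENST00000263100", "ENST00000263826",
                                "ENST00000269141"), each = 2),
  `3_utr_start` = c(NA, 58858172, NA, 243666484, NA, 25530930),
  `3_utr_end`   = c(NA, 58858387, NA, 243668550, NA, 25532116),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Merge on the shared transcript ID, naming the key explicitly.
d <- merge(a, b, by = "ensembl_transcript_id")

# Keep only the rows that actually carry UTR coordinates.
utr <- d[!is.na(d[["3_utr_start"]]) & !is.na(d[["3_utr_end"]]), ]
rownames(utr) <- NULL
utr
```

Naming the key with `by =` is optional here, since ensembl_transcript_id is the only column the two tables share, but it guards against an accidental join on extra columns if more attributes are added to both queries later.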
Best,
Jim
>
> I cannot see what I am doing wrong.
> Your help is very welcome.
>
> Thank you in advance.
> Maura
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826