[BioC] biomaRt Error : Affy Human Exon array annotation
James W. MacDonald
jmacdon at med.umich.edu
Tue Jul 6 15:49:25 CEST 2010
Hi Mamun,
On 7/6/2010 8:11 AM, Rashid, Mamunur wrote:
> Dear List,
>
> I have been trying to annotate some Affymetrix Human Exon Array data
> using biomaRt. During summarization I have tried two
> CDF files.
>
> 1. HuEx-1_0-st-v2,coreR3,A20071112,EP.CDF (Created by Elizabeth Purdom, UC Berkeley, created - 2007-11-12)
> unitNames = Affymetrix Transcript cluster IDs
>
>   unitName groupName unit group cell
> 1  2315251   2315252    1     1    1
> 2  2315373   2315374    2     1    3
> 3  2315554   2315586    3     1    7
> ........
>
> Q1. Is there any way to convert these Affymetrix transcript cluster IDs (e.g. 2315251) to gene symbols or gene names
> using biomaRt? If not, what would be a possible way to do it?
> getBM(attributes=c("strand","transcript_start","chromosome_name","hgnc_symbol"),
+       filters=c("affy_huex_1_0_st_v2"), values=c("2315252"), mart)
  strand transcript_start chromosome_name hgnc_symbol
1     -1           116086               8
2     -1            86649              11      OR4F2P
3      1        180794269               5       OR4F3
4      1        180794269               5      OR4F21
5      1           367640               1      OR4F29
6     -1           621096               1      OR4F16
7     -1           105919               6
8      1        170948694               6
This of course assumes that the groupName above is actually the probeset
ID from the chip.
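
Since one probeset can hit several genes, as the OR4F rows above show, it may help to collapse the result to one line per probeset. What follows is only a minimal sketch in base R. It assumes the getBM() result carries a probeset column; here that column is typed in by hand alongside the values copied from the output above. In a live query one would presumably also request "affy_huex_1_0_st_v2" as an attribute to get that column, but that is an assumption and has not been checked against this mart version.

```r
# Stand-in for a getBM() result: values re-typed from the output above.
# The 'probeset' column is added by hand for illustration (assumption).
res <- data.frame(
  probeset         = "2315252",
  strand           = c(-1, -1, 1, 1, 1, -1, -1, 1),
  transcript_start = c(116086, 86649, 180794269, 180794269,
                       367640, 621096, 105919, 170948694),
  chromosome_name  = c("8", "11", "5", "5", "1", "1", "6", "6"),
  hgnc_symbol      = c("", "OR4F2P", "OR4F3", "OR4F21",
                       "OR4F29", "OR4F16", "", ""),
  stringsAsFactors = FALSE
)

# Keep only the rows that actually have an HGNC symbol.
annotated <- res[res$hgnc_symbol != "", ]

# One string of unique symbols per probeset, separated by ';'.
symbols <- tapply(annotated$hgnc_symbol, annotated$probeset,
                  function(x) paste(unique(x), collapse = ";"))
symbols
```

Whether to keep all symbols, pick one, or discard multi-mapping probesets is an analysis decision; the sketch only makes the ambiguity visible.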
>
> and
>
> 2. HuEx-1_0-st-v2,U-Ensembl49,G-Affy.cdf ( Mark Robinson, Elizabeth Purdom , updated - 2008-04-01 )
> unitNames = Ensembl Gene IDs
>
>          unitName groupName unit group cell
> 1 ENSG00000000003   4015402    1     1    1
> 2 ENSG00000000005   3984446    2     1   14
> ........
>
>> ensembl = useMart("ensembl")
>> ensembl = useDataset("hsapiens_gene_ensembl",mart=ensembl)
>> filters = listFilters(ensembl)
>> attributes = listAttributes(ensembl)
>
> r1_2<- getBM(attributes=c("strand","transcript_start","chromosome_name"),filters=c("ens_hs_gene"),values=c("ENSG00000000003"), mart= ensembl)
>
> I got the following error:
>
> Error in getBM(attributes = c("strand", "transcript_start", "chromosome_name"), :
> Query ERROR: caught BioMart::Exception::Database: Error during query execution: Table 'ensembl_mart_58.ox_Ens_Hs_gene__dm' doesn't exist
I think you want 'ensembl_gene_id' as a filter:
> r1_2 <- getBM(attributes=c("strand","transcript_start","chromosome_name"),
+               filters=c("ensembl_gene_id"), values=c("ENSG00000000003"), mart)
> r1_2
  strand transcript_start chromosome_name
1     -1         99883667               X
2     -1         99887538               X
3     -1         99888439               X
4     -1         99884691               X
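
When a filter name fails the way 'ens_hs_gene' did, the table returned by listFilters() (already stored in `filters` in the original post) can be searched for likely candidates. A rough sketch follows. Note that find_filters is a made-up helper, not a biomaRt function, and the toy table, with illustrative descriptions, only stands in for the real one so the code runs without a connection.

```r
# Hypothetical helper (not part of biomaRt): search a table shaped like
# listFilters() output, i.e. with 'name' and 'description' columns.
find_filters <- function(filters, pattern) {
  hit <- grepl(pattern, filters$name, ignore.case = TRUE) |
         grepl(pattern, filters$description, ignore.case = TRUE)
  filters[hit, , drop = FALSE]
}

# Small stand-in table; the descriptions are illustrative, not copied
# from the mart.
toy <- data.frame(
  name        = c("chromosome_name", "ensembl_gene_id", "affy_huex_1_0_st_v2"),
  description = c("Chromosome name", "Ensembl Gene ID(s)",
                  "Affy HuEx 1_0 st v2 ID(s)"),
  stringsAsFactors = FALSE
)

hits <- find_filters(toy, "ensembl")
hits

# With a live connection one would pass the real table instead, e.g.
# find_filters(listFilters(ensembl), "ensembl")
```

The same search applied to listAttributes() output should work for attribute names, since that table has the same two columns.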
Best,
Jim
>
> Can anybody please suggest what the possible reason for this could be?
> Any suggestion is welcome.
>
> Thanks in advance
>
> regards,
> Mamun
>
> R version 2.11.1 (2010-05-31)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] biomaRt_2.4.0
>
> loaded via a namespace (and not attached):
> [1] RCurl_1.4-2 XML_3.1-0
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826