[BioC] GO annotation/analysis for ath1121501

Nianhua Li nli at fhcrc.org
Fri Aug 25 21:30:19 CEST 2006


Ann Hess <hess at ...> writes:

> 
> I am working with data from ath1121501 (arabidopsis) arrays and I would 
> like to do the following:
> 
> 1.  Subset a list of genes based on GO terms.  For example, how many 
> (and which) a given list belong to MF=metabolism.

## find the GO Identifier for "metabolism"
library(GO)
myGoTerm <- "metabolism"
myGoID <- unlist(eapply(GOTERM, function(g) g@Term == myGoTerm))
myGoID <- names(myGoID[myGoID])
print(myGoID)

## or, to list every GO term whose name contains "metabolism":
## x <- eapply(GOTERM, function(g) if (length(grep("metabolism", g@Term)) > 0)
##     cat(g@GOID, " ", g@Term, "\n"))

## get probeset IDs associated with myGoID
library(ath1121501)
## probes annotated directly to myGoID
myProbeID <- get(myGoID, ath1121501GO2PROBE)
## probes annotated to myGoID or to any of its child terms
myAllProbeID <- get(myGoID, ath1121501GO2ALLPROBES)
?ath1121501GO2PROBE
?ath1121501GO2ALLPROBES
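
To answer "how many (and which)" for your own list, you can intersect it with the probes returned above. A minimal sketch in base R follows; both vectors are hypothetical placeholders (made-up probe IDs), not real annotation data.

```r
## hypothetical probe IDs standing in for a user's gene list
myList <- c("probe_A", "probe_B", "probe_C", "probe_D")
## hypothetical stand-in for the vector returned by
## get(myGoID, ath1121501GO2ALLPROBES)
goProbes <- c("probe_B", "probe_D", "probe_E")

## which members of the list are annotated to the GO term
inTerm <- intersect(myList, goProbes)
## how many
nInTerm <- length(inTerm)
print(inTerm)
print(nInTerm)
```

With the real package, replace goProbes with myAllProbeID (or myProbeID for direct annotations only).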

> 2.  Create a pie chart of the distribution of GO terms for my list.
> 3.  Find statistically over-represented GO terms.
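
For the pie chart in item 2, one possible approach is to tabulate a GO term label per gene and pass the counts to pie(). A minimal base R sketch with made-up labels follows; the goLabels vector is an assumption for illustration, and since a gene can carry several GO terms you would first need to decide how to assign each gene to a category.

```r
## hypothetical GO term label for each gene in a list (made-up data)
goLabels <- c("transport", "metabolism", "metabolism", "signaling",
              "metabolism", "transport")
## count how many genes fall under each term
goCounts <- table(goLabels)
print(goCounts)
## draw the distribution as a pie chart
pie(goCounts, main = "GO term distribution (toy data)")
```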

library(Category)
library(ath1121501)
library(GO)
set.seed(123)
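## pick 100 random probes as an example gene list
probes <- ls(ath1121501ACCNUM)
probes <- sample(probes, 100)
locusList <- unique(unlist(mget(probes, ath1121501ACCNUM)))
## there is no ath1121501LOCUSID, so alias the probe-to-AGI map under that name
ath1121501LOCUSID <- ath1121501ACCNUM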
ans <- geneGoHyperGeoTest(locusList, "ath1121501", "BP")
?geneGoHyperGeoTest
class?GeneGoHyperGeoTestResult
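
For intuition about what a hypergeometric over-representation test computes for a single GO term, here is a minimal base R sketch with invented counts. It is not the geneGoHyperGeoTest implementation, and all four numbers are assumptions chosen for illustration.

```r
## toy counts (all hypothetical)
N <- 1000   # genes in the universe
K <- 50     # universe genes annotated to the GO term
n <- 100    # genes in the selected list
k <- 12     # selected genes annotated to the GO term

## P(X >= k) when drawing n genes without replacement:
## phyper(q, ...) gives P(X <= q), so use q = k - 1 with lower.tail = FALSE
pval <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
## expected number of annotated genes in the list under the null
expected <- n * K / N
print(expected)
print(pval)
```

A small p-value means the list contains more genes for that term than random sampling from the universe would typically produce.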

> 4.  Find pathway information for my list.

The probe-to-AraCyc mapping is in ath1121501PATH.
The probe-to-gene mapping is in ath1121501ACCNUM.

If you want pathway information from KEGG, use AnnBuilder 1.11.8 to build your
own ath1121501 package, then check the environments ath1121501PATH and
ath1121501ARACYC.

> 
> In an attempt to accomplish goal #3, I tried using the GoHyperG function 
> from the GOstats package, but the locus link ID information does not 
> appear to be available for ath1121501 (this has been addressed in previous 
> postings).  Are there alternatives that can be used for ath1121501?
> 
For Arabidopsis annotation packages, the AGI locus identifier is used to
retrieve gene annotations; Entrez Gene IDs and GenBank accessions are not used.
Therefore, there is no xxxxLOCUSID environment. xxxxACCNUM gives the
probe-to-AGI locus mapping.
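
The probe-to-locus lookup can be sketched without the annotation package by building a small environment by hand. The probe IDs and AGI loci below are made up for illustration; with the real package you would pass ath1121501ACCNUM to mget() instead of the toy environment.

```r
## toy stand-in for ath1121501ACCNUM: probe ID -> AGI locus (made-up values)
toyACCNUM <- new.env()
assign("probe_1", "AT1G00001", envir = toyACCNUM)
assign("probe_2", "AT1G00002", envir = toyACCNUM)
assign("probe_3", "AT1G00001", envir = toyACCNUM)  # two probes, same locus

probes <- c("probe_1", "probe_2", "probe_3")
## look up every probe, then collapse to the distinct loci
loci <- unique(unlist(mget(probes, envir = toyACCNUM)))
print(loci)
```

Collapsing with unique() matters because several probes can map to the same locus.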

Hope it helps.

nianhua


