[BioC] Quick start to linking GO terms and microarray data

Wed Mar 1 14:36:50 CET 2006

On 3/1/06 8:28 AM, "Sean Davis" <sdavis2 at mail.nih.gov> wrote:

> 
>> 
>> michael watson (IAH-C) wrote:
>> 
>>> Hi Steffen, Wolfgang
>>> 
>>> Thanks a lot, the biomaRt package looks wonderful for the species that
>>> are in ensembl... Are there any functions within it to annotate other
>>> species? (Eg bacteria, plants etc)
> 
> Mick,
> 
> This is a quick-and-dirty solution that will get you whatever NCBI has
> available for Gene Ontology, including Arabidopsis, for example.  Hope this
> gets you another few species.  The species IDs included are:
> 
>> unique(gene2go$taxID)
>  [1]   3702   4932   6239   7227   7955   9031   9606  10090  10116  36329
> [11]  39947  83333 185431 195099 198094 211586 214684 223283 243164 243231
> [21] 243233 246200 265669 284812
> 
> Hope this helps.
> 
> Sean
> 
> 
> 
>> download.file('ftp://ftp.ncbi.nih.gov/gene/DATA/gene2go.gz',
> destfile='gene2go.gz')
> trying URL 'ftp://ftp.ncbi.nih.gov/gene/DATA/gene2go.gz'
> ftp data connection made, file length 5541317 bytes
> opened URL
> ==================================================
> downloaded 5411Kb
> 
>> gene2go <- read.table(gzfile('gene2go.gz'),sep="\t",header=FALSE,quote="")
>> colnames(gene2go) <- c('taxID', 'geneID', 'goID', 'evidence', 'qualifier',
> 'goTerm', 'pubmedlist')
>> gene2go[match(1:10,gene2go$geneID),]

This should be:

 gene2go[gene2go$geneID %in% 1:10,]

>> gene2go[match(819280,gene2go$geneID),]

And this should be:

 gene2go[gene2go$geneID %in% 819280,]

Sorry about that.
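
For anyone wondering why the change matters: match() returns only the
position of the first hit for each query value (and NA when there is no
hit), while %in% flags every row whose geneID is in the set, so all GO
annotations for a gene are kept. Here is a minimal sketch on a toy table.
The data frame `toy` and its values are made up for illustration and only
mimic three of the gene2go columns; the GO IDs are arbitrary placeholders.

```r
# Toy stand-in for the gene2go table (hypothetical values, not real NCBI data)
toy <- data.frame(
  taxID  = c(9606, 9606, 9606, 9606, 3702, 3702),
  geneID = c(1, 1, 1, 2, 819280, 819280),
  goID   = c("GO:0000001", "GO:0000002", "GO:0000003",
             "GO:0000004", "GO:0000005", "GO:0000006"),
  stringsAsFactors = FALSE
)

# match(): one position per query value, first hit only, NA if absent
first_only <- toy[match(1:3, toy$geneID), ]

# %in%: logical vector over all rows, so every annotation row is returned
all_rows <- toy[toy$geneID %in% 1:3, ]

# Same idea for a single gene ID
one_gene <- toy[toy$geneID %in% 819280, ]

nrow(first_only)  # 3: first row for gene 1, the row for gene 2, an NA row for gene 3
nrow(all_rows)    # 4: all three rows for gene 1 plus the row for gene 2
nrow(one_gene)    # 2: both annotation rows for gene 819280
```

The same pattern should carry over to the full table, for example
gene2go[gene2go$taxID %in% 3702,] to pull all rows for one species ID.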

Sean