[BioC] Anyone have a GO slim list for Affy HG-U133A or HG-U133Av2?

Ken Termiso jerk_alert at hotmail.com
Thu Feb 24 19:42:41 CET 2005


Hi Sean,

Maybe from now on I'll just email you directly instead of the mailing list 
:)

Thanks for your reply...I did something similar to get what I wanted and I 
think it's pretty simple (provided you're using the affy annotation file):

Say I wanted to get a list of all genes related to cell death:

# "ann" is a data frame containing the affy annotation file
# HG-U133A_2_annot_csv.zip off their website

i <- grep("death", as.vector(ann$Gene.Ontology.Biological.Process))
j <- grep("apoptosis", as.vector(ann$Gene.Ontology.Biological.Process))
k <- union(i,j)
k <- sort(k)  # optional; union() does not sort its result
te <- ann[k,]

Now "te" is a data frame that contains the genes that have either "death" or 
"apoptosis" mentioned in GO Biol. Proc.

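The same lookup can probably be collapsed into a single pattern match. The 
following is only a sketch: the small "ann" data frame is a made-up stand-in 
for the real annotation file (the probe IDs and GO strings are hypothetical), 
and it assumes an R version that provides grepl():

```r
# Toy stand-in for the Affymetrix annotation data frame. The probe IDs and
# GO strings below are invented for illustration only.
ann <- data.frame(
  Probe.Set.ID = c("p1", "p2", "p3", "p4"),
  Gene.Ontology.Biological.Process = c(
    "induction of apoptosis",
    "cell cycle",
    "cell death // apoptosis",
    "development"
  ),
  stringsAsFactors = FALSE
)

# A single regular expression with alternation matches either term.
# grepl() returns a logical vector, so rows keep their original order
# and no union() or sort() step is needed.
hits <- grepl("death|apoptosis", ann$Gene.Ontology.Biological.Process,
              ignore.case = TRUE)
te <- ann[hits, ]
```

Adding more keywords is then just a matter of extending the pattern, e.g. 
"death|apoptosis|necrosis".
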
I see there in your reply that you are using the annotate library -- the 
only annotation I've used is the affy file, which contains a lot of fields 
-- do you recommend using the annotate library over this? I've been using 
the affy file b/c it's a simple .csv file and thus is pretty straightforward 
to work with in R.


Thanks again,
Ken


>From: Sean Davis <sdavis2 at mail.nih.gov>
>To: "Ken Termiso" <jerk_alert at hotmail.com>
>CC: bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] Anyone have a GO slim list for Affy HG-U133A or 
>HG-U133Av2?
>Date: Thu, 24 Feb 2005 12:47:57 -0500
>
>Ken,
>
>You could certainly produce such a list by repeating what they have done on 
>that website.  For example, for GO_slim, biologic process 3 (cell cycle and 
>proliferation):
>
>library(hgu133a)
>library(annotate)
>genes <- 
>unlist(lookUp(c('GO:0007049','GO:0008283'),'hgu133a','GO2ALLPROBES'))
>genes <- genes[!duplicated(genes)]
>
>This will contain the genes in cell cycle and proliferation.  It wouldn't 
>be hard to automate this process for each category.  For those categories 
>that include EXCLUDE descriptions, you can use R set commands like %in% to 
>get the sets you want.
>
>Sean
>
>On Feb 24, 2005, at 11:44 AM, Ken Termiso wrote:
>
>>Hi,
>>
>>I'm using the affy HG-U133A_2_annot_csv.zip annotation file to annotate my 
>>data (which may be a bad idea to begin with..?), and would like to be able 
>>to use the GO slim categories to annotate my data (see 
>>http://www.spatial.maine.edu/~mdolan/MGI_GO_Slim.html), instead of the 
>>extremely detailed GO categories already present in the affymetrix file - 
>>Gene.Ontology.Biological.Process, Gene.Ontology.Cellular.Component, 
>>Gene.Ontology.Molecular.Function.
>>
>>Basically, the issue is that I don't want to have the 2,000 - 5,000 
>>different annotation groups in my file. I want to be able to run subset() 
>>on very general groups, like "development" or "death".
>>
>>Thanks in advance,
>>Ken
>>


