[BioC] Gostats with Yeast annotation

Alex Gutteridge alexg at ruggedtextile.com
Fri Jul 25 10:35:30 CEST 2008


Hi,

Just to confirm: the org.Sc.sgd.db package and GOstats work fine
together for me in Bioc-devel (sample session pasted below).

R version 2.8.0 Under development (unstable) (2008-07-22 r46103)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
  Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
 > library(Category)
Loading required package: Biobase
Loading required package: tools
Welcome to Bioconductor
  Vignettes contain introductory material. To view, type
  'openVignette()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation(pkgname)'.
Loading required package: graph
Loading required package: annotate
Loading required package: AnnotationDbi
Loading required package: DBI
Loading required package: RSQLite
Loading required package: xtable
Loading required package: genefilter
Loading required package: survival
Loading required package: splines
 > library(GOstats)
Loading required package: GO.db
Loading required package: RBGL
 > sel = readLines("Turbidostat.genes")
 > uni = readLines("all.genes")
 > params = new("GOHyperGParams", geneIds=sel, universeGeneIds=uni,
 +              annotation="org.Sc.sgd.db", ontology="BP", pvalueCutoff=0.1,
 +              conditional=FALSE, testDirection="over")
 > over = hyperGTest(params)
 > summary(over)
               GOBPID       Pvalue OddsRatio    ExpCount Count Size
GO:0006412 GO:0006412 1.109223e-16  2.755286  62.1352567   125  383
GO:0010467 GO:0010467 4.482367e-14  1.824627 225.8284001   317 1392
GO:0009059 GO:0009059 8.256804e-14  2.232463  89.0659423   154  549
GO:0043170 GO:0043170 8.691113e-13  1.698656 414.6676658   510 2556
GO:0044267 GO:0044267 5.484817e-12  1.746362 214.3098539   296 1321
GO:0019538 GO:0019538 1.268805e-11  1.718287 224.6927688   306 1385
[..snip..]
 > q()
Save workspace image? [y/n/c]: n
ag357 at ag357-pc2102:~/Desktop/study> head Turbidostat.genes
YAL001C
YAL002W
YAL003W
YAL005C
YAL008W
YAL009W
YAL010C
YAL011W
YAL019W
ag357 at ag357-pc2102:~/Desktop/study> head all.genes
YHR047C
YHR051W
YHR066W
YHR068W
YHR075C
YHR076W
YHR080C
YHR083W
YHR143W-A
YKL137W
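
For anyone wondering what the Pvalue and ExpCount columns in the summary above mean, here is a rough base-R sketch of a one-sided (over-representation) hypergeometric test. It is not the GOstats code itself: the function name, its arguments and the toy numbers are my own assumptions for illustration. The last few lines only check that ExpCount / Size is the same for every row I pasted, which is what you would expect if ExpCount is Size scaled by the selected-to-universe ratio.

```r
# Sketch of an over-representation test for a single GO term, in base R.
# Function and argument names are hypothetical, not part of GOstats.
over_rep_test <- function(count, size, n_selected, n_universe) {
  # count:      selected genes annotated to the term
  # size:       universe genes annotated to the term
  # n_selected: number of selected genes
  # n_universe: number of genes in the universe
  expected <- size * n_selected / n_universe
  # P(X >= count); phyper(q, ..., lower.tail = FALSE) gives P(X > q),
  # hence count - 1
  p_value <- phyper(count - 1, size, n_universe - size, n_selected,
                    lower.tail = FALSE)
  list(expected = expected, p_value = p_value)
}

# Toy numbers (made up, not from the session above)
res <- over_rep_test(count = 8, size = 20, n_selected = 50, n_universe = 1000)
res$expected
res$p_value

# ExpCount and Size columns copied from the summary above
exp_count <- c(62.1352567, 225.8284001, 89.0659423,
               414.6676658, 214.3098539, 224.6927688)
size <- c(383, 1392, 549, 2556, 1321, 1385)
ratio <- exp_count / size
ratio
```

The true selected and universe counts are not shown in the session, so the p-values in the table cannot be recomputed from the post alone; only the ratio can be checked.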

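Since org.Sc.sgd.db is keyed on ORF identifiers, it can be worth checking the two input files before building the params object. Below is a small sketch, again in base R. The regular expression and the helper name are my own assumptions; the pattern covers names like the ones shown above (YAL001C, YHR143W-A) and will flag legitimate names that do not follow that scheme, such as mitochondrial ORFs. The two vectors are short stand-ins for the contents of the real files.

```r
# Assumed pattern for systematic nuclear ORF names:
# Y + chromosome letter (A-P) + arm (L/R) + 3 digits + strand (W/C),
# with an optional "-A" style suffix.
orf_pattern <- "^Y[A-P][LR][0-9]{3}[WC](-[A-Z])?$"

# Hypothetical helper: report IDs that do not match the pattern and
# selected IDs that are missing from the universe.
check_gene_ids <- function(selected, universe) {
  all_ids <- unique(c(selected, universe))
  list(
    malformed = all_ids[!grepl(orf_pattern, all_ids)],
    not_in_universe = setdiff(selected, universe)
  )
}

# Stand-in vectors; in practice these would come from readLines()
sel <- c("YAL001C", "YAL002W", "YAL003W")
uni <- c("YAL001C", "YAL002W", "YAL003W", "YHR047C", "YHR143W-A")

chk <- check_gene_ids(sel, uni)
chk$malformed
chk$not_in_universe
```

Both elements should come back empty for clean input; anything listed in not_in_universe is a selected gene that the universe file does not contain.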
AlexG

On 22 Jul 2008, at 18:07, Robert Gentleman wrote:

> Hi Alex,
>  If you are willing to use R-devel and Bioc-devel, the issue should  
> be fixed there.  I would be interested in hearing of any problems  
> you might have (or successes) using that version.  I am waiting for  
> some reports of success before I port this to release,
>
> best wishes
>  Robert
>
>
> Alex Gutteridge wrote:
>> Hi,
>> I've been trying to use the hyperGTest method from the GOstats  
>> package with some yeast ORF data. I notice in this thread from a  
>> month or so ago that there are problems at the moment with using  
>> any of the yeast annotation sets apart from 'YEAST' (which is  
>> deprecated) due to missing ID2EntrezID methods:
>> https://stat.ethz.ch/pipermail/bioconductor/2008-June/022697.html
>> I just wanted to make sure that this was still the case and I guess  
>> fish around for an estimated ETA for when the org.Sc.sgd.db  
>> annotations (which are replacing YEAST as I understand it) will be  
>> compatible with hyperGTest?
>> Also, is the exact source of GO annotations used in these packages  
>> documented anywhere? Looking in the DESCRIPTION file I see  
>> 'primarily based on mapping using ORF identifiers from SGD' for  
>> org.Sc.sgd.db and 'assembled using data from public data  
>> repositories' for YEAST. Should I just take it these are based on  
>> the SGD GO annotation file from the date given in the Packaged  
>> field of the DESCRIPTION file? For YEAST there is
>
> the man page is pretty explicit, (?org.Sc.sgdGO)
>
>     Mappings were based on data provided by: Yeast Genome (
>     ftp://genome-ftp.stanford.edu/pub/yeast/data_download ) on
>     2008-Mar29
>
> I am not sure what more we could put there.
>
> best wishes
>  Robert
>
>> also a Created field which is approx. 1 month prior to the Packaged  
>> date so I'm guessing the real age of the data is that one? The  
>> yeast annotations change so quickly it's useful to be able to pin  
>> this down as accurately as possible.
>> Thanks in advance for any help with these questions.
>> Alex Gutteridge
>> Department of Biochemistry
>> University of Cambridge
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> -- 
> Robert Gentleman, PhD
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M2-B876
> PO Box 19024
> Seattle, Washington 98109-1024
> 206-667-7700
> rgentlem at fhcrc.org
>

Alex Gutteridge

Department of Biochemistry
University of Cambridge


