[BioC] question about ontoCompare() performance change

Seth Falcon sfalcon at fhcrc.org
Thu Oct 29 18:26:12 CET 2009


Hi Scott,

Thanks for the reminder and for providing a reproducible example.  We
will take a look and see whether we can understand the slowdown and
provide a fix.
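
In the meantime, anyone who wants to narrow this down could profile the
call.  What follows is only a rough sketch using base R's Rprof() and
summaryRprof(); slow_call() is a hypothetical placeholder that just burns
CPU for about half a second, not anything from goTools.  Swapping its body
for the ontoCompare() call from Scott's script below should show which
functions account for most of the time.

```r
# Sketch only: profile a slow call with base R's sampling profiler.
# slow_call() is a hypothetical stand-in workload; replace its body with
# the ontoCompare() call from the script quoted below to profile that.
slow_call <- function(seconds = 0.5) {
  start <- proc.time()[["elapsed"]]
  x <- 0
  # Keep the CPU busy long enough for the profiler to collect samples.
  while (proc.time()[["elapsed"]] - start < seconds) {
    x <- x + sum(sqrt(seq_len(1e4)))
  }
  x
}

prof_file <- tempfile()
Rprof(prof_file, interval = 0.01)  # start sampling every 10 ms
result <- slow_call()
Rprof(NULL)                        # stop sampling

# Summarise time per function, including time spent in callees.
prof <- summaryRprof(prof_file)
print(head(prof$by.total, 10))
```

Comparing the by.total table between a BioC 2.3 installation and a 2.4 or
2.5 installation might indicate whether the extra time is in goTools
itself or in the annotation lookups, though that is a guess until someone
actually runs it.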

+ seth

On 10/28/09 5:23 PM, Scott Markel wrote:
> Just a quick FYI to anyone else using goTools' ontoCompare().
>
> It looks like it's approximately another factor of 2 slower in
> Bioconductor 2.5.  User time has gone from 25 seconds (2.3) to
> 150 seconds (2.4) to 290 seconds (2.5).  I don't know whether this is
> package-specific or caused by changes in R.
>
> Scott
>
>
> -----Original Message-----
> From: Scott Markel
> Sent: Wednesday, 10 June 2009 5:15 PM
> To: Bioconductor at stat.math.ethz.ch
> Subject: question about ontoCompare() performance change
>
> I'm seeing a noticeable performance change in goTools' ontoCompare()
> from Bioconductor version 2.3 to 2.4.  With the same input data, the
> user time reported by system.time() on my Windows XP machine has gone
> from 25 seconds to about 150 seconds.  The corresponding times on a
> RHEL 5 machine are 30 seconds and 130 seconds.
>
> I checked the ontoCompare() help, the goTools documentation, the mailing
> list archives, and Google for terms like "ontoCompare goTools performance",
> and didn't find anything.
>
> I'm sure I'm missing something obvious, but I'd appreciate advice on
> how I should now be using ontoCompare() in Bioc 2.4.
>
> The script, BioC 2.3 output, BioC 2.4 output, and two sets of
> sessionInfo() follow.
>
> Scott
>
> ##############################
> Here's the R script, using the same inputs for both BioC 2.3 and 2.4.
>
> prop <- list()
> prop$probeIDs <- c("1007_s_at", "1053_at", "117_at", "121_at",
>     "1255_g_at", "1294_at", "1316_at", "1320_at", "1405_i_at", "1405_i_at")
> prop$microarrayType <- "hgu133a"
>
> library("goTools")
> library("hgu133a.db")
>
> system.time(result <- ontoCompare(list(prop$probeIDs),
>     probeType = as.character(prop$microarrayType), method = "none", goType = "MF"))
> ##############################
> The BioC 2.3 output is
>
>     user  system elapsed
>    23.31    0.22   25.70
>
>> result
>   binding catalytic activity chemoattractant activity enzyme regulator activity
> 1      10                  4                        2                         1
>   molecular transducer activity structural molecule activity
> 1                             5                            1
>   transcription regulator activity NotFound
> 1                                2        0
> ##############################
> The BioC 2.4 output is
>
>     user  system elapsed
>   151.16    0.41  169.11
>
>> result
>                                   [,1]
> catalytic activity                  4
> binding                            10
> enzyme regulator activity           1
> transcription regulator activity    2
> chemoattractant activity            2
> molecular transducer activity       5
>
> ##############################
>> sessionInfo()
> R version 2.7.2 (2008-08-25)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] tools     stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
>   [1] hgu133a_2.2.0       hgu133a.db_2.2.0    goTools_1.12.0
>   [4] GO_2.2.0            annotate_1.18.0     xtable_1.5-4
>   [7] AnnotationDbi_1.2.2 RSQLite_0.7-0       DBI_0.2-4
> [10] Biobase_2.0.1
> ##############################
>> sessionInfo()
> R version 2.9.0 (2009-04-17)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] hgu133a.db_2.2.11   goTools_1.18.0      GO.db_2.2.11
> [4] RSQLite_0.7-1       DBI_0.2-4           AnnotationDbi_1.6.0
> [7] Biobase_2.4.1
> ##############################
>
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect  email:  smarkel at accelrys.com
> Accelrys (SciTegic R&D)             mobile: +1 858 205 3653
> 10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
> San Diego, CA 92121                 fax:    +1 858 799 5222
> USA                                 web:    http://www.accelrys.com
>
> http://www.linkedin.com/in/smarkel
> Vice President, Board of Directors:
>      International Society for Computational Biology
> Co-chair: ISCB Publications Committee
> Associate Editor: PLoS Computational Biology
> Editorial Board: Briefings in Bioinformatics


