[BioC] R crashes with GEOmetadb
Hooiveld, Guido
Guido.Hooiveld at wur.nl
Thu Jun 30 14:50:06 CEST 2011
Hi Sean,
Indeed, you are correct!
Due to my inexperience with database queries, and a clumsy reading of some example code, I inadvertently closed the connection to the database before running the query... Well, after omitting the dbDisconnect(con) line the example is working fine now! :)
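For reference, a minimal sketch of the order of calls that works: connect, query, and only then disconnect. To keep it runnable without the download, it uses an in-memory SQLite database with a made-up gsm table in place of GEOmetadb.sqlite (the rows and URLs are hypothetical); the query is the one quoted further down in this thread.

```r
library(DBI)
library(RSQLite)

# Toy stand-in for GEOmetadb.sqlite: hypothetical rows, for illustration only.
con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "gsm", data.frame(
  gsm = c("GSM1", "GSM2", "GSM3"),
  gpl = c("GPL96", "GPL96", "GPL570"),
  supplementary_file = c("ftp://example.org/GSM1.CEL.gz",
                         NA,
                         "ftp://example.org/GSM3.CEL.gz"),
  stringsAsFactors = FALSE
))

# Run the query while the connection is still open.
rs <- dbGetQuery(con, paste("select gsm,supplementary_file",
                            "from gsm where gpl='GPL96'",
                            "and supplementary_file like '%CEL.gz'"))
dim(rs)

# Disconnect only after all queries are done.
dbDisconnect(con)
```

With the real file, the dbConnect() call would point at "GEOmetadb.sqlite" and the dbWriteTable() step would be dropped.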
One thing though: through GEOmetadb I find 17751 CEL files for GPL96, whereas querying GEO directly shows that it hosts considerably more of these arrays (Samples: 28011). Any idea what may cause this discrepancy?
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL96
Thanks again for your assistance,
Guido
-----Original Message-----
From: seandavi at gmail.com [mailto:seandavi at gmail.com] On Behalf Of Sean Davis
Sent: Thursday, June 30, 2011 14:03
To: Hooiveld, Guido
Cc: bioconductor (bioconductor at stat.math.ethz.ch); Seth Falcon
Subject: Re: [BioC] R crashes with GEOmetadb
See below.
On Wed, Jun 29, 2011 at 11:36 AM, Hooiveld, Guido <Guido.Hooiveld at wur.nl> wrote:
> Dear Sean and others,
>
> I am exploring the functionality of 'GEOmetadb'. I am specifically interested in downloading all CEL files for arrays run on a certain platform.
> To this end I am using the example from the GEOmetadb vignette, which should retrieve the number of GEO entries and CEL files for the Affymetrix array HGU133A (page 8 of the vignette).
> However, when executing that code R crashes and needs to exit...
> The error messages are not informative to me, but maybe you can deduce what is going wrong. Any feedback is appreciated.
>
> Regards,
> Guido
>
>
> R version 2.13.0 (2011-04-13)
> Copyright (C) 2011 The R Foundation for Statistical Computing ISBN
> 3-900051-07-0
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
>>
>> library(GEOmetadb)
> Loading required package: GEOquery
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
> Vignettes contain introductory material. To view, type
> 'browseVignettes()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation("pkgname")'.
>
> Setting options('download.file.method.GEOquery'='curl')
> Loading required package: RSQLite
> Loading required package: DBI
>> getSQLiteFile()
> trying URL 'http://gbnci.abcc.ncifcrf.gov/geo/GEOmetadb.sqlite.gz'
> Content type 'text/plain; charset=ISO-8859-1' length 109446149 bytes (104.4 Mb)
> opened URL
> ================================================
> downloaded 104.4 Mb
>
> Unzipping...
> Metadata associate with downloaded file:
>                 name               value
> 1     schema version                 1.0
> 2 creation timestamp 2011-06-18 09:50:00
> [1] "/home.local/guidoh/GEOmetadb.sqlite"
>>
>> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
>> dbDisconnect(con)
Sorry, Guido. I missed this point in my first pass through your email. Here, you disconnect the connection.
> [1] TRUE
>>
>> rs <- dbGetQuery(con,paste("select gsm,supplementary_file",
> + "from gsm where gpl='GPL96'",
> + "and supplementary_file like '%CEL.gz'"))
Here, you are using a disconnected connection object (con) to perform the query; it should fail with an error message, not a segmentation fault. If you DO NOT disconnect the connection object, this query works fine. Perhaps RSQLite should check that the connection object is still connected before executing a statement, to avoid the segmentation fault?
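A minimal sketch of the kind of check suggested above, done on the user side. It assumes a recent DBI and RSQLite that provide dbIsValid(); the versions shown in this thread (RSQLite 0.9-4, DBI 0.2-5) may not. safe_query() is a hypothetical helper, not part of either package.

```r
library(DBI)
library(RSQLite)

# safe_query() is a hypothetical helper, not part of DBI or RSQLite.
# It refuses to run a statement on a connection that has been closed.
safe_query <- function(con, sql) {
  if (!dbIsValid(con)) {
    stop("connection is closed; reconnect with dbConnect() before querying")
  }
  dbGetQuery(con, sql)
}

con <- dbConnect(SQLite(), ":memory:")
open_result <- safe_query(con, "select 1 as x")   # runs normally
dbDisconnect(con)

# On the closed connection the helper raises an ordinary R error,
# which is caught here and kept as a message string.
closed_result <- tryCatch(safe_query(con, "select 1 as x"),
                          error = function(e) conditionMessage(e))
```

The point of the guard is only that a closed connection is reported as a regular R error before any statement reaches the driver.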
Sean
> sessionInfo()
R version 2.13.0 Under development (unstable) (2011-02-26 r54608)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RSQLite_0.9-4 DBI_0.2-5
> *** caught segfault ***
> address 0x8, cause 'memory not mapped'
>
> Traceback:
> 1: .Call("RS_SQLite_exec", conId, statement, bind.data, PACKAGE =
> .SQLitePkgName)
> 2: sqliteExecStatement(con, statement, bind.data)
> 3: sqliteQuickSQL(conn, statement, ...)
> 4: dbGetQuery(con, paste("select gsm,supplementary_file", "from gsm
> where gpl='GPL96'", "and supplementary_file like '%CEL.gz'"))
> 5: dbGetQuery(con, paste("select gsm,supplementary_file", "from gsm
> where gpl='GPL96'", "and supplementary_file like '%CEL.gz'"))
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection: dim(rs)
> Selection:
>
>
> ---------------------------------------------------------
> Guido Hooiveld, PhD
> Nutrition, Metabolism & Genomics Group Division of Human Nutrition
> Wageningen University Biotechnion, Bomenweg 2
> NL-6703 HD Wageningen
> the Netherlands
> tel: (+)31 317 485788
> fax: (+)31 317 483342
> email: guido.hooiveld at wur.nl
> internet: http://nutrigene.4t.com
> http://www.researcherid.com/rid/F-4912-2010
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>