[R-sig-DB] R crashes while executing sqlSave from RODBC to PostgreSQL dsn inside a for loop - debug help
Paul DeBruicker
pdebru|c @end|ng |rom gm@||@com
Tue Jan 22 23:11:55 CET 2008
Hello List
I have a reproducible example at the bottom of this message, if you
have PostgreSQL set up as a data source in Windows. I'm just looking
for ideas on how to debug a problem that crashes R with the following
Windows error messages:
First error message:
Microsoft Visual C++ Runtime Library
Runtime Error!
Program: C:\Program Files\R\R-2.6.1\bin\Rgui.exe
This application has requested the Runtime to terminate it in an
unusual way. Please contact the application's support team for more
information.
After clicking "OK", I get the second error message:
RGui: Rgui.exe - Application Error
The instruction at "0x00000001" referenced memory at "0x00000001".
The memory could not be "read".
Click on OK to terminate the program
I'm using the latest released versions of RODBC, R, and the PostgreSQL
ANSI ODBC driver to connect to a PostgreSQL 8.2.6 database over a LAN.
Both computers are running Windows XP.
The for loop in question uses read.csv to import 730 CSV files with 55
columns each, averaging 1500 lines per file. In any given row, most
columns (~80%) are blank. I declare the column classes for each
imported file and the varTypes for the database.
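To illustrate the kind of import described above, here is a minimal sketch of reading a mostly blank CSV file with declared column classes. The three-column temporary file and the names tmp, cc, and dat are hypothetical stand-ins for the real 55-column files, not the actual import code.

```r
# Hypothetical three-column file standing in for one of the 55-column CSV files.
tmp <- tempfile(fileext = ".csv")
writeLines(c("v1,v8,v20", "a,1.25,1", ",,2", "b,,"), tmp)

# Declare the column classes up front; blank fields are read as NA.
cc <- c(v1 = "character", v8 = "numeric", v20 = "integer")
dat <- read.csv(tmp, colClasses = cc, na.strings = "")

# Inspect the resulting column types.
str(dat)
```

Declaring colClasses keeps the column types identical from file to file, even when a column happens to be entirely blank in one of them.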
Because R itself crashes, I cannot run traceback() or debugger() on
the crashed instance. My questions are:
1. How can I debug this problem?
2. Should I learn/use another language to regularly import this much
data into a Postgres DB?
3. Is there another opensource database that might be more appropriate
to use with R?
4. I know the RODBC manual mentions that the PostgreSQL ODBC driver
has generated internal memory corruption if addPK=TRUE. Is my problem
another symptom of that issue?
5. Should I send this to another mailing list or file a bug report?
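On question 1, since traceback() is unavailable after a hard crash, one possible approach is to write a progress line to a file before and after each save, so that the last line written shows which iteration died. The sketch below is separate from the reproduction code at the end of this message; the log file location and the log_step helper are hypothetical, and a placeholder comment stands in for the sqlSave call.

```r
# Hypothetical log file; in practice this would be a fixed path that
# survives the crash rather than a temporary file.
log_file <- tempfile(fileext = ".log")

# Hypothetical helper: append one timestamped line. The file is opened and
# closed on every call, so each line reaches the disk before the next step.
log_step <- function(msg) {
  cat(paste(format(Sys.time()), msg), "\n", sep = "",
      file = log_file, append = TRUE)
}

for (i in 1:3) {
  log_step(paste("starting file", i))
  # ... the sqlSave call for file i would go here ...
  log_step(paste("finished file", i))
}

# After a crash, the last line shows how far the loop got.
progress <- readLines(log_file)
print(progress)
```

A "starting" line with no matching "finished" line would point at the iteration in which the crash occurred.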
I've been able to recreate the crash using the code below on 3
machines in the office, so I don't think it's specific to my computer
or setup.
Thanks for any guidance you can provide.
Paul DeBruicker
R Code:
# the "asdf" table in the database does not exist and is created by
# the first sqlSave statement. Generally when I check the
# PostgreSQL database after R crashes, the first one or two
# files have been successfully imported, but no others.
#
# con is the odbcConnect return value
# vt is the named character vector passed as varTypes; it matches the
# column types of my csv files
# df1 is the data.frame used in this test example in lieu of my data
# files. It has the same number and type of columns (55) and the
# average number of rows (1500).
#
library(RODBC)
con<-odbcConnect("PostgresDB",uid="paul",pwd="paul")
vt<-c("varchar(3)", "varchar(71)", "varchar(5)", "varchar(13)",
"varchar(9)", "varchar(9)", "varchar(9)", "float8", "varchar(2)",
"varchar(5)",
"varchar(13)", "float8", "float8", "float8", "varchar(2)", "float8",
"float8", "varchar(16)", "float8", "int4", "float8", "float8",
"varchar(13)",
"varchar(2)", "float8", "varchar(4)", "varchar(2)", "float8",
"varchar(2)", "varchar(4)", "varchar(16)", "varchar(16)",
"varchar(16)", "varchar(4)",
"varchar(4)", "varchar(4)", "float8", "float8", "float8",
"varchar(4)", "varchar(9)", "varchar(2)", "float8", "float8",
"varchar(9)", "float8",
"varchar(9)", "varchar(73)", "varchar(16)", "varchar(11)",
"varchar(9)", "varchar(9)", "varchar(2)", "varchar(15)", "serial")
names(vt)<-paste("v",1:55,sep="")
for(i in 1:730){
#create the big data frame
df1<-data.frame(v1=rep("a",1500),v2=rep("a",1500),v3=rep("a",1500),
v4=rep("a",1500),v5=rep("a",1500)
,v6=rep("a",1500),v7=rep("a",1500),v8=rep(1.25,1500),v9=rep("a",1500)
,v10=rep("a",1500),v11=rep("a",1500),v12=rep(1.25,1500),v13=rep(1.25,1500)
,v14=rep(1.25,1500),v15=rep("a",1500),v16=rep(1.25,1500),v17=rep(1.25,1500)
,v18=rep("a",1500),v19=rep(1.25,1500),v20=rep(1,1500),v21=rep(1.25,1500)
,v22=rep(1.25,1500),v23=rep("a",1500),v24=rep("a",1500),v25=rep(1.25,1500)
,v26=rep("a",1500),v27=rep("a",1500),v28=rep(1.25,1500),v29=rep("a",1500)
,v30=rep("a",1500),v31=rep("a",1500),v32=rep("a",1500),v33=rep("a",1500)
,v34=rep("a",1500),v35=rep("a",1500),v36=rep("a",1500),v37=rep(1.25,1500)
,v38=rep(1.25,1500),v39=rep(1.25,1500),v40=rep("a",1500),v41=rep("a",1500)
,v42=rep("a",1500),v43=rep(1.25,1500),v44=rep(1.25,1500),v45=rep("a",1500)
,v46=rep(1.25,1500),v47=rep("a",1500),v48=rep("a",1500),v49=rep("a",1500)
,v50=rep("a",1500),v51=rep("a",1500),v52=rep("a",1500),v53=rep("a",1500)
,v54=rep("a",1500),v55=rep(1,1500))
#save it to the PostgreSQL database
sqlSave(con,df1,"asdf",append=TRUE,rownames=FALSE,varTypes=vt)
# clean up after yourself
df1<-0
gc()
}
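The filler data frame built inside the loop above could also be generated from the type vector itself. The following is a minimal sketch: make_test_frame, vt_demo, and df_demo are hypothetical names, and the short vt_demo stands in for the full 55-element vt. It uses the same filler values as the loop ("a" for varchar, 1.25 for float8, 1 for int4 and serial), but produces character columns rather than factors.

```r
# Hypothetical helper: build a filler data frame from a named vector of
# PostgreSQL column types.
make_test_frame <- function(vt, n = 1500) {
  cols <- lapply(vt, function(type) {
    if (grepl("^varchar", type)) {
      rep("a", n)       # text columns
    } else if (type == "float8") {
      rep(1.25, n)      # floating-point columns
    } else {
      rep(1, n)         # int4 and serial columns
    }
  })
  # lapply keeps the names of vt, so the columns are named v1, v8, ...
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Short stand-in for the 55-element vt defined above.
vt_demo <- c(v1 = "varchar(3)", v8 = "float8", v20 = "int4", v55 = "serial")
df_demo <- make_test_frame(vt_demo, n = 5)
str(df_demo)
```

With the full vector, make_test_frame(vt) would give a 1500-row, 55-column frame in the same layout as df1.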