[R] retrieving large columns using RODBC
Tamas K Papp
tpapp at Princeton.EDU
Sat Aug 13 10:03:14 CEST 2005
Hi,
I have a large table in PostgreSQL (the result of an MCMC simulation, with 1
million rows) and I would like to retrieve columns (corresponding to variables)
using RODBC. I have a column called "index" which is used to order the rows.
Unfortunately, sqlQuery can't return all the values from a column at once
(RODBC complains about lack of memory), so I am using the following code:
getcolumns <- function(channel, tablename, colnames, totalrows,
                       ordered = TRUE, chunksize = 1e5) {
  r <- matrix(double(0), totalrows, length(colnames))
  for (i in 1:ceiling(totalrows / chunksize)) {
    cat(".")
    ## rows covered by this chunk (the last one may be shorter than chunksize)
    rows <- ((i - 1) * chunksize + 1):min(i * chunksize, totalrows)
    r[rows, ] <- as.matrix(
      sqlQuery(channel, paste("SELECT", paste(colnames, collapse = ", "),
                              "FROM", tablename,
                              "WHERE index <=", i * chunksize,
                              "AND index >", (i - 1) * chunksize,
                              if (ordered) "ORDER BY index;" else ";")))
  }
  cat("\n")
  drop(r)  # convert to a vector if only one column was requested
}
to retrieve the data in chunks. However, this is very slow: it takes about 15
minutes on my machine. Is there a way to speed it up?
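In case it helps, this is roughly how I call the function; the DSN, table, and
column names below are just placeholders for the real ones:

library(RODBC)

channel <- odbcConnect("mcmc_dsn")          # placeholder DSN
## fetch two parameter columns from a (hypothetical) table "samples"
theta <- getcolumns(channel, "samples", c("theta1", "theta2"),
                    totalrows = 1e6)
odbcClose(channel)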
I am running Linux on a PowerBook, RODBC version 1.1-4, R 2.1.1. The
machine has only 512 MB of RAM.
Thanks,
Tamas