[R] cbind alternate
Rui Barradas
ruipbarradas at sapo.pt
Fri Jan 6 19:57:04 CET 2012
Hello,
I believe this function can handle a problem of that size, or bigger.
It does NOT create the full matrix, just writes it to a file, a certain
number of lines at a time.
write.big.matrix <- function(x, y, outfile, nmax=1000){
    stopifnot(length(x) == length(y))  # both columns must have the same length
    if(file.exists(outfile)) unlink(outfile)
    testf <- file(outfile, "at") # or "wt" - "write text"
    on.exit(close(testf))
    n <- length(x)
    step <- nmax # how many rows at a time
    inx <- seq(1, n, by=step) # start index of each chunk in 'x' and 'y'
    # do it at most 'nmax' rows per iteration
    for(i in inx){
        j <- min(i + step - 1, n) # the last chunk may be shorter than 'step'
        mat <- cbind(x[i:j], y[i:j])
        write.table(mat, file=testf, quote=FALSE, row.names=FALSE,
                    col.names=FALSE)
    }
    # return the output filename
    outfile
}
x <- 1:1e6 # a numeric vector
y <- sample(letters, 1e6, replace=TRUE) # and a character vector
length(x);length(y) # of the same length
fl <- "test.txt" # output file
system.time(write.big.matrix(x, y, outfile=fl))
On my system this takes (sample output)
   user  system elapsed
   1.59    0.04    1.65
and the function can handle different types of data, numeric and character
in the example.
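The reason mixed types work is that 'cbind' coerces the numeric column to character, so each chunk is a character matrix that 'write.table' writes as plain text. A small illustration, using a temporary file rather than the file from the example above:

```r
x <- 1:3
y <- c("a", "b", "c")
m <- cbind(x, y)   # the numeric column is coerced to character
typeof(m)          # "character"

tf <- tempfile()   # temporary file, only for this illustration
write.table(m, file = tf, quote = FALSE, row.names = FALSE, col.names = FALSE)
written <- readLines(tf)
unlink(tf)
written            # "1 a" "2 b" "3 c"
```

Note that the numbers come back as text, so they need converting again if the file is read back in.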
If you also need the matrix, try to use 'cbind' first, without writing to a
file.
If it's still slow, adapt the code above to keep inserting chunks in an
output matrix.
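That last suggestion could look roughly like this. It is only a minimal sketch: the name 'fill.big.matrix' is hypothetical (it is not part of the code above), and it assumes the whole two-column result fits in memory and that a character matrix is acceptable, since that is what 'cbind' would return for numeric plus character input.

```r
# Hypothetical helper: fill a preallocated matrix 'nmax' rows at a time
fill.big.matrix <- function(x, y, nmax = 1000) {
    stopifnot(length(x) == length(y))
    n <- length(x)
    # allocate the full result once, as character (see the assumption above)
    out <- matrix(NA_character_, nrow = n, ncol = 2)
    if (n == 0) return(out)
    for (i in seq(1, n, by = nmax)) {
        j <- min(i + nmax - 1, n)   # the last chunk may be shorter
        out[i:j, 1] <- as.character(x[i:j])
        out[i:j, 2] <- as.character(y[i:j])
    }
    out
}

x <- 1:2500                                  # length not a multiple of 'nmax'
y <- sample(letters, 2500, replace = TRUE)
m <- fill.big.matrix(x, y)
dim(m)                                       # 2500 2
```

Allocating the result once and assigning into row ranges avoids growing the matrix on every iteration, which is usually the slow part.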
Rui Barradas
--
View this message in context: http://r.789695.n4.nabble.com/cbind-alternate-tp4270188p4270444.html
Sent from the R help mailing list archive at Nabble.com.