[R-sig-hpc] how to work with big matrices and the ff-package?

Anne Skoeries home at anne-skoeries.de
Wed Apr 14 17:06:28 CEST 2010


Hello everyone, 

I need to create and work with some big matrices that have somewhat over 2 million columns and 117 rows. Calculations on matrices of this size need more memory than my PC has (4 GB installed), so I need a way to work with large datasets. I'm trying to use the ff package, but I don't think I really understand how it works. Hopefully someone can help me, either with the ff package or with a different solution.
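
For a sense of scale, here is a rough sketch of the arithmetic in base R. It assumes a dense matrix of doubles at 8 bytes per value and uses 2e6 as a round stand-in for the column count; the variable names are only illustrative.

```r
# Back-of-the-envelope memory estimate for one dense copy of the matrix.
nr <- 117
nc_big <- 2e6            # assumption: round stand-in for "somewhat over 2 million"
bytes_per_double <- 8    # R stores doubles in 8 bytes

bytes <- nr * nc_big * bytes_per_double
gib <- bytes / 1024^3
cat(sprintf("one double copy:  %.2f GiB\n", gib))

# The same values stored as 4-byte integers would need half of that.
gib_int <- nr * nc_big * 4 / 1024^3
cat(sprintf("one integer copy: %.2f GiB\n", gib_int))
```

A single copy would fit into 4 GB, but most operations create at least one temporary copy, which is probably where the memory runs out.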


I am saving some calculated matrices as ff-objects as follows:

require(ff)
# small example dimensions; the real matrices are much wider
nr <- 117; nc <- 50
dat <- sample(0:100, size=(nr*nc), replace=TRUE)
a <- matrix(dat, nrow=nr)

# one result column per pair of columns (i, j) of a with i < j
ncols <- (nc*(nc-1))/2
b <- ff(vmode="double", dim=c(nr, ncols))
namb <- vector(mode="character", length=ncols)
x <- 1   # index of the next column of b to fill
for(i in 1:(nc-1)){
	for(j in (i+1):nc){
		b[,x] <- a[,i]+a[,j]
		namb[x] <- paste(i, "_", j, sep="")
		x <- x+1
	}
}
dimnames(b)[[2]] <- namb
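
The pairwise sums above can also be written block by block instead of one column at a time. The following is only a sketch in base R: a plain matrix stands in for the ff object (assuming block assignment of the form b[, idx] <- ... behaves the same way on it), and chunk_size is a hypothetical tuning value.

```r
nr <- 117; nc <- 50
set.seed(1)  # only to make the sketch reproducible
a <- matrix(sample(0:100, size = nr * nc, replace = TRUE), nrow = nr)

# All pairs (i, j) with i < j, in the same order as the nested loops.
pairs <- combn(nc, 2)          # 2 x ncols; column k holds pair k
ncols <- ncol(pairs)
namb  <- paste(pairs[1, ], pairs[2, ], sep = "_")

# Stand-in for the ff matrix (assumption: same block indexing).
b <- matrix(0, nrow = nr, ncol = ncols)

chunk_size <- 200              # hypothetical; choose to fit available RAM
for (s in seq(1, ncols, by = chunk_size)) {
  idx <- s:min(s + chunk_size - 1, ncols)
  # Sum the two source columns of every pair in this block at once.
  b[, idx] <- a[, pairs[1, idx], drop = FALSE] + a[, pairs[2, idx], drop = FALSE]
}
colnames(b) <- namb
```

Only one block of chunk_size columns is computed at a time, so the temporary objects stay small however many pairs there are.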

After the above step I need to convert my ff_matrix to a data.frame to discretize the whole matrix and calculate the mutual information. The calculated result should be saved as an ffdf-object or something similar.

require(infotheo)
disc <- as.ffdf(discretize(as.data.frame(as.ffdf(cc)), disc="equalwidth", nbins=5))

This doesn't work. After this step R somehow loses the path to the working directory. As soon as I try to discretize the next data.frame I get the following messages:
Error in if (dfile == getOption("fftempdir")) finalizer <- "delete" else finalizer <- "close" : 
 argument is of length zero
Error in setwd(cwd) : character argument expected
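
To make clearer what I am after, here is a sketch of the discretization done in column blocks, so that the whole matrix is never converted to a data.frame at once. It is base R only: equal_width_bins and discretize_in_chunks are hypothetical helper names, the binning uses findInterval as a stand-in and is not infotheo's implementation (bin boundaries may be treated differently), and the result is kept in an ordinary integer matrix where an ff object would presumably be used for the real data.

```r
# Equal-width binning of one numeric vector into integer codes 1..nbins.
# Stand-in for discretize(..., disc = "equalwidth"); not infotheo's own code.
equal_width_bins <- function(x, nbins = 5) {
  breaks <- seq(min(x), max(x), length.out = nbins + 1)
  # all.inside = TRUE keeps the maximum in the last bin instead of bin nbins + 1
  findInterval(x, breaks, all.inside = TRUE)
}

# Discretize a matrix-like object m one block of columns at a time.
# Assumes m has at least one column and supports m[, idx, drop = FALSE].
discretize_in_chunks <- function(m, nbins = 5, chunk_size = 1000) {
  out <- matrix(0L, nrow = nrow(m), ncol = ncol(m))
  for (s in seq(1, ncol(m), by = chunk_size)) {
    idx <- s:min(s + chunk_size - 1, ncol(m))
    block <- m[, idx, drop = FALSE]   # only this block is held as a plain matrix
    out[, idx] <- apply(block, 2, equal_width_bins, nbins = nbins)
  }
  out
}

# Small usage example with two columns and one column per block.
m <- cbind(c(0, 19, 20, 50, 100), c(1, 2, 3, 4, 5))
res <- discretize_in_chunks(m, nbins = 5, chunk_size = 1)
```

Each column is binned over its own range, so the blocks are independent and the block size does not change the result.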

I would be really glad if anybody could help me understand how the package works and show me how to convert between the different data types.

Thanks in advance, 
Anne S.