[R] Building a big.matrix using foreach

Michael Knudsen micknudsen at gmail.com
Mon Jul 20 10:01:22 CEST 2009


On Sun, Jul 19, 2009 at 2:29 PM, Jay Emerson<jayemerson at gmail.com> wrote:

Hi Jay!

> foreach(i=1:nrow(x),.combine=c) %dopar% f(x[i,])

That was also my first guess, but it doesn't seem to work. Here is a
trivial example using a regular matrix instead of a big.matrix. The
outcome is the same.

m = matrix(0,nrow=5,ncol=5)
foreach (i=1:5) %dopar% { m[i,] = rnorm(5) }

Since I didn't include the .combine option, a list of five independent
rnorm(5) vectors is returned. However, the matrix m is not changed. If
I replace %dopar% with %do%, everything works fine (but not in
parallel, of course).
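
A likely explanation is that with %dopar% each worker evaluates the
loop body on its own copy of m, so the assignment never reaches the
calling session. A minimal sketch of the usual workaround for a regular
matrix follows: return each row and let .combine = rbind stack them.
The doParallel backend and the two-worker local cluster are assumptions
for illustration, not part of the example above.

```r
library(foreach)
library(doParallel)  # assumption: any registered foreach backend would do

# Assumption: a small local cluster with two worker processes.
cl <- makeCluster(2)
registerDoParallel(cl)

# Workers only see copies of objects from the calling session, so
# assigning into m inside the loop body would be lost. Instead, each
# iteration returns one row and rbind assembles the rows into a matrix.
m <- foreach(i = 1:5, .combine = rbind) %dopar% {
  rnorm(5)
}

stopCluster(cl)

dim(m)
```

This still collects the full result in the master process, so it only
helps when the result fits in memory.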

Another thing: the reason I want to use big.matrix is, of course, that
my data set is too big to store in a regular matrix. However, it seems
that no matter how you run foreach, it always returns something (a
list, a vector, or ...), and that result will end up being as large as
the big.matrix itself. If the returned object can't be a big.matrix,
I'm bound to run out of memory anyway.

> should work, essentially applying the function f() to the rows of x?  But
> perhaps I misunderstand you.  Please feel free to email me or Mike
> (michael.kane at yale.edu) directly with questions about bigmemory, we are very
> interested in applications of it to real problems.

My actual problem is the following: I have a big data set of
observations, and I have a distance measure on this set. I would like
to calculate all pairwise distances and store them in a big.matrix. My
hope was to be able to build the matrix row by row in a parallel way.
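
A minimal sketch of one way this might be done, assuming a recent
bigmemory where describe() and attach.big.matrix() let worker processes
on the same machine open one shared big.matrix. Each worker writes its
row in place and returns NULL, so foreach hands back only a short list
of NULLs rather than the distances. The toy data obs, the Euclidean
dist_fun, the doParallel backend, and the two-worker cluster are all
hypothetical stand-ins.

```r
library(bigmemory)
library(foreach)
library(doParallel)  # assumption: any registered foreach backend would do

cl <- makeCluster(2)
registerDoParallel(cl)

# Hypothetical toy data: 20 observations in 3 dimensions.
obs <- matrix(rnorm(60), nrow = 20)
n <- nrow(obs)

# Hypothetical distance measure (Euclidean); substitute the real one.
dist_fun <- function(a, b) sqrt(sum((a - b)^2))

# Shared-memory big.matrix to hold all pairwise distances.
d <- big.matrix(n, n, type = "double", init = 0)

# The descriptor is a small object that workers use to find the matrix.
desc <- describe(d)

res <- foreach(i = 1:n, .packages = "bigmemory") %dopar% {
  dw <- attach.big.matrix(desc)  # open the shared matrix in this worker
  dw[i, ] <- vapply(seq_len(n),
                    function(j) dist_fun(obs[i, ], obs[j, ]),
                    numeric(1))
  NULL                           # nothing large is sent back
}

stopCluster(cl)

d[1:3, 1:3]
```

For a matrix that does not fit in RAM, big.matrix() also accepts a
backingfile argument to create a file-backed matrix; whether that
performs well enough here would need testing.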

Best,
Michael

-- 
Michael Knudsen
micknudsen at gmail.com
http://lifeofknudsen.blogspot.com/



