[R-sig-hpc] Using snow on a looping structure

Gang Chen gangchen6 at gmail.com
Thu Dec 4 23:23:54 CET 2008


Hi Martin,

Your suggestion really helps! It's exactly what I wanted. I really
appreciate it...

Regarding the array-munging part, the following will do:

b <- aperm(b, c(2,3,4,1))
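
As a quick serial check of that permutation, here is a minimal sketch. The toy dimensions and the use of range() in place of runAna() are assumptions for illustration only; no cluster is needed to see the effect.

```r
# Toy data (assumed sizes): 2 x 3 x 4 voxels with 5 observations each.
rData <- array(rnorm(2 * 3 * 4 * 5), dim = c(2, 3, 4, 5))

# Reference result from the original triple loop; range() stands in for runAna().
rStat <- array(0, dim = c(2, 3, 4, 2))
for (i in 1:2) for (j in 1:3) for (k in 1:4)
  rStat[i, j, k, ] <- range(rData[i, j, k, ])

# apply() puts the result dimension first, so dim(b) is 2 2 3 4 here.
b <- apply(rData, c(1, 2, 3), range)

# Move the first dimension to the end: new dims are old dims 2, 3, 4, 1.
b <- aperm(b, c(2, 3, 4, 1))

# The permuted array matches the loop result in shape and values.
stopifnot(identical(dim(b), dim(rStat)), isTRUE(all.equal(b, rStat)))
```

The same aperm() call should apply unchanged to the output of parApply(), since it returns the same layout as apply().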

I have a couple of related issues now:

(1) When running the following I get two warnings on my Mac OS X
10.4.11 (one from each processor, I guess):

> cl <- makeCluster(2, type = "SOCK")
WARNING: ignoring environment value of R_HOME
WARNING: ignoring environment value of R_HOME

Why does this warning appear, and how can I correct it?

(2) Previously I could follow the progress of the job by putting
the following

print(format(Sys.time(), "%D %H:%M:%OS3"))

inside the outermost for loop (indexed by ii), but now with parallel
computing I can't find a similar way to trace the progress. Do you
or anybody else know how to do that?
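
One possible approach, sketched here only as an assumption and not as an established snow recipe, would be to keep the outermost loop on the master and hand each slice to parApply(), so the timestamp can still be printed after every slice. The toy data and the use of range() in place of runAna() are stand-ins for illustration.

```r
library(snow)

cl <- makeCluster(2, type = "SOCK")

# Toy stand-ins (assumptions): small random data, range() in place of runAna().
rData <- array(rnorm(4 * 3 * 2 * 5), dim = c(4, 3, 2, 5))
nStat <- 2
rStat <- array(0, dim = c(dim(rData)[1:3], nStat))

for (ii in seq_len(dim(rData)[1])) {
  # Slice ii as an explicit 3-d array with dims (dimy, dimz, nObs).
  slice <- array(rData[ii, , , ], dim = dim(rData)[2:4])
  # Workers process the slice in parallel; result dims are (nStat, dimy, dimz).
  res <- parApply(cl, slice, c(1, 2), range)
  # Move the result dimension to the end before storing it.
  rStat[ii, , , ] <- aperm(res, c(2, 3, 1))
  # The master prints a timestamp once each slice has finished.
  print(format(Sys.time(), "%D %H:%M:%OS3"))
}

stopCluster(cl)
```

The cost is one synchronization with the workers per slice, which is probably acceptable only if each slice takes much longer than the communication overhead.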

Thanks again,
Gang


On Wed, Dec 3, 2008 at 12:05 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> Hi Gang --
>
> "Gang Chen" <gangchen6 at gmail.com> writes:
>
>> I'm a newbie at parallel computing, so sorry for this simple question.
>>
>> My original code without parallel computing is like this:
>>
>> runAna <- function(myData, Model, ...) {
>>       # myStat: a vector with a length of nStat
>>       myStat <- wFun(myData, Model, ...)
>>       return(myStat)
>> }
>>
>> rStat <- array(0, dim=c(dimx, dimy, dimz, nStat))
>> for (i in 1:dimx)
>> for (j in 1:dimy)
>> for (k in 1:dimz)
>>      # each analysis is on the 4th dimension, and returns nStat numbers
>>      # which are stored in the 4th dimension of rStat
>>      rStat[i, j, k,] <- runAna(rData[i, j, k,], Model, ...)
>
> I think what you want is along the lines of
>
>> a <- array(1:(2*3*4*5), c(2,3,4,5))
>> b <- apply(a, c(1,2,3), range)
>
> and then as you guessed
>
>> library(snow)
>> cl <- makeCluster(nNodes, type="SOCK")
>> d <- parApply(cl, a, c(1,2,3), range)
>> identical(b, d)
> [1] TRUE
>
> so for your example, I'd guess
>
> runStat <- parApply(cl, rData, c(1,2,3), runAna, Model=Model)
>
> This is not quite what you want -- the 'result' dimension is the first
> rather than last
>
>> dim(a)
> [1] 2 3 4 5
>> dim(b)
> [1] 2 2 3 4
>
> array-munging is not a speciality of mine, but a simple work-around is
> to reorder the dimensions of the original array, so you're applying
> to, and writing in, the slice indexed by the first entry
>
>> a <- array(1:(5*2*3*4), c(5,2,3,4))
>> b <- apply(a, c(2,3,4), range)
>> dim(a)
> [1] 5 2 3 4
>> dim(b)
> [1] 2 2 3 4
>> d <- parApply(cl, a, c(2,3,4), range)
>
> Hope that helps,
>
> Martin
>
>> I'm trying to run the above analysis using snow on a machine with two
>> processors, but could not figure out how to correctly set it up:
>>
>> nNodes <- 2
>> library(snow)
>> cl <- makeCluster(nNodes, type = "SOCK")
>>
>> I thought I would use parApply, but how should I combine the looping
>> with parApply? Or no looping at all with something like parApply(cl,
>> rData, c(1,2,3), ...)?
>>
>> Thanks in advance,
>> Gang
>>
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>
> --
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M2 B169
> Phone: (206) 667-2793


