[R-sig-hpc] The foreach, iterators and doMC packages

Steve Weston steve at revolution-computing.com
Fri Jul 17 17:21:19 CEST 2009


Mark,

I've been playing with your example a bit more, and I thought I
should mention that splitting up matrices (or data frames) the way
you're doing can be tricky.  Most of the difficulty is in handling
special cases properly, like when the number of rows is less than
the number of cores: things that don't happen often, but are
confusing when they do.

My plan has been to provide tools in the iterators package to
help with those kinds of tasks.  There isn't much there yet, but
the idiv function can already be used to build a different solution
to your problem.  Here I use idiv to implement a function that
returns a "block row" iterator, which works on both matrices and
data frames:

iblkrow <- function(a, ...) {
  i <- 1                     # index of the first row of the next block
  it <- idiv(nrow(a), ...)   # yields the number of rows in each block

  nextEl <- function() {
    n <- nextElem(it)        # size of the next block
    r <- seq(i, length=n)    # row indices for this block
    i <<- i + n
    a[r,, drop=FALSE]        # drop=FALSE so a one-row block stays 2-d
  }

  obj <- list(nextElem=nextEl)
  class(obj) <- c('abstractiter', 'iter')
  obj
}
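
As an aside, this takes care of the awkward case I mentioned above:
when there are fewer rows than chunks, idiv (assuming I have its
edge-case behavior right) simply yields fewer, smaller pieces rather
than empty ones, so foreach just runs fewer tasks:

it <- idiv(3, chunks=5)  # only 3 "rows" to split 5 ways
nextElem(it)             # 1
nextElem(it)             # 1
nextElem(it)             # 1 -- and then the iterator is exhausted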

The iblkrow iterator can be used with foreach for your example as follows:

foreach(x=iblkrow(x.mat, chunks=Ncore), y=iblkrow(y.mat, chunks=Ncore),
        .combine='rbind') %dopar%
  do.par.test.called.func(x, y)

This creates two iterators, one for each matrix, and completely handles
the indexing in the iterator itself.  This approach is also more efficient
if you're using a distributed memory parallel backend, because less
data is being sent over the network, but that isn't an issue when using
doMC.
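
If you want to convince yourself the iterator behaves before going
parallel, you can also drive it by hand at the prompt (the little
matrix here is just for illustration):

library(iterators)
m <- matrix(1:20, nrow=10, ncol=2)
it <- iblkrow(m, chunks=3)
nextElem(it)   # the first block of rows
nextElem(it)   # the next block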

Also note that idiv takes either a "chunks" or a "chunkSize"
argument; iblkrow forwards its extra arguments to idiv, so you can
specify either the number of pieces to split the matrix into (using
chunks, as in this example), or the number of rows per task (using
chunkSize, which is sometimes useful).
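
For example, to hand each task 100 rows at a time instead (just the
same call again, with chunkSize in place of chunks):

foreach(x=iblkrow(x.mat, chunkSize=100), y=iblkrow(y.mat, chunkSize=100),
        .combine='rbind') %dopar%
  do.par.test.called.func(x, y)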

I hope this doesn't seem overwhelming.  You don't have to use
iterators with foreach, but they are powerful and can be very helpful
at times.  You might also want to take a look at David Smith's blog;
today's post is on iterators.

- Steve


On Thu, Jul 16, 2009 at 1:34 PM, Mark Kimpel<mwkimpel at gmail.com> wrote:
> Steve,
>
> I used a non-trivial example and got a nice speed boost, so thanks for
> that advice.
>
> Now I need advice on a real-world application. I am working with a
> genomic data set that involves a lot of calculations on a data frame,
> as well as parallel subsetting of an annotation data frame. The
> data frames initially have about 30k rows and up to 100 columns. If,
> for example, I have 5 cores to work with, it seems to me that the most
> efficient way, rather than making repeated calls to the cores, would
> be to parcel things out by the number of rows divided by the number of
> cores. That would mean sending data frames, rather than vectors, to
> the functions that %dopar% calls. Below is a self-contained example.
> It doesn't work because the row indexing is off: the first chunk
> starts at row N instead of row 1, and the last chunk runs past the
> end of the matrix.
>
> I'm new to the parallel processing world, so let me know if I'm not
> taking the right approach, or if another package might work better.
>
> Thanks,
> Mark
>
> ###########################################################
> require(foreach)
> require(multicore)
> require(doMC)
>
> Ncore <- 5
> registerDoMC(cores=Ncore)
> Ncol <- 300
> Nrow <- 500
>
> do.par.test.called.func <- function(Mat1, Mat2){
>  # element-wise loops; equivalent to the vectorized sqrt(abs(Mat1 + Mat2))
>  Mat3 <- Mat1 + Mat2
>  for(i in 1:nrow(Mat3)){
>    for (j in 1:ncol(Mat3)){
>      Mat3[i,j] <- sqrt(abs(Mat3[i,j]))
>    }
>  }
>  Mat3
> }
>
> do.par.test.func <- function(Nrow, Ncol, Ncore){
>  x.mat <- matrix(rnorm(Ncol * Nrow), Nrow, Ncol)
>  y.mat <- matrix(rnorm(Ncol * Nrow), Nrow, Ncol)
>  N <- ceiling(Nrow/Ncore)
>  b <- foreach(i = 1:Ncore) %dopar%
>    do.par.test.called.func(x.mat[(i * N):(((i + 1) * N) - 1), ],
>                            y.mat[(i * N):(((i + 1) * N) - 1), ])
>  b
> }
> #
> do.par.out <- do.par.test.func(Nrow, Ncol, Ncore)
> ############################################
>
> ------------------------------------------------------------
> Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
>
> 15032 Hunter Court, Westfield, IN  46074
>
> (317) 490-5129 Work, & Mobile & VoiceMail
>
> "The real problem is not whether machines think but whether men do."
> -- B. F. Skinner
> ******************************************************************
>
>
>
> On Wed, Jul 15, 2009 at 1:50 PM, Steve
> Weston<steve at revolution-computing.com> wrote:
>> And you can get much better results using sqrt directly on the vector,
>> as it is intended to be called.  The sqrt function is so fast that it's
>> crazy to call it in parallel the way my vignette shows.  The
>> overhead kills you.  But I have this bad habit of using sqrt in my
>> parallel programming examples.  I actually put a footnote in the
>> vignette saying you should never do what I just did, but I can see
>> now that I wasn't nearly explicit enough.  I should probably include
>> a big disclaimer saying not to try this at home.
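>>
>> To make that concrete, here's a rough sketch (the chunking below is
>> just one way to do it, not what the vignette shows): call sqrt once
>> on the whole vector, or at least give each parallel task a big chunk
>> instead of a single element.
>>
>> a <- sqrt(1:30000)                                 # vectorized: one fast call
>> ix <- split(1:30000, cut(1:30000, 5, labels=FALSE))
>> b <- foreach(i = ix, .combine='c') %dopar% sqrt(i) # 5 big tasks, not 30000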
>>
>> I'm pretty sure that the next version of foreach will issue warnings
>> if the tasks execute in less than a second.  But I'll definitely fix
>> the vignette.  Sorry about that.
>>
>> - Steve
>>
>>
>>
>> On Wed, Jul 15, 2009 at 12:54 PM, Mark Kimpel<mwkimpel at gmail.com> wrote:
>>> I'm having trouble getting these packages to work as I believe they
>>> should. I have a new Debian Lenny box running on an Intel i7 with 12
>>> GB of memory, and I wrote a test script to see how much performance
>>> increase I could achieve with foreach. For this purpose, I used code
>>> extracted from the vignette. Below is the code, the system.time
>>> output, and sessionInfo(). I should add that I've run this multiple
>>> times and always gotten similar results. The numbers below were
>>> obtained after running init 1 to drop to a plain terminal and avoid
>>> the GUI.
>>>
>>> As you can see, the results seem to be the opposite of what one would
>>> expect. The fastest time, by two orders of magnitude, is achieved by a
>>> simple for loop, and %do% slightly outperforms %dopar%.
>>>
>>> How can this be explained?
>>> Mark
>>> ####################################
>>> require(foreach)
>>> require(multicore)
>>> require(doMC)
>>> registerDoMC(cores=5)
>>> z <- 30000
>>>
>>>
>>> for.each.do.time <- system.time(
>>>            a <- foreach(i = 1:z, .combine = "c") %do% sqrt(i)
>>>            )
>>> #
>>> for.each.do.par.time <- system.time(
>>>            b <- foreach(i = 1:z, .combine = "c") %dopar% sqrt(i)
>>>            )
>>>
>>> #
>>> c <- rep(0,z)
>>> loop.time <- system.time(
>>>            for (i in 1:length(c))
>>>            c[i] <- sqrt(i)
>>>            )
>>> #
>>> out <- rbind(unclass(for.each.do.time),
>>>              unclass(for.each.do.par.time),
>>>              unclass(loop.time))
>>> out
>>> "user.self"     "sys.self"      "elapsed"       "user.child"    "sys.child"
>>> 25.713  0       25.712  0       0
>>> 25.918  0.016   26.015  0.192   0.192
>>> 0.207999999999998       0       0.206000000000003       0       0
>>> #
>>> "R.version.platform" "x86_64-unknown-linux-gnu"
>>> "R.version.arch" "x86_64"
>>> "R.version.os" "linux-gnu"
>>> "R.version.system" "x86_64, linux-gnu"
>>> "R.version.status" ""
>>> "R.version.major" "2"
>>> "R.version.minor" "9.1"
>>> "R.version.year" "2009"
>>> "R.version.month" "06"
>>> "R.version.day" "26"
>>> "R.version.svn rev" "48839"
>>> "R.version.language" "R"
>>> "R.version.version.string" "R version 2.9.1 (2009-06-26)"
>>> "locale" "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"
>>> "basePkgs1" "stats"
>>> "basePkgs2" "graphics"
>>> "basePkgs3" "grDevices"
>>> "basePkgs4" "datasets"
>>> "basePkgs5" "utils"
>>> "basePkgs6" "methods"
>>> "basePkgs7" "base"
>>>
>>>
>>>
>>> ------------------------------------------------------------
>>> Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
>>> Indiana University School of Medicine
>>>
>>> 15032 Hunter Court, Westfield, IN  46074
>>>
>>> (317) 490-5129 Work, & Mobile & VoiceMail
>>>
>>> "The real problem is not whether machines think but whether men do."
>>> -- B. F. Skinner
>>> ******************************************************************
>>>
>>>
>>>
>>> On Wed, Jul 1, 2009 at 10:56 PM, Steve
>>> Weston<steve at revolution-computing.com> wrote:
>>>> This is to announce three new packages that I've recently uploaded
>>>> to CRAN: foreach, iterators, and doMC.  You can read a description
>>>> on David Smith's blog, at:
>>>>
>>>>    http://blog.revolution-computing.com
>>>> or:
>>>>    http://bit.ly/tygLz
>>>>
>>>> You can also read the vignette that I wrote for foreach on the
>>>> package's CRAN page.
>>>>
>>>> I would like to mention that one of the goals of the foreach package is
>>>> to make it easy to write an R package that allows the end user to choose
>>>> what parallel computing engine to use.  That's useful because the user
>>>> may already have and use a parallel computing system, and not want to
>>>> install and maintain yet another one.  (The foreach and iterators packages
>>>> themselves are trivial to install, since they provide a framework for using
>>>> parallel computing systems such as multicore and nws.)
>>>>
>>>> Currently, only the doMC "parallel backend" for multicore is publicly
>>>> available on CRAN, but I'm hoping to get a chance to write and release
>>>> other backend packages, to support MPI and shared memory, for example.
>>>>
>>>> I'd love to hear from R package authors about how to improve foreach
>>>> so that it's a more attractive platform on which to develop parallel
>>>> applications, especially if you've had difficulty using parallel
>>>> computing systems in the past.
>>>>
>>>> --
>>>> Steve Weston
>>>> REvolution Computing
>>>> One Century Tower | 265 Church Street, Suite 1006
>>>> New Haven, CT  06510
>>>> P: 203-777-7442 x266 | www.revolution-computing.com
>>>>
>>>> _______________________________________________
>>>> R-sig-hpc mailing list
>>>> R-sig-hpc at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>>>>
>>>
>>
>>
>>
>> --
>> Steve Weston
>> REvolution Computing
>> One Century Tower | 265 Church Street, Suite 1006
>> New Haven, CT  06510
>> P: 203-777-7442 x266 | www.revolution-computing.com
>>
>



-- 
Steve Weston
REvolution Computing
One Century Tower | 265 Church Street, Suite 1006
New Haven, CT  06510
P: 203-777-7442 x266 | www.revolution-computing.com


