[R-sig-hpc] Parallel linear model

Dirk Eddelbuettel edd at debian.org
Thu Aug 23 15:14:50 CEST 2012


On 22 August 2012 at 23:22, Norm Matloff wrote:
| 
| In rereading your posting now, Dirk, I suddenly realized that there is
| one aspect of this that I'd forgotten about:  An ordinary call to
| system.time() does not display all the information returned by that
| function!
| 
| That odd statement is of course due to the fact that the print
| method for objects of class proc_time displays only 3 of the 5 numbers.
| If you actually look at the 5 numbers individually, you can separate
| the time of the parent process from the sum of the child times.  That
| separation is apparently what rbenchmark gives you, right?
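
For what it's worth, here is a minimal sketch of pulling the five numbers
apart. It assumes a Unix-alike, since mclapply() forks and child times are
not reported on Windows; the variable names are made up for illustration.

```r
library(parallel)

# Time a call whose work happens in forked children (they only sleep,
# so CPU use is close to zero, as in the Sys.sleep() example).
pt <- system.time(mclapply(1:4, function(x) Sys.sleep(0.2), mc.cores = 2))

print(pt)             # print method: three numbers, child times folded in
all5 <- unclass(pt)   # plain named numeric vector with all five entries
print(all5)

# Separate the parent's CPU time from the summed CPU time of the children.
parent_cpu   <- all5[["user.self"]]  + all5[["sys.self"]]
children_cpu <- all5[["user.child"]] + all5[["sys.child"]]
```

The printed "user" and "system" figures are the self and child entries
added together, which is why the split is invisible unless you unclass().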
| 
| As I said earlier, the quick-and-dirty way to handle this is to use the
| Elapsed time, typically good enough (say on a dedicated machine).  After
| all, if we are trying to develop a fast parallel algorithm, what the
| potential users of the algorithm care about is essentially the Elapsed
| time.

That seems fair in most cases.

| But at the other extreme, a very fine timing goal might be to try to
| compute what is called the makespan, which in this case would be the
| maximum of all the child times, rather than the sum of the child times.
| I say "try," because I don't see any systems way to accomplish this,
| short of inserting calls to something like clock_gettime() inside each
| thread.

Maybe you could look at what microbenchmark does [ as it covers all the
OS-level dirty work ] and see if it generalizes to multiple machines?
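
Short of that, a rough sketch of the "time it inside each worker" idea,
done at the R level rather than with clock_gettime(). The timed() wrapper
and all variable names are hypothetical, it measures wall-clock time per
task (not CPU time), and it assumes a Unix-alike since mclapply() forks:

```r
library(parallel)

# Hypothetical helper: wrap a task so it reports its own wall-clock window
# and the pid of the child process that ran it.
timed <- function(f) function(x) {
  t0 <- Sys.time()
  value <- f(x)
  list(value = value, pid = Sys.getpid(), start = t0, end = Sys.time())
}

# Four tasks of unequal length, one per forked child.
res <- mclapply(1:4, timed(function(x) Sys.sleep(0.1 * x)), mc.cores = 4)

# Wall-clock seconds per task, and the child each task ran in.
dur <- vapply(res, function(r) as.numeric(difftime(r$end, r$start, units = "secs")),
              numeric(1))
pid <- vapply(res, function(r) r$pid, integer(1))

per_child <- tapply(dur, pid, sum)  # busy time of each child
makespan  <- max(per_child)         # the slowest child
total     <- sum(per_child)         # what a "sum of child times" would report
```

Here the makespan should come out near 0.4 seconds and the total near one
second, which is exactly the max-versus-sum distinction above.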

Dirk
 
| Norm
| 
| On Wed, Aug 22, 2012 at 07:53:02PM -0500, Dirk Eddelbuettel wrote:
| > 
| > The difference between user and elapsed is old hat. Here is a great
| > example (and IIRC first shown here by Simon) with no compute time:
| > 
| >    R> system.time(mclapply(1:8, function(x) Sys.sleep(1)))   ## 2 cores by default
| >       user  system elapsed 
| >      0.000   0.012   4.014 
| >    R> system.time(mclapply(1:8, function(x) Sys.sleep(1), mc.cores=8))
| >       user  system elapsed 
| >      0.012   0.020   1.039 
| >    R> 
| > 
| > so elapsed time is effectively the one second a Sys.sleep(1) takes, plus
| > overhead, if we allow for all eight (hyperthreaded) cores here.  By Brian
| > Ripley's choice a default of two is baked-in, so clueless users only get a
| > small gain.  "user time" is roughly the CPU time actually consumed, _summed
| > over all processes / threads_.
| > 
| > With that, could I ask any of the participants in the thread to re-try with a
| > proper benchmarking package such as rbenchmark or microbenchmark?  Either one
| > beats the socks off system.time:
| > 
| >    R> library(rbenchmark)
| >    R> benchmark( mclapply(1:8, function(x) Sys.sleep(1)), mclapply(1:8, function(x) Sys.sleep(1), mc.cores=8), replications=1)
| >                                                       test replications elapsed relative user.self sys.self user.child sys.child
| >    1               mclapply(1:8, function(x) Sys.sleep(1))            1   4.013  3.89612     0.000    0.008      0.000     0.004
| >    2 mclapply(1:8, function(x) Sys.sleep(1), mc.cores = 8)            1   1.030  1.00000     0.004    0.008      0.004     0.000
| >    R> 
| > 
| > and
| > 
| >    R> library(microbenchmark)
| >    R> microbenchmark( mclapply(1:8, function(x) Sys.sleep(1)), mclapply(1:8, function(x) Sys.sleep(1), mc.cores=8), times=1)
| >    Unit: seconds
| >                                                       expr     min      lq  median      uq     max
| >    1               mclapply(1:8, function(x) Sys.sleep(1)) 4.01377 4.01377 4.01377 4.01377 4.01377
| >    2 mclapply(1:8, function(x) Sys.sleep(1), mc.cores = 8) 1.03457 1.03457 1.03457 1.03457 1.03457
| >    R> 
| > 
| > (and you normally want to run either with 10 or 100 or ... replications /
| > times).
| > 
| > Dirk
| > 
| > -- 
| > Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| > 
| > _______________________________________________
| > R-sig-hpc mailing list
| > R-sig-hpc at r-project.org
| > https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
| 

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com


