[R-sig-hpc] Parallel linear model

Norm Matloff matloff at cs.ucdavis.edu
Thu Aug 23 03:30:28 CEST 2012


Thanks for the correction.  I must say that I still have my nagging
doubts, though.  Is it true for older systems, say some Linuxes of 4-5
years ago, or BSD?

Norm

On Wed, Aug 22, 2012 at 09:06:45PM -0400, Simon Urbanek wrote:
> 
> On Aug 22, 2012, at 7:18 PM, Norm Matloff wrote:
> 
> > On Wed, Aug 22, 2012 at 06:03:36PM -0500, Paul Johnson wrote:
> > 
> >> This is a great example and I would like to use it in class.  But I
> >> think I don't understand the implications of the system.time output
> >> you get.  I have a question about this below. Would you share your
> >> thoughts?...
> > 
> > Paul is bringing up a very important point here.
> > 
> > There are various OS dependencies that can really change things.  A
> > notable example is that if one calls something like mclapply(), the time
> > actually spent by the child R processes probably will NOT be counted in
> > the User time.
> 
> That is actually wrong. It is true for snow where the processes are separate, but most systems do account for child user time in mclapply:
> 
> # Linux
> > system.time(mclapply(1:32, function(x) for(i in 1:1e6) x+x, mc.cores=32))
>    user  system elapsed 
>  27.330   1.468   0.944 
> > system.time((function(x) for(i in 1:1e6) x+x)(1))           
>    user  system elapsed 
>   0.736   0.000   0.734 
> 
> # OS X
> > system.time(mclapply(1:16, function(x) for(i in 1:1e6) x+x, mc.cores=16))
>    user  system elapsed 
>   9.386   0.357   0.876 
> > system.time((function(x) for(i in 1:1e6) x+x)(1))           
>    user  system elapsed 
>   0.425   0.004   0.428 
> 
> Cheers,
> Simon
> 
> 
> 
> >  The latter will likely just measure how much time the
> > parent process spends in parceling out the work to the children, and in
> > collecting together the results.
> > 
> > You have the same problem on a cluster, where the worker processes set
> > up by clusterApply() or whatever aren't counted.
> > 
> > You could on the other hand have the opposite problem in some OSes,
> > where one gets the SUM of the times of the children.
> > 
> > Using Elapsed time might be a little crude, but generally good enough.
> > 
> > Norm
> > 
> > _______________________________________________
> > R-sig-hpc mailing list
> > R-sig-hpc at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
> > 
> > 
>
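
One way to check how a particular system accounts for child time is to
look at the individual components of the value that system.time()
returns, rather than only at the three numbers it prints. The sketch
below is illustrative and was not part of the original exchange. It
assumes a Unix-like system where mclapply() can fork, and the helper
name burn and the core count of 2 are arbitrary choices. The returned
proc_time object holds user.self, sys.self, elapsed, user.child and
sys.child. The printed "user" figure is user.self plus user.child
whenever the child value is available.

```r
library(parallel)  # provides mclapply(); forking is unavailable on Windows

# Same toy workload as in the thread; 'burn' is a hypothetical helper name.
burn <- function(x) for (i in 1:1e6) x + x

pt <- system.time(mclapply(1:4, burn, mc.cores = 2))

# The five raw components, before print() collapses them to three.
print(unclass(pt))

# summary() folds the child times into "user" and "system", as print() does.
print(summary(pt))

# Share of reported user time that came from child processes
# (NA if the platform does not report child times).
child_share <- pt[["user.child"]] / (pt[["user.self"]] + pt[["user.child"]])
print(child_share)
```

If user.child comes back as zero or NA on an older system, the printed
user time reflects only the parent process, which is the situation
described earlier in the thread.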


