[Rd] parallel::mclapply() dummy function on Windows?
John Fox
jfox at mcmaster.ca
Sat Oct 8 16:41:33 CEST 2011
Dear Brian,
> -----Original Message-----
> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> Sent: October-08-11 9:57 AM
> To: John Fox
> Cc: 'Martin Morgan'; ttriche at usc.edu; 'r-devel'
> Subject: RE: [Rd] parallel::mclapply() dummy function on Windows?
>
> On Sat, 8 Oct 2011, John Fox wrote:
>
> > Dear Martin,
> >
> > I don't have an opinion about whether what Tim wants to do is a good
> > idea, but was responding to his comment that he would need
> > "parallel=FALSE flags all over the place." Why could he not simply
> > define
> >
> > mclapply <- if (.Platform$OS.type == "windows") base::lapply else
> > parallel::mclapply
> >
> > in his package?
>
> Because mclapply has additional arguments that would be passed by FUN to
> lapply as part of ... .
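The objection above can be seen in a minimal sketch using only base R. The helper names `show_extras` and `strict` are made up for illustration. Because `mc.cores` is not a formal argument of `lapply()`, it falls into `...` and is handed on to FUN:

```r
# Hypothetical helper: reports the names of whatever arrives through `...`.
show_extras <- function(x, ...) names(list(...))

# lapply() has no mc.cores argument, so it is forwarded to FUN.
forwarded <- base::lapply(1:2, show_extras, mc.cores = 2L)
print(forwarded[[1]])

# Hypothetical FUN without `...`: the forwarded argument triggers an error,
# which is captured here as a message instead of stopping the script.
strict <- function(x) x^2
res <- tryCatch(base::lapply(1:2, strict, mc.cores = 2L),
                error = function(e) conditionMessage(e))
print(res)
```

So a bare alias to `lapply` is only safe when callers never pass any of the `mc.*` arguments.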
I did think of that and took a look at mclapply() before I responded. All of
the additional arguments occur after ... and have defaults. I assumed from
the original posting that Tim Triche is using the defaults (otherwise I
don't think he would have made his original suggestion), but even if he is
not, he could define mclapply() in his package as something like
    mclapply <- if (.Platform$OS.type != "windows") {
        parallel::mclapply
    } else {
        # serial fallback: accept and ignore the mc.* arguments
        function(X, FUN, ..., mc.preschedule = TRUE, mc.set.seed = TRUE,
                 mc.silent = FALSE, mc.cores = getOption("mc.cores", 2L),
                 mc.cleanup = TRUE, mc.allow.recursive = TRUE)
            base::lapply(X, FUN, ...)
    }
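A quick check of a fallback of this shape, written so that it runs on any platform. The name `serial_mclapply` is hypothetical and chosen only to avoid masking `parallel::mclapply` in the example. Because the `mc.*` formals come after `...`, they are matched by exact name and never reach FUN:

```r
# Hypothetical serial stand-in with the mc.* arguments listed in the message.
serial_mclapply <- function(X, FUN, ..., mc.preschedule = TRUE,
                            mc.set.seed = TRUE, mc.silent = FALSE,
                            mc.cores = getOption("mc.cores", 2L),
                            mc.cleanup = TRUE, mc.allow.recursive = TRUE) {
  base::lapply(X, FUN, ...)
}

# mc.cores is swallowed by the wrapper; FUN sees only x.
squares <- serial_mclapply(1:3, function(x) x^2, mc.cores = 4L)

# Genuine extra arguments for FUN (here p) still pass through `...`.
cubes <- serial_mclapply(1:2, function(x, p) x^p, p = 3, mc.cores = 4L)
```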
As I said, I won't pretend that I know whether his general approach is
sound.
Best,
John
>
> We are contemplating having wrappers of mclapply and pvec on Windows
> equivalent to the behaviour with mc.cores = 1 on Unix. But that is
> nothing to do with the original specious claim to which I responded: if
> you want good parallel performance for most users you also need to
> support both parLapply and mclapply (or at least, parLapply with a fork
> cluster).
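One way to read "support both parLapply and mclapply" is a small dispatcher, sketched here under assumptions: the function name `par_lapply` and its `cores` parameter are invented for illustration, and the cluster is created and torn down on every call, which a real package would probably avoid.

```r
library(parallel)

# Hypothetical dispatcher: snow-style cluster on Windows, forking elsewhere.
par_lapply <- function(X, FUN, ..., cores = 2L) {
  if (.Platform$OS.type == "windows") {
    cl <- makeCluster(cores)              # socket cluster of fresh R workers
    on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
    # Workers do not share the caller's workspace, so FUN must be
    # self-contained or its dependencies must be exported first.
    parLapply(cl, X, FUN, ...)
  } else {
    mclapply(X, FUN, ..., mc.cores = cores)
  }
}

doubled <- par_lapply(1:4, function(x) x * 2)
```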
>
> I think the import issue is a red herring: these functions are not called
> often enough for parallel::mclapply to be inefficient. And really
> importFrom is only better practice for things that will always be used,
> since it moves the computation from as-needed to every time the package
> is loaded.
>
> >
> > Best,
> > John
> >
> >> -----Original Message-----
> >> From: r-devel-bounces at r-project.org
> >> [mailto:r-devel-bounces at r-project.org] On Behalf Of Martin Morgan
> >> Sent: October-08-11 8:16 AM
> >> To: John Fox
> >> Cc: ttriche at usc.edu; 'Prof Brian Ripley'; 'r-devel'
> >> Subject: Re: [Rd] parallel::mclapply() dummy function on Windows?
> >>
> >> On 10/07/2011 06:03 PM, John Fox wrote:
> >>> Dear Tim,
> >>>
> >>>> -----Original Message-----
> >>>> From: r-devel-bounces at r-project.org
> >>>> [mailto:r-devel-bounces at r-project.org] On Behalf Of Tim Triche, Jr.
> >>>> Sent: October-07-11 3:05 PM
> >>>> To: Prof Brian Ripley
> >>>> Cc: r-devel
> >>>> Subject: Re: [Rd] parallel::mclapply() dummy function on Windows?
> >>>>
> >>>> On Thu, Oct 6, 2011 at 11:25 PM, Prof Brian Ripley
> >>>> <ripley at stats.ox.ac.uk>wrote:
> >>>>
> >>>>>
> >>>>> Why would it make it easier? And how could using a dummy for
> >>>>> 'most users' (who are on Windows) offer them 'good parallel
> >>>>> support'?
> >>>>
> >>>>
> >>>> Good point. Most of my users are on Unix, because my use of
> >>>> mclapply() is primarily to expedite processing of raw scanner data.
> >>>> Only a handful of users for the packages that call mclapply() are
> >>>> on Windows. Right now, I default to having parallel=FALSE flags
> >>>> all over the place, but I'd prefer for the default to be "go as
> >>>> fast as practical in the common case", i.e., Unix.
> >>>> It would have been more accurate for me to say "I would like to
> >>>> parallelize by default, without having the methods fail on Windows
> >>>> in the default configuration" than to claim that I want "good
> >>>> parallel support" for Windows.
> >>>> When I have tried using the foreach/doMC combination in the past,
> >>>> it has not worked out satisfactorily, so I don't know how well I
> >>>> can support Windows users... period.
> >>>
> >>> Why don't you just apply the approach you initially suggested in
> >>> your own package, defining mclapply() the way you want it?
> >>
> >> Hi John et al.,
> >>
> >> Individual packages will become littered with ad hoc solutions,
> >> constructed without, for instance, the wisdom and experience of Prof.
> >> Ripley about platforms or environments in which it is appropriate to
> >> use mclapply. For instance, Tim's pseudo-code if (Windows) ...
> >> translated as if (.Platform$OS.type == "windows") doesn't sound like
> >> it's the correct test; at least
> >>
> >> exists("mclapply", getNamespace("parallel"))
> >>
> >> but probably more. Also, doesn't parallel's name space differ between
> >> platforms, requiring the package author to import(parallel) rather
> >> than the better practice of importFrom(parallel, mclapply) ?
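The capability test suggested above could be used to pick the function once, as in this sketch. The name `my_lapply` is hypothetical. Note the assumption: this only detects whether `parallel` defines `mclapply` at all, so on builds of R where the function exists on every platform the check is always TRUE and says nothing about whether forking is actually available.

```r
ns <- getNamespace("parallel")

# Hypothetical selector: use mclapply when the namespace provides it,
# otherwise fall back to serial lapply.
my_lapply <- if (exists("mclapply", envir = ns, inherits = FALSE)) {
  get("mclapply", envir = ns)
} else base::lapply

incremented <- my_lapply(1:2, function(x) x + 1L)
```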
> >>
> >> Martin
> >>
> >>>
> >>> I hope this helps,
> >>> John
> >>>
> >>>>
> >>>>> Take a look at e.g. package 'boot' to see how to offer alternatives.
> >>>>> (A version that uses 'parallel' is pending on CRAN, or see
> >>>>> http://www.stats.ox.ac.uk/pub/R/boot_1.3-3.tar.gz .) Package
> >>>>> 'parallel' may in future offer a higher-level abstraction layer
> >>>>> that offers such a choice, but as the 'boot' code shows, deciding
> >>>>> what to send to the workers in a snow-style cluster is not simple.
> >>>>>
> >>>>
> >>>> It seems similar to what I do (off topic: why do you use the file
> >>>> extension '.q' for all of the R/S code files?): pass flags around. I
> >>>> suppose I was just being lazy, but I would love to default to "go as
> >>>> fast as possible" without having Windows users get left out in the
> >>>> cold (unless they add flags to their function calls).
> >>>>
> >>>> Thank you for your suggestions, I will look into this further.
> >>>>
> >>>> --
> >>>> Tim Triche, Jr.
> >>>> USC Biostatistics
> >>>>
> >>>> [[alternative HTML version deleted]]
> >>>>
> >>>> ______________________________________________
> >>>> R-devel at r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >>
> >>
> >> --
> >> Computational Biology
> >> Fred Hutchinson Cancer Research Center
> >> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> >>
> >> Location: M1-B861
> >> Telephone: 206 667-2793
> >>
> >
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595