[Rd] parallel::mclapply() dummy function on Windows?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sat Oct 8 15:56:35 CEST 2011
On Sat, 8 Oct 2011, John Fox wrote:
> Dear Martin,
>
> I don't have an opinion about whether what Tim wants to do is a good idea,
> but was responding to his comment that he would need "parallel=FALSE flags
> all over the place." Why could he not simply define
>
> mclapply <- if (.Platform$OS.type == "windows") base::lapply else
> parallel::mclapply
>
> in his package?
Because mclapply has additional arguments that would be passed by lapply
to FUN as part of ... .
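To illustrate the point, here is a minimal sketch (not code from this thread; square, naive_mclapply and serial_mclapply are hypothetical names, and only two of mclapply's mc.* arguments are handled). It shows the plain alias failing once a caller supplies mc.cores, and a stand-in whose named formals absorb those arguments:

```r
square <- function(x) x^2

# The alias proposed above, forced to lapply() here so the effect shows on
# any platform.
naive_mclapply <- base::lapply

# lapply() has no mc.cores argument, so it hands it on to FUN through '...'.
naive_result <- tryCatch(
    naive_mclapply(1:3, square, mc.cores = 2),
    error = function(e) conditionMessage(e)
)
print(naive_result)   # an error message: square() does not accept mc.cores

# Hypothetical serial stand-in: the named mc.* formals catch those arguments
# so they never reach FUN.
serial_mclapply <- function(X, FUN, ..., mc.preschedule = TRUE, mc.cores = 1L) {
    lapply(X, FUN, ...)
}

serial_result <- serial_mclapply(1:3, square, mc.cores = 2)
```

The stand-in ignores mc.cores entirely, so it only keeps calls from failing; it does nothing in parallel.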
We are contemplating having wrappers of mclapply and pvec on Windows
equivalent to the behaviour with mc.cores = 1 on Unix. But that has
nothing to do with the original specious claim to which I responded: if
you want good parallel performance for most users you also need to
support both parLapply and mclapply (or at least, parLapply with a
fork cluster).
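As a rough sketch of what supporting both interfaces might look like (par_lapply and its cores argument are hypothetical, not part of 'parallel'; the platform test is the simple one discussed in this thread and may not be sufficient):

```r
library(parallel)

# Hypothetical helper: forked workers via mclapply() where available,
# a snow-style socket cluster via parLapply() on Windows.
par_lapply <- function(X, FUN, ..., cores = 2L) {
    if (cores <= 1L) {
        return(lapply(X, FUN, ...))            # serial fallback
    }
    if (.Platform$OS.type == "windows") {
        cl <- makeCluster(cores)               # default socket cluster
        on.exit(stopCluster(cl), add = TRUE)   # always shut the workers down
        parLapply(cl, X, FUN, ...)
    } else {
        mclapply(X, FUN, ..., mc.cores = cores)
    }
}

res <- par_lapply(1:4, function(x) x^2, cores = 2L)
```

With a socket cluster the workers start as fresh R processes, so any data or packages FUN relies on must be sent to them explicitly (for example with clusterExport()); forked workers inherit the parent session instead.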
I think the import issue is a red herring: these functions are not
called often enough for parallel::mclapply to be inefficient. And
really importFrom is only better practice for things that will always
be used, since it moves the computation from as-needed to every time
the package is loaded.
>
> Best,
> John
>
>> -----Original Message-----
>> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org]
>> On Behalf Of Martin Morgan
>> Sent: October-08-11 8:16 AM
>> To: John Fox
>> Cc: ttriche at usc.edu; 'Prof Brian Ripley'; 'r-devel'
>> Subject: Re: [Rd] parallel::mclapply() dummy function on Windows?
>>
>> On 10/07/2011 06:03 PM, John Fox wrote:
>>> Dear Tim,
>>>
>>>> -----Original Message-----
>>>> From: r-devel-bounces at r-project.org
>>>> [mailto:r-devel-bounces at r-project.org] On Behalf Of Tim Triche, Jr.
>>>> Sent: October-07-11 3:05 PM
>>>> To: Prof Brian Ripley
>>>> Cc: r-devel
>>>> Subject: Re: [Rd] parallel::mclapply() dummy function on Windows?
>>>>
>>>> On Thu, Oct 6, 2011 at 11:25 PM, Prof Brian Ripley
>>>> <ripley at stats.ox.ac.uk> wrote:
>>>>
>>>>>
>>>>> Why would it make it easier? And how could using a dummy for 'most
>>>>> users' (who are on Windows) offer them 'good parallel support'?
>>>>
>>>>
>>>> Good point. Most of my users are on unix, because my use of
>>>> mclapply() is primarily to expedite processing of raw scanner data.
>>>> Only a handful of users for the packages that call mclapply() are on
>>>> Windows. Right now, I default to having parallel=FALSE flags all
>>>> over the place, but I'd prefer for the default to be "go as fast as
>>>> practical in the common case", i.e., Unix. It would have been more
>>>> accurate for me to say "I would like to parallelize by default,
>>>> without having the methods fail on Windows in the default
>>>> configuration" than to claim that I want "good parallel support" for
>>>> Windows. When I have tried using the foreach/doMC combination in the
>>>> past, it has not worked out satisfactorily, so I don't know how well
>>>> I can support Windows users... period.
>>>
>>> Why don't you just apply the approach you initially suggested in your
>>> own package, defining mclapply() the way you want it?
>>
>> Hi John et al.,
>>
>> Individual packages will become littered with ad hoc solutions,
>> constructed without, for instance, the wisdom and experience of Prof.
>> Ripley about platforms or environments in which it is appropriate to use
>> mclapply. For instance, Tim's pseudo-code if (Windows) ... translated as
>> if (.Platform$OS.type == "windows") doesn't sound like it's the correct
>> test; at least
>>
>> exists("mclapply", getNamespace("parallel"))
>>
>> but probably more. Also, doesn't parallel's name space differ between
>> platforms, requiring the package author to import(parallel) rather than
>> the better practice of importFrom(parallel, mclapply)?
>>
>> Martin
>>
>>>
>>> I hope this helps,
>>> John
>>>
>>>>
>>>>> Take a look at e.g. package 'boot' to see how to offer alternatives. (A
>>>>> version that uses 'parallel' is pending on CRAN, or see
>>>>> http://www.stats.ox.ac.uk/pub/R/boot_1.3-3.tar.gz.) Package 'parallel'
>>>>> may in future offer a higher-level abstraction layer that offers such a
>>>>> choice, but as the 'boot' code shows, deciding what to send to the
>>>>> workers in a snow-style cluster is not simple.
>>>>>
>>>>
>>>> It seems similar to what I do (off topic: why do you use the file
>>>> extension '.q' for all of the R/S code files?): pass flags around. I
>>>> suppose I was just being lazy, but I would love to default to "go as
>>>> fast as possible" without having Windows users get left out in the
>>>> cold (unless they add flags to their function calls).
>>>>
>>>> Thank you for your suggestions, I will look into this further.
>>>>
>>>> --
>>>> Tim Triche, Jr.
>>>> USC Biostatistics
>>>>
>>>>
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>>
>> --
>> Computational Biology
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>
>> Location: M1-B861
>> Telephone: 206 667-2793
>>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595