[R] Problem parallelizing across cores
James Spottiswoode
james using jsasoc.com
Thu Aug 29 01:55:14 CEST 2019
> On Aug 28, 2019, at 4:44 PM, James Spottiswoode <james using jsasoc.com> wrote:
>
> Hi Bert,
>
> Thanks for your advice. Actually, I’ve already done this and have looked at the doParallel and future packages. The trouble with doParallel is that it forks R processes, which spend a lot of time loading data and packages, whereas my function runs in 100 ms, so the parallelization doesn’t help. The future package keeps its children running, but I haven’t figured out how to get it to work in my application.
>
> Best — James
>
>
>> On Aug 28, 2019, at 3:39 PM, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>>
>>
>> I would suggest that you search on "parallel computing" at the Rseek.org site <http://rseek.org/>. This brought up what seemed to be many relevant hits including, of course, the High-Performance and Parallel Computing CRAN Task View.
>>
>> Cheers,
>> Bert
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Wed, Aug 28, 2019 at 3:18 PM James Spottiswoode <james.spottiswoode using gmail.com> wrote:
>> Hi All,
>>
>> I have a piece of well-optimized R code for doing text analysis running
>> under Linux on an AWS instance. The code first loads a number of packages
>> and some needed data and the actual analysis is done by a function called,
>> say, f(string). I would like to parallelize calling this function across
>> the 8 cores of the instance to increase throughput. I have looked at the
>> packages doParallel and future but am not clear how to do this. Any method
>> that brings up an R instance when the function is called will not work for
>> me as the time to load the packages and data is comparable to the execution
>> time of the function, leading to no speedup. Therefore I need to keep a
>> number of instances of the R code running continuously so that the data
>> loading only occurs once when the R processes are first started and
>> thereafter the function f(string) is ready to run in each instance. I hope
>> I have put this clearly.
>>
>> I’d much appreciate any suggestions. Thanks in advance,
>>
>> James Spottiswoode
>>
>
> James Spottiswoode
> Applied Mathematics & Statistics
> (310) 270 6220
> jamesspottiswoode Skype
> james using jsasoc.com
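
The requirement in the original question quoted above (start the R processes once, load packages and data once, then call f(string) repeatedly) can be sketched with a persistent socket cluster from the base "parallel" package. This is only a minimal sketch under stated assumptions, not a tested solution for the real application: the toy f(), the `lookup` data, and the worker count are hypothetical placeholders standing in for the real packages, data, and analysis function.

```r
library(parallel)

# Assumption: 2 workers for illustration; on the 8-core instance this might be 8.
n_workers <- 2

# makeCluster() starts worker R processes that stay alive until stopCluster().
cl <- makeCluster(n_workers)

# One-time setup, run once on every worker. In the real application this is
# where the library() calls and the data loading would go.
clusterEvalQ(cl, {
  lookup <- c(good = 1, bad = -1)            # hypothetical placeholder data
  f <- function(string) {                    # hypothetical stand-in for f(string)
    words <- strsplit(tolower(string), "[^a-z]+")[[1]]
    sum(lookup[words], na.rm = TRUE)
  }
  NULL                                       # avoid shipping objects back
})

# Each call reuses the already-initialised workers; f and lookup are looked up
# in each worker's global environment, so nothing is reloaded.
strings <- c("Good good bad", "bad", "nothing here")
scores <- unlist(parLapply(cl, strings, function(s) f(s)))

# A later batch goes to the same workers without repeating the setup.
more_scores <- unlist(parLapply(cl, c("good", "bad bad"), function(s) f(s)))

# Shut the workers down only when the whole job is finished.
stopCluster(cl)
```

Because parLapply() splits its input into one chunk per worker, the communication overhead is paid per batch rather than per string, which matters when each call to f() takes only about 100 ms. Passing many strings per call should therefore amortise the overhead better than dispatching them one at a time.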
James Spottiswoode
Applied Mathematics & Statistics
(310) 270 6220
jamesspottiswoode Skype
james using jsasoc.com