[R] project parallel help
Jeffrey Flint
jeffrey.flint at gmail.com
Tue Oct 15 20:42:57 CEST 2013
How can I copy distinct blocks of data to each process?
On Mon, Oct 14, 2013 at 10:21 PM, Jeff Newmiller
<jdnewmil at dcn.davis.ca.us> wrote:
> The session info is helpful. To the best of my knowledge there is no easy way to share memory between R processes other than forking. You can use clusterExport to make "global" copies of large data structures in each process and then pass index values to your function; that reduces per-call copy costs, at the price of each process holding extra data it may never use. Alternatively, you can copy distinct blocks of data to each process and loop over each block serially within the worker, which reduces the number of calls to the workers. I don't claim to be an expert with the parallel package, though, so others may have better advice. Note that with two cores I don't usually see better than a 30% speedup; the best payoff comes with four or more workers.
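The distinct-blocks approach described above can be sketched as follows. This is a minimal illustration, not code from the thread; the matrix contents, dimensions, and worker count are made up:

```r
library(parallel)

cl <- makeCluster(2)                        # two PSOCK workers
m  <- matrix(rnorm(1000 * 4), nrow = 1000)  # example data

# Split the row indices into one contiguous block per worker, then send
# each worker only its own block; the worker loops over its rows serially.
chunks <- lapply(clusterSplit(cl, seq_len(nrow(m))),
                 function(idx) m[idx, , drop = FALSE])
res <- unlist(parLapply(cl, chunks,
                        function(chunk) apply(chunk, 1, sum)))

stopCluster(cl)
```

Each worker receives its block in a single call, so the serialization cost is paid once per worker rather than once per row.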
> ---------------------------------------------------------------------------
> Jeff Newmiller The ..... ..... Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
> Live: OO#.. Dead: OO#.. Playing
> Research Engineer (Solar/Batteries O.O#. #.O#. with
> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Jeffrey Flint <jeffrey.flint at gmail.com> wrote:
>>Jeff:
>>
>>Thank you for your response. Please let me know how I can
>>"unhandicap" my question. I tried my best to be concise. Maybe this
>>will help:
>>
>>> version
>>               _
>>platform       i386-w64-mingw32
>>arch           i386
>>os             mingw32
>>system         i386, mingw32
>>status
>>major          3
>>minor          0.2
>>year           2013
>>month          09
>>day            25
>>svn rev        63987
>>language       R
>>version.string R version 3.0.2 (2013-09-25)
>>nickname       Frisbee Sailing
>>
>>
>>I understand your comment about forking. You are right that forking
>>is not available on Windows.
>>
>>What I am curious about is whether I can structure calls to the
>>parallel package's functions to reduce this overhead. My guess is
>>that both the function to be executed and the data it operates on
>>are copied to the workers at each iteration. Are there idioms in
>>the parallel package that avoid these repeated copies? For
>>instance, I could use clusterExport to establish the function to be
>>called once, up front. But I don't know of a technique whereby each
>>worker could refer to its data in place, so as to prevent a copy.
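One technique along these lines (a hedged sketch with invented object names, not code from the thread) is to export the large object once with clusterExport and then pass only small index vectors on each call, so repeated calls no longer re-serialize the data:

```r
library(parallel)

cl <- makeCluster(2)
bigdata <- matrix(rnorm(1000 * 4), nrow = 1000)

# Copy bigdata into each worker's global environment exactly once.
clusterExport(cl, "bigdata")

# Subsequent calls ship only the indices; each worker reads its own
# staged copy of bigdata rather than receiving the matrix again.
row_sums <- parSapply(cl, seq_len(nrow(bigdata)),
                      function(i) sum(bigdata[i, ]))

stopCluster(cl)
```

This does not avoid the one-time copy (each worker still holds a full copy of bigdata), but it removes the per-iteration copies.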
>>
>>Jeff
>>
>>
>>
>>On Mon, Oct 14, 2013 at 2:35 PM, Jeff Newmiller
>><jdnewmil at dcn.davis.ca.us> wrote:
>>> Your question misses several points in the Posting Guide, so any
>>answers will be handicapped accordingly.
>>>
>>> There is an overhead in using parallel processing, and the value of
>>two cores is marginal at best. In general parallel by forking is more
>>efficient than parallel by SNOW, but the former is not available on all
>>operating systems. This is discussed in the vignette for the parallel
>>package.
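The fork-versus-SNOW distinction mentioned above can be illustrated as follows (a sketch: mclapply forks the R process on Unix-alikes, so children share the parent's memory copy-on-write, while on Windows it only supports mc.cores = 1):

```r
library(parallel)

# Fork-based parallelism (Unix-alikes only): workers inherit the
# parent's data without an explicit copy.
if (.Platform$OS.type == "unix") {
  res_fork <- mclapply(1:4, function(i) i^2, mc.cores = 2)
}

# SNOW-style PSOCK cluster: works on every OS, including Windows,
# but functions and data are serialized over sockets to the workers.
cl <- makeCluster(2)
res_sock <- parLapply(cl, 1:4, function(i) i^2)
stopCluster(cl)
```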
>>>
>>>
>>> Jeffrey Flint <jeffrey.flint at gmail.com> wrote:
>>>>I'm running package parallel in R-3.0.2.
>>>>
>>>>Below are the execution times using system.time for when executing
>>>>serially versus in parallel (with 2 cores) using parRapply.
>>>>
>>>>
>>>>Serially:
>>>> user system elapsed
>>>> 4.67 0.03 4.71
>>>>
>>>>
>>>>
>>>>Using package parallel:
>>>> user system elapsed
>>>> 3.82 0.12 6.50
>>>>
>>>>
>>>>
>>>>There is an evident improvement in the user CPU time, but a big
>>>>jump in the elapsed time.
>>>>
>>>>In my code, I am executing a function on a 1000-row matrix 100
>>>>times, with different data each time, of course.
>>>>
>>>>The initial call to makeCluster cost 1.25 seconds in elapsed time.
>>>>I'm not concerned about the makeCluster time since that is a fixed
>>>>cost. I am concerned about the additional 1.43 seconds in elapsed
>>>>time (6.50=1.43+1.25).
>>>>
>>>>I am wondering if there is a way to structure the code to largely
>>>>avoid the 1.43-second overhead. For instance, perhaps I could
>>>>upload the function to both cores manually, up front, to avoid the
>>>>function being uploaded at each of the 100 iterations? I am also
>>>>wondering if there is a way to avoid any copying that is occurring
>>>>at each of the 100 iterations?
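One way to pre-stage the function as described above (a hedged sketch with invented names; note that parRapply still serializes whatever function object it is handed, so the staged version is invoked through a tiny wrapper) is to define it on the workers once with clusterEvalQ:

```r
library(parallel)

cl <- makeCluster(2)

# Define the work function in each worker's global environment once,
# up front, rather than shipping it with every call.
clusterEvalQ(cl, {
  f <- function(row) sum(row^2)
  NULL
})

m <- matrix(1:6, nrow = 3)

# The wrapper shipped per call is tiny; it resolves f on the worker.
res <- parRapply(cl, m, function(row) f(row))

stopCluster(cl)
```

For a small function this saves little (the data copies usually dominate), but the same pattern applies to any worker-side state that would otherwise be re-sent on every iteration.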
>>>>
>>>>
>>>>Thank you.
>>>>
>>>>Jeff Flint
>>>>
>>>>______________________________________________
>>>>R-help at r-project.org mailing list
>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>PLEASE do read the posting guide
>>>>http://www.R-project.org/posting-guide.html
>>>>and provide commented, minimal, self-contained, reproducible code.
>>>
>