[R] bam (mgcv) not using the specified number of cores

Simon Wood s.wood at bath.ac.uk
Mon Aug 25 15:16:53 CEST 2014


Oops, I just realised that the fix referred to below is in mgcv 1.8-3,
which is not yet on CRAN.
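
A quick way to see whether an installed mgcv already includes that fix is to compare its version against 1.8-3. This is only a sketch: the variable names are illustrative, and it assumes mgcv is installed in the default library.

```r
# Compare the installed mgcv version with 1.8-3, the release said to carry the fix.
installed <- packageVersion("mgcv")
has_fix <- installed >= package_version("1.8-3")

if (has_fix) {
  message("mgcv ", installed, " should include the chunking fix")
} else {
  message("mgcv ", installed, " predates 1.8-3; an upgrade is needed once it is available")
}
```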

On 25/08/14 13:20, Simon Wood wrote:
> Hi Andrew,
>
> In some of the screenshots you sent, top was reporting several cores
> working. I think the problem here may be to do with the way bam is
> parallelized - at present only part of the computation is in parallel -
> the model matrix QR decomposition part. The smoothing parameter
> selection is still single cored (although we are working on that), so if
> you watch top, you'll usually see multi-core and single core phases
> alternating with each other. The strategy works best in n>>p situations
> with few smoothing parameters.
>
> For the case where you used 31 cores, there was a bug in earlier mgcv
> versions in which it was assumed that when the model matrix is split
> into chunks for processing, each chunk would have more rows than
> columns. If you upgrade to the current mgcv version then this is fixed.
> However using 31 cores is liable to actually be less efficient than
> using fewer cores with the n to p (number of data to number of
> coefficients) ratio that you seem to have. This is because the work
> being done by each core is rather little, so that the overhead of
> stitching the cores' work back together becomes too high. Using
> 'use.chol=TRUE' would reduce the overheads here (although it uses a
> slightly less stable algorithm than the default).
>
> best,
> Simon
>
>
> On 22/08/14 06:13, Andrew Crane-Droesch wrote:
>> Hi Simon,
>>
>> (resending with all images as imgur so as to not bounce from list)
>>
>> Thanks for the reply.  I've tried to reproduce the error, but I don't
>> know how to show output from `top` any other way than with screenshots,
>> so please excuse that.
>>
>> Here are screenshots of what happens when I run with two
>> http://imgur.com/i26GKPo
>>
>> and three
>> http://imgur.com/8SL7scy
>>
>> cores.  In the former, it seems to be working on one core, and in the
>> latter, it appears to be working on three.  When reproducing the error,
>> I'm getting behavior that isn't entirely consistent -- sometimes it
>> "behaves" and operates on the asked-for number of cores, and other times
>> not.
>>
>> I'm also attaching a screenshot
>> http://imgur.com/bJfuS6R
>> showing terminal output from a remote cluster when I run my full model
>> (N=67K) rather than a subset (N=7K) -- I get that error "Error in
>> qr.qty(qrx, f) : right-hand side should have 60650 not 118451 rows ..."
>> I suppose this is a memory overload problem?  Any suggestions on how to
>> get bam to not call for more memory than the node has available would be
>> welcome, though I suspect that is a supercomputing problem rather than a
>> mgcv problem.  I don't know much about memory management, except that R
>> doesn't do it explicitly.
>>
>> Thanks,
>> Andrew
>>
>> sessionInfo() for local machine:
>> 1> sessionInfo()
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils datasets  methods
>> [8] base
>>
>> other attached packages:
>> [1] mgcv_1.7-26  nlme_3.1-111
>>
>> loaded via a namespace (and not attached):
>> [1] grid_3.0.2      lattice_0.20-23 Matrix_1.1-4
>> 1>
>> On 08/21/2014 04:29 PM, Simon Wood wrote:
>>> Hi Andrew,
>>>
>>> Could you provide a bit more information, please? In particular, the
>>> results of sessionInfo() and the code that caused this weird behaviour
>>> (+ an indication of dataset size).
>>>
>>> best,
>>> Simon
>>>
>>> On 21/08/14 12:53, Andrew Crane-Droesch wrote:
>>>> I am getting strange behavior when trying to fit models to large
>>>> datasets using bam.  I am working on a 4-core machine, but I think that
>>>> there may be 2 physical cores that the computer uses as 4 cores in some
>>>> sense that I don't understand.
>>>>
>>>> When I run bam using makeCluster(3), the model runs on one core. But
>>>> when I run it on makeCluster(2), top shows me that three of my cores are
>>>> taken up to full capacity, and my computer slows down or crashes.
>>>>
>>>> How can I get it to truly run on 2 cores?
>>>>
>>>> I'm on a thinkpad X230, running ubuntu.
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>
>>>
>>
>
>
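
For readers following the thread, the setup discussed above might look roughly like the sketch below. It is not the original poster's code: the data are simulated with mgcv's gamSim, and the model formula, the sample size and the choice of 2 workers are assumptions made for illustration. Only the QR part of the fit is spread over the cluster, so top will still show single-core phases while smoothing parameters are estimated.

```r
library(mgcv)
library(parallel)

# Simulated stand-in data (assumption): 5000 rows, four smooth covariates.
set.seed(1)
dat <- gamSim(1, n = 5000, dist = "normal", scale = 2, verbose = FALSE)

# A small cluster: with modest n relative to p, fewer workers keep the
# overhead of recombining each worker's chunk low.
cl <- makeCluster(2)

# use.chol = TRUE reduces that overhead, at the cost of a slightly less
# stable algorithm than the default.
fit <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat,
           cluster = cl, use.chol = TRUE)

stopCluster(cl)
summary(fit)
```

Dropping use.chol = TRUE returns to the default, more stable route if numerical stability matters more than speed.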


-- 
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283


