[R] Antwort: Re: Antwort: Buying more computer for GLM

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Sep 1 15:50:52 CEST 2006


On Fri, 1 Sep 2006, g.russell at eos-finance.com wrote:

> Prof Brian Ripley wrote:
> > I would not have expected glm to be more than say 5x slower than lm if 
> > CPU cycles and not memory were the limiting factor.  In that case more 
> > RAM might be all you need.
> 
> The ratio between glm and lm might well be about 5x, but that's still a 
> big difference for us.   

You said lm was 'very fast', so I did not expect 5x 'very fast' to be 'too 
slow'.

> I am pretty sure that RAM is not the main problem; according to the 
> Windows Task Manager the computer is at close to 100% CPU usage, and 
> swapping is not going on.   Of course L1/L2 caches may still be something 
> one can work on, but I'm not sure whether glm has enough repeated access 
> to the same data for that to help.   (I don't know how glm works, but I 
> guess it does a lot of scans through the whole data set, and that the 
> amount of working memory it needs during these scans is basically a 
> function of the number of parameters, not the number of observations, is 
> that right?)

Not so.  Because glm does weighted fits, it needs to access the whole data 
matrix at each iteration (to re-weight).
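
As an illustration of the re-weighting step, here is a minimal sketch of 
iteratively reweighted least squares for a logistic model.  It is written 
in plain Python and is not R's glm.fit code: R solves each weighted fit by 
a QR decomposition, whereas this sketch uses the normal equations for 
brevity.  The names irls_logistic, solve and rows_read and the toy data 
are assumptions made up for the example.

```python
import math

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def irls_logistic(X, y, iterations=10):
    """Fit a logistic regression by IRLS; returns (beta, rows_read)."""
    p = len(X[0])
    beta = [0.0] * p
    rows_read = 0  # counts how many data rows are touched in total
    for _ in range(iterations):
        # Re-weighting pass: every row of X is read again and a weighted
        # copy of the full n x p matrix is built for this iteration.
        Xw, zw = [], []
        for row, yi in zip(X, y):
            rows_read += 1
            eta = sum(b * v for b, v in zip(beta, row))  # linear predictor
            mu = 1.0 / (1.0 + math.exp(-eta))            # fitted probability
            w = mu * (1.0 - mu)                          # IRLS weight
            z = eta + (yi - mu) / w                      # working response
            s = math.sqrt(w)
            Xw.append([s * v for v in row])
            zw.append(s * z)
        # Weighted least-squares fit (normal equations, for brevity only).
        xtx = [[sum(r[j] * r[k] for r in Xw) for k in range(p)]
               for j in range(p)]
        xtz = [sum(r[j] * t for r, t in zip(Xw, zw)) for j in range(p)]
        beta = solve(xtx, xtz)
    return beta, rows_read

# Hypothetical toy data: intercept plus one covariate, 8 observations.
X = [[1.0, float(x)] for x in range(8)]
y = [0, 0, 1, 0, 1, 0, 1, 1]
beta, rows_read = irls_logistic(X, y)
print("coefficients:", beta)
print("rows read:", rows_read)
```

Each iteration reads all n rows and builds an n x p weighted copy, so in 
this sketch both the work per iteration and the working memory grow with 
the number of observations, not only with the number of parameters.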

> Many thanks for your observations about subset selection by the way, they 
> are a lot of help.   Would a good approach be, say, to use some stricter 
> criteria like BIC for choosing a model, and then use non-statistical 
> methods to improve the plausibility of the chosen parameters?

The latter entirely I would say.  All statistics can say is that a 
variable improves the fit measurably more than one that is unrelated to 
the response: whether it improves it enough to be worthwhile in your 
application is non-statistical. The point here is that all but the most 
useless variables will measurably improve the fit in large problems with 
few variables.
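
A rough arithmetic sketch of that last point, under assumptions that are 
not from this thread: a Gaussian linear model with known error variance 
and one extra standardized variable whose true coefficient is `effect` 
error standard deviations.  The likelihood-ratio statistic for adding the 
variable then has expectation of about 1 + n * effect^2, while BIC charges 
log(n) per parameter.  The function names and the value effect = 0.01 are 
illustrative only.

```python
import math

def expected_lr_statistic(n, effect):
    """Approximate mean LR statistic for one added variable.

    Mean of a noncentral chi-square with 1 degree of freedom and
    noncentrality n * effect**2 (standardized predictor, known variance).
    """
    return 1.0 + n * effect ** 2

def bic_penalty(n):
    """BIC penalty per extra parameter on the deviance scale."""
    return math.log(n)

effect = 0.01  # assumed: explains roughly 0.01% of the variance
for n in (10_000, 100_000, 1_000_000):
    gain = expected_lr_statistic(n, effect)
    penalty = bic_penalty(n)
    print(n, round(gain, 1), round(penalty, 1), gain > penalty)
```

Under these assumptions a variable this weak is expected to fall short of 
the BIC penalty at n = 10,000 but to clear it comfortably at n = 1,000,000, 
which is why the decision to keep it has to rest on the application.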

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


