[R] Re: Buying more computer for GLM
Charles C. Berry
cberry at tajo.ucsd.edu
Thu Aug 31 18:13:10 CEST 2006
George,
Logistic regression with ONLY factors?
In principle, this can be solved by recasting the problem as a log-linear
model of counts and fitting it by iterative proportional fitting.
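To make the idea concrete, here is a minimal sketch of iterative proportional fitting on a tiny 2 x 2 x 2 table, written in plain Python purely for illustration (in R, loglin() does this fitting for you). The counts are made up, and the names ipf() and marginal() are hypothetical helpers, not part of any package. Axes 0 and 1 play the role of two factors A and B, and axis 2 the binary response Y; fitting the margins AB, AY and BY corresponds to a logistic regression of Y on the main effects of A and B.

```python
from itertools import product


def marginal(table, axes):
    """Sum a table (dict: cell tuple -> value) over all axes not in `axes`."""
    totals = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        totals[key] = totals.get(key, 0.0) + value
    return totals


def ipf(observed, margins, shape, tol=1e-8, max_iter=1000):
    """Fit a hierarchical log-linear model by iterative proportional fitting.

    observed: dict mapping cell index tuples to counts (missing cells count as 0)
    margins:  list of tuples of axis numbers, one tuple per margin to be matched
    shape:    number of levels of each factor
    """
    cells = list(product(*(range(k) for k in shape)))
    fitted = {cell: 1.0 for cell in cells}          # start from a flat table
    targets = [marginal(observed, axes) for axes in margins]

    for _ in range(max_iter):
        largest_gap = 0.0
        for axes, target in zip(margins, targets):
            current = marginal(fitted, axes)
            for cell in cells:
                key = tuple(cell[a] for a in axes)
                want = target.get(key, 0.0)
                have = current[key]
                largest_gap = max(largest_gap, abs(want - have))
                # Rescale the cell so this margin matches the observed one.
                fitted[cell] *= want / have if have > 0 else 0.0
        if largest_gap < tol:                       # all margins already matched
            break
    return fitted


# Made-up counts, indexed as (A, B, Y).
observed = {
    (0, 0, 0): 10, (0, 0, 1): 5,
    (0, 1, 0): 8,  (0, 1, 1): 12,
    (1, 0, 0): 6,  (1, 0, 1): 9,
    (1, 1, 0): 4,  (1, 1, 1): 16,
}
model_margins = [(0, 1), (0, 2), (1, 2)]            # AB + AY + BY
fitted = ipf(observed, model_margins, shape=(2, 2, 2))

# Fitted odds of Y = 1 versus Y = 0 in each (A, B) cell.
for a, b in product(range(2), range(2)):
    print((a, b), fitted[(a, b, 1)] / fitted[(a, b, 0)])
```

Note that this sketch walks over every cell of the table on each pass, which is fine for 8 cells but is exactly what becomes infeasible when the table is huge and mostly empty.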
For sparse data like yours (i.e. a table with 20000 counts and >= 2^31
cells), it will be necessary to use a method that does not explicitly
operate on the table of counts as loglin() does. I would guess that
rake() in the survey package would handle this, but I've not looked at
the code it uses.
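A back-of-the-envelope check of why the full table is out of reach, again as a small Python sketch. The 31 binary factors (30 predictors plus the response) are an assumption chosen to match the 2^31 figure, and the records are random synthetic data, not George's. The point is only that the dense table needs one number per cell, while a sparse store needs at most one entry per record.

```python
import random
from collections import Counter

n_factors = 31       # assumed: 30 binary predictors plus a binary response
n_records = 20000

# Dense storage: one 8-byte double for every cell of the full table.
dense_cells = 2 ** n_factors
dense_bytes = dense_cells * 8
print("dense cells:", dense_cells)
print("dense size in GiB:", dense_bytes / 2 ** 30)

# Sparse storage: keep only the cells that actually occur in the data.
rng = random.Random(0)                       # fixed seed, synthetic records
sparse_counts = Counter(
    tuple(rng.randrange(2) for _ in range(n_factors))
    for _ in range(n_records)
)
print("occupied cells:", len(sparse_counts))
```

With more than two levels per factor the dense table only gets larger, while the number of occupied cells can never exceed the number of records.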
If you are only using a fraction of the factors then loglm() (in MASS) or
loglin() may suffice.
HTH,
Chuck
On Wed, 30 Aug 2006, g.russell at eos-finance.com wrote:
> Hello,
>
> at the moment I am doing quite a lot of regression, especially
> logistic regression, on 20000 or more records with 30 or more
> factors, using the "step" function to search for the model with the
> smallest AIC. This takes a lot of time on this 1.8 GHz Pentium
> box. Memory does not seem to be such a big problem; not much
> swapping is going on and CPU usage is at or close to 100%. What
> would be the most cost-effective way to speed this up? The
> obvious way would be to get a machine with a faster processor (3 GHz
> plus) but I wonder whether it might instead be better to run a dual-
> processor machine or something like that; this looks at least like a
> problem R should be able to parallelise, though I don't know whether it
> does.
>
> Thanks for your help,
>
> George Russell
>
Charles C. Berry                         (858) 534-2098
                                         Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0717