[R] Question about multiple regression

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Sep 8 18:51:59 CEST 2008


Are you sure R's ways are not fast enough (there are many layers 
underneath lm)?  For an example of how you might do this at C/Fortran 
level, see the function lqs() in MASS.
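
As a minimal sketch of what "layers underneath lm" can mean in practice: lm() builds a model frame and design matrix from a formula and then hands the numerical work to lm.fit(), which takes the matrix and response directly. The simulated data, the sizes (taken from the question below) and the repetition count are assumptions for illustration only. Whether skipping the upper layers saves enough time is something to measure, not assume.

```r
# Simulated data; N = 700 and 15 predictors follow the sizes in the question.
set.seed(1)
n <- 700
p <- 15
X0 <- matrix(rnorm(n * p), n, p)
colnames(X0) <- paste0("x", seq_len(p))
X <- cbind(1, X0)                       # design matrix with intercept column
y <- drop(X %*% (seq_len(p + 1) / 10)) + rnorm(n)
dat <- data.frame(y = y, X0)

# Full interface: formula handling, model frame, then the fit.
fit_lm <- lm(y ~ ., data = dat)

# Lower layer: design matrix and response are passed in directly.
fit_low <- lm.fit(X, y)

# Both routes should agree on the coefficients.
same <- isTRUE(all.equal(unname(coef(fit_lm)), unname(fit_low$coefficients)))

# Rough timing of repeated fits (results depend on machine and R version).
t_lm  <- system.time(for (i in 1:200) lm(y ~ ., data = dat))
t_low <- system.time(for (i in 1:200) lm.fit(X, y))
```

If many regressions share the same design matrix, lm.fit() also accepts a matrix of responses, one column per regression, which may remove the loop altogether.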

On Mon, 8 Sep 2008, Dimitri Liakhovitski wrote:

> Dear R-list,
> maybe some of you could point me in the right direction:
>
> Are you aware of any FREE Fortran or Java libraries/actual pieces of
> code that are VERY efficient (time-wise) in running the regular linear
> least-squares multiple regression?

A lot of the effort is in getting the right answer fast, including for 
e.g. collinear inputs.
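
To illustrate the collinearity point, here is a small sketch with made-up data in which one predictor is an exact linear combination of two others. lm.fit() detects the rank deficiency through its pivoted QR decomposition and reports the aliased coefficient as NA. The normal-equations shortcut is included only as a contrast and is an assumption about what a hand-rolled routine might do.

```r
set.seed(2)
n  <- 700
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2                      # exact linear combination of x1 and x2
X  <- cbind(1, x1, x2, x3)
y  <- 1 + 2 * x1 - x2 + rnorm(n)

fit <- lm.fit(X, y)
fit$rank                           # numerical rank found by the QR step
fit$coefficients                   # the aliased column is reported as NA

# Naive normal equations on the same data: with a singular cross-product
# matrix this either stops with an error (caught here) or returns
# numerically unreliable values.
naive <- tryCatch(solve(crossprod(X), crossprod(X, y)),
                  error = function(e) NULL)
```

A hand-written Fortran or Java routine would need comparable checks to be trusted on such inputs.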

> More specifically, I have to run small regression models (between 1
> and 15 predictors) on samples of up to N=700 but thousands and
> thousands of them.
>
> I am designing a simulation in R and running those regressions and R
> itself is way too slow. So, I am thinking of compiling the regression
> run itself in Fortran and Java and then calling it from R.

I think Java is unlikely to be fast compared to the Fortran R itself uses.

Have you profiled to find where the time is really being spent (both R and 
C/Fortran profiling if necessary)?
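
A minimal sketch of R-level profiling with Rprof() and summaryRprof() follows. The function one_step() is a hypothetical stand-in for one iteration of the simulation, and the data, the repetition count and the sampling interval are all assumptions. Profiling compiled C/Fortran code needs system tools and is not covered here.

```r
set.seed(3)
n <- 700
dat <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

# Hypothetical stand-in for one step of the simulation:
# resample the rows and refit a small regression.
one_step <- function(d) {
  idx <- sample(nrow(d), replace = TRUE)
  coef(lm(y ~ x1 + x2 + x3, data = d[idx, ]))
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)      # start sampling the call stack
res <- replicate(2000, one_step(dat))
Rprof(NULL)                            # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self, 10)                 # functions ranked by time spent in themselves
```

If the table is dominated by functions such as model.frame or data-frame subsetting rather than the fitting code, the regression arithmetic is probably not the bottleneck.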

>
> Thank you very much for any advice!
>
> Dimitri Liakhovitski
> MarketTools, Inc.
> Dimitri.Liakhovitski at markettools.com

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


