[R] big quasi-fixed effects OLS model

Joshua Wiley jwiley.psych at gmail.com
Wed May 9 06:34:01 CEST 2012


Hi Ivo,

You might check out the biglm package.  It is not clear to me how to parallelize a single model, but if you are running several, you can of course run them in parallel (but you already know that).  One thing that may help is to link R against an optimized, multithreaded BLAS such as ATLAS (I think you have to do this at compile time, but I could be gravely mistaken).
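In case it helps, here is a rough sketch of fitting one model in chunks with biglm, so the full data never has to pass through lm() at once.  The data are simulated and the chunking is made up for illustration; it assumes the biglm package is installed and that every chunk carries the same factor levels.

```r
library(biglm)  # assumes the biglm package is installed

# Simulated stand-in for the real data: one covariate and a year fixed effect.
set.seed(1)
n <- 10000
d <- data.frame(x = rnorm(n),
                year = factor(sample(2001:2005, n, replace = TRUE)))
d$y <- 1 + 2 * d$x + as.integer(d$year) + rnorm(n)

# Hypothetical chunking: in practice each chunk would be read from disk.
# split() keeps the full set of factor levels in every chunk.
chunks <- split(d, rep(1:10, length.out = n))

# Fit on the first chunk, then fold in the remaining chunks one at a time.
fit <- biglm(y ~ x + year, data = chunks[[1]])
for (ch in chunks[-1]) {
  fit <- update(fit, ch)
}

coef(fit)
```

The coefficients should agree with lm() on the full data; only the memory footprint differs.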

Another possibly very silly idea: if you are running many models with different combinations of your variables (a model-selection type of thing), then rather than refitting on the raw data every time, what about creating a dataset with all variables of interest (including interactions) and calculating one huge covariance matrix and the vector of means?  Then you fit all your models from the covariance matrix.  That could still be huge and may not be any more computationally efficient, but it would effectively reduce your working data from n x k to k x k (+ 1 x k for the vector of means, if you care about those).
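A minimal sketch of the covariance idea in base R, on simulated data.  The helper fit_from_moments() is made up for this example (it is not from any package); it recovers the OLS slopes from the covariance matrix, the intercept from the means, and the residual standard error from n.

```r
# Simulated data; the interaction is stored as an ordinary column.
set.seed(42)
n <- 5000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$x1x2 <- dat$x1 * dat$x2
dat$y <- 0.5 + 1.5 * dat$x1 - 2 * dat$x2 + 0.3 * dat$x1x2 + rnorm(n)

# One pass over the data: k x k covariance matrix and the k means.
S <- cov(dat)
m <- colMeans(dat)

# Hypothetical helper: OLS with an intercept, computed from S, m and n only.
fit_from_moments <- function(S, m, n, response, predictors) {
  Sxx <- S[predictors, predictors, drop = FALSE]
  Sxy <- S[predictors, response]
  slopes <- setNames(as.vector(solve(Sxx, Sxy)), predictors)
  intercept <- m[[response]] - sum(m[predictors] * slopes)
  # Residual sum of squares: (n - 1) * (var(y) - explained variance).
  rss <- (n - 1) * (S[response, response] - sum(Sxy * slopes))
  list(coefficients = c("(Intercept)" = intercept, slopes),
       sigma = sqrt(rss / (n - length(predictors) - 1)))
}

# Fit several candidate models without touching the raw data again.
models <- list(c("x1", "x2"), c("x1", "x2", "x1x2"), c("x1", "x2", "x3", "x1x2"))
fits <- lapply(models, function(p) fit_from_moments(S, m, n, "y", p))
fits[[2]]$coefficients
```

Each fit solves only a small system of the size of the chosen predictor set, so the cost no longer depends on n once S and m are computed.  Standard errors would additionally need the inverse of Sxx, which this sketch leaves out.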

Josh

On May 8, 2012, at 20:30, ivo welch <ivowel at gmail.com> wrote:

> dear R experts---now I have a case where I want to estimate very large
> regression models with many fixed effects---not just the mean type, but
> cross-fixed effects---years, months, locations, firms.  Many millions of
> observations, a few thousand variables (most of these variables are
> interaction fixed effects).  could someone please point me to packages, if
> any, that would help me estimate such models?  (can these problems be split
> over many different cores?)
> 
> advice appreciated.
> 
> /iaw
> 
> ----
> Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)
> CV Starr Professor of Economics (Finance), Brown University
> http://welch.econ.brown.edu/


