[R] Memory Needed for Regression

Charles C. Berry cberry at tajo.ucsd.edu
Tue Jan 11 03:55:13 CET 2011


On Mon, 10 Jan 2011, efreeman wrote:

> I'm looking for a formula for memory usage in standard regression; that
> is, if I have X rows with Y predictors, how much memory is needed? I'm
> speccing out a system, and I'd like to be able to get enough memory
> that we can do some fairly large regressions.
>

 	install.packages("biglm")
 	library(biglm)

Then see

 	?biglm

"biglm creates a linear model object that uses only p^2 memory for p 
variables. It can be updated with more data using update. This allows 
linear regression on data sets larger than memory."
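
To put rough numbers on that: R stores numeric values as 8-byte doubles, so one n-by-p model matrix takes about 8*n*p bytes, while a p-by-p summary takes about 8*p^2 bytes. The sketch below only does that arithmetic. The function names and the example sizes are made up for illustration, and lm() typically holds several copies of the data at once (model frame, model matrix, QR decomposition), so treat the first figure as the cost of a single copy, not the total.

```r
# Rough storage arithmetic, assuming 8-byte doubles and ignoring object overhead.
# These are hypothetical helper functions, not part of any package.
dense_matrix_bytes <- function(n, p) 8 * n * p   # one n x p model matrix
pxp_summary_bytes  <- function(p)    8 * p^2     # a p x p summary such as X'X

n <- 1e7   # rows (illustrative)
p <- 50    # predictors (illustrative)

dense_matrix_bytes(n, p) / 2^30   # GiB for a single copy of the model matrix
pxp_summary_bytes(p) / 2^10       # KiB for the p x p summary
```

The first number grows with the number of rows; the second does not, which is the point of the quoted passage.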


If you want to get serious about this, look in Golub and Van Loan*. (Sorry, 
my copy is not at hand, so I cannot be more specific. There may be a 
section with a title like "Updating Matrix Factorizations" that says what is needed.)
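
The basic idea of updating can be sketched in base R. This is not what biglm does internally: biglm refers to AS274, which updates a factorization and is numerically safer. The sketch below swaps in a simpler technique, accumulating the cross-products X'X and X'y chunk by chunk and then solving the normal equations, because it is the easiest way to see why only about p^2 numbers need to be kept. The data are simulated and all names are made up for illustration.

```r
set.seed(1)
n <- 10000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n)

p   <- 3                       # intercept + 2 predictors
XtX <- matrix(0, p, p)         # running X'X, p x p
Xty <- numeric(p)              # running X'y, length p

# Pretend each chunk of 1000 rows is all that fits in memory at once
chunks <- split(seq_len(n), ceiling(seq_len(n) / 1000))
for (idx in chunks) {
  X   <- model.matrix(~ x1 + x2, data = dat[idx, ])
  XtX <- XtX + crossprod(X)              # add this chunk's X'X
  Xty <- Xty + crossprod(X, dat$y[idx])  # add this chunk's X'y
}

beta_chunked <- drop(solve(XtX, Xty))    # solve the normal equations
beta_full    <- coef(lm(y ~ x1 + x2, data = dat))
all.equal(unname(beta_chunked), unname(beta_full))
```

With badly conditioned predictors the normal equations lose accuracy, which is one reason to prefer the factorization-updating approach in practice.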

Also, see

Algorithm AS 274, Applied Statistics (1992), Vol. 41, No. 2

which is what biglm() refers to. And maybe read the source code of 
biglm() if you are planning on using that package.
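
If you do end up using biglm, the usage pattern is roughly as follows. This is a minimal sketch on simulated data, assuming the package is installed; the data frame, chunk size, and variable names are made up for illustration, and ?biglm is the place to check the details.

```r
library(biglm)

set.seed(1)
n <- 10000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n)

# Split into chunks of 1000 rows to mimic data that arrive piece by piece
chunks <- split(dat, ceiling(seq_len(n) / 1000))

fit <- biglm(y ~ x1 + x2, data = chunks[[1]])   # fit on the first chunk
for (chunk in chunks[-1]) {
  fit <- update(fit, chunk)                     # fold in each further chunk
}

coef(fit)
coef(lm(y ~ x1 + x2, data = dat))               # should agree closely
```

In real use each chunk would be read from a file or database rather than split from a data frame already in memory.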

HTH,

Chuck

* @book{golub1996matrix,
   title={{Matrix computations}},
   author={Golub, G.H. and Van Loan, C.F.},
   isbn={0801854148},
   year={1996},
   publisher={Johns Hopkins Univ Pr}
}



> ==Ed Freeman

Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


