[R] More compact form of lm object that can be used for prediction?

Sat Jul 12 02:39:10 CEST 2008

On Fri, 11 Jul 2008, Woolner, Keith wrote:

>> From: Marc Schwartz [mailto:marc_schwartz at comcast.net]
>> Sent: Friday, July 11, 2008 12:14 PM
>>
>> on 07/11/2008 10:50 AM Woolner, Keith wrote:
>>> Hi everyone,
>>>
>>> Is there a way to take an lm() model and strip it to a minimal form (or
>>> convert it to another type of object) that can still be used to predict
>>> the dependent variable?
>>
>> <snip>
>>
>> Depending upon how much memory you need to conserve and what else you
>> may need to do with the model object:
>>
>> 1. lm(YourFormula, data = YourData, model = FALSE)
>>
>> 'model = FALSE' will result in the model frame not being retained.
>>
>> 2. lm(YourFormula, data = YourData, model = FALSE, x = FALSE)
>>
>> 'x = FALSE' will result in the model matrix not being retained.
>>
>> See ?lm for more information.
>
> Marc,
>
> Thank you for the suggestions.  Though I neglected to mention it, I had
> already consulted ?lm and was using model=FALSE.  x=FALSE is the default
> setting and I had left it unchanged.
>
> The problem I still face is that the memory usage is dominated by the
> "qr" component of the model, consuming nearly 80% of the total
> footprint.
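
To see where the memory goes, it helps to tabulate the size of each component of the fit. A minimal sketch on a built-in data set; the names `fit` and `sizes` are only illustrative, and the shares on your own data will differ:

```r
# Fit a small example model (airquality ships with base R).
fit <- lm(Ozone ~ Month, data = airquality, model = FALSE)

# Size in bytes of each component of the lm object, largest first.
sizes <- sort(sapply(fit, function(comp) as.numeric(object.size(comp))),
              decreasing = TRUE)
print(sizes)

# Each component's share of the summed component sizes, in percent.
print(round(100 * sizes / sum(sizes), 1))
```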

Try this:

> res <- lm(Ozone~Month,airquality)
> res$qr <- qr( qr.R ( res$qr ) )
> all.equal(predict(res, newdata=airquality),
+    predict(lm(Ozone~Month,airquality), newdata=airquality))
[1] TRUE

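The reduced qr keeps only the small triangular factor R and no longer
carries the original Q, so use the object for predictions only.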
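
The same idea can be wrapped in a small helper that also drops the components Keith clears by hand. This is a sketch only: `compact_lm` is a hypothetical name, not part of R, and I am assuming a full-rank fit (with aliased coefficients the original pivot would be lost, and I have not checked that case).

```r
full <- lm(Ozone ~ Month, data = airquality)

# Hypothetical helper: keep only the triangular factor R in the qr slot
# and drop components that predict() with newdata does not use.
compact_lm <- function(fit) {
  fit$qr            <- qr(qr.R(fit$qr))  # p x p instead of n x p
  fit$fitted.values <- NULL
  fit$residuals     <- NULL
  fit$effects       <- NULL
  fit$model         <- NULL
  fit
}

small <- compact_lm(full)

# Predictions on new data should agree with the full model.
print(all.equal(predict(small, newdata = airquality),
                predict(full,  newdata = airquality)))

# Compare the in-memory sizes of the two objects.
print(object.size(full))
print(object.size(small))
```

Always pass newdata to predict() with such an object, since the stored fitted values are gone.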

HTH,

Chuck

> Using model=FALSE and x=FALSE saves a little over 4% of
> model size, and if I deliberately clobber some other components, as
> shown below, I can boost that to about 20% savings while still
> being able to use predict().
>
> 	lm.1$fitted.values <- NULL
> 	lm.1$residuals     <- NULL
> 	lm.1$weights       <- NULL
> 	lm.1$effects       <- NULL
>
> The lm() object after doing so is still around 52 megabytes
> (object.size(lm.1) = 51,611,888), with 99.98% of it being used by
> lm.1$qr.  That was the motivation behind my original question, which was
> whether there's a way to get predictions from a model without keeping
> the "qr" component around.  Especially since I want to create and use
> six of these models simultaneously.
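
If the qr component should go away entirely, one option is to bypass predict() and rebuild the design matrix from the stored terms, then multiply by the coefficients. A rough sketch, not the way predict.lm is meant to be used: `slim_predictor` and `predict_slim` are hypothetical names, and I am assuming a full-rank fit (no NA coefficients) with no offset, and point predictions only (no standard errors or intervals).

```r
fit <- lm(Ozone ~ Month + Temp, data = airquality)

# Hypothetical helper: keep just what is needed to rebuild the design
# matrix for new data.
slim_predictor <- function(fit) {
  list(coef      = coef(fit),
       terms     = delete.response(terms(fit)),
       xlevels   = fit$xlevels,
       contrasts = fit$contrasts)
}

# Hypothetical helper: design matrix for newdata times the coefficients.
predict_slim <- function(slim, newdata) {
  mf <- model.frame(slim$terms, newdata,
                    na.action = na.pass, xlev = slim$xlevels)
  X  <- model.matrix(slim$terms, mf, contrasts.arg = slim$contrasts)
  drop(X %*% slim$coef)
}

slim <- slim_predictor(fit)

# Should agree with predict() on the full model.
print(all.equal(predict_slim(slim, airquality),
                predict(fit, newdata = airquality)))
```

The slim list can be stored with save() or saveRDS() and reloaded by the reporting job, which then only needs predict_slim().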
>
> My hope is to save and deploy the models in a reporting system to
> generate predictions on a daily basis as new data comes in, while the
> model itself would change only infrequently.  Hence, I am more concerned
> with being able to retain the predictive portion of the models in a
> concise format, and less concerned with keeping the supporting
> analytical detail around for this application.
>
> The answer may be that what I'm seeking to do isn't possible with the
> currently available R+packages, although I'd be mildly surprised if
> others haven't run into this situation before.  I just wanted to make
> sure I wasn't missing something obvious.
>
> Many thanks,
> Keith

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901