[R] More compact form of lm object that can be used for prediction?
Charles C. Berry
cberry at tajo.ucsd.edu
Sat Jul 12 02:39:10 CEST 2008
On Fri, 11 Jul 2008, Woolner, Keith wrote:
>> From: Marc Schwartz [mailto:marc_schwartz at comcast.net]
>> Sent: Friday, July 11, 2008 12:14 PM
>>
>> on 07/11/2008 10:50 AM Woolner, Keith wrote:
>>> Hi everyone,
>>>
>>> Is there a way to take an lm() model and strip it to a minimal form (or
>>> convert it to another type of object) that can still be used to predict
>>> the dependent variable?
>>
>> <snip>
>>
>> Depending upon how much memory you need to conserve and what else you
>> may need to do with the model object:
>>
>> 1. lm(YourFormula, data = YourData, model = FALSE)
>>
>> 'model = FALSE' will result in the model frame not being retained.
>>
>> 2. lm(YourFormula, data = YourData, model = FALSE, x = FALSE)
>>
>> 'x = FALSE' will result in the model matrix not being retained.
>>
>> See ?lm for more information.
>
> Marc,
>
> Thank you for the suggestions. Though I neglected to mention it, I had
> already consulted ?lm and was using model=FALSE. x=FALSE is the default
> setting and I had left it unchanged.
>
> The problem I still face is that the memory usage is dominated by the
> "qr" component of the model, consuming nearly 80% of the total
> footprint.
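Try this, which replaces the qr component with the QR decomposition of
its own R factor (a p x p matrix rather than n x p):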
> res <- lm(Ozone~Month,airquality)
> res$qr <- qr( qr.R ( res$qr ) )
> all.equal(predict(res, newdata=airquality),
+ predict(lm(Ozone~Month,airquality), newdata=airquality))
[1] TRUE
This is for predictions only; don't expect functions that need the full
n-row decomposition to behave.
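
To see what the replacement buys you, here is a rough sketch on the same airquality fit. The names full and slim are just for illustration, and it assumes a full-rank model: the rebuilt qr does not carry over the original column pivoting, so a rank-deficient fit would need more care.

```r
# Fit on a built-in data set. Rows with missing Ozone are dropped by lm().
full <- lm(Ozone ~ Month, data = airquality)

# Copy the fit and swap the n x p decomposition for the QR of its
# p x p R factor (assumes full rank, so no column pivoting took place).
slim <- full
slim$qr <- qr(qr.R(full$qr))

# The stored matrix shrinks from n x p to p x p.
print(dim(full$qr$qr))
print(dim(slim$qr$qr))

# Size of the qr component before and after, in bytes.
print(object.size(full$qr))
print(object.size(slim$qr))

# Point predictions from new data agree with the untouched fit.
same <- all.equal(predict(slim, newdata = airquality),
                  predict(full, newdata = airquality))
print(same)
```

On a fit with many rows and few columns, nearly all of the qr component goes away, since what is left no longer grows with the number of observations.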
HTH,
Chuck
> Using model=FALSE and x=FALSE saves a little over 4% of
> model size, and if I deliberately clobber some other components, as
> shown below, I can boost that to about 20% savings while still
> being able to use predict().
>
> lm.1$fitted.values <- NULL
> lm.1$residuals <- NULL
> lm.1$weights <- NULL
> lm.1$effects <- NULL
>
> The lm() object after doing so is still around 52 megabytes
> (object.size(lm.1) = 51,611,888), with 99.98% of it being used by
> lm.1$qr. That was the motivation behind my original question, which was
> whether there's a way to get predictions from a model without keeping
> the "qr" component around. Especially since I want to create and use
> six of these models simultaneously.
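
A breakdown like the one described above can be checked per component. This is a sketch only: airquality stands in for the real data, and fit, sizes and qr_share are names made up for the example.

```r
# Small example fit without the model frame (airquality ships with R).
fit <- lm(Ozone ~ Month, data = airquality, model = FALSE)

# Size in bytes of each top-level component, largest first.
# unclass() makes it explicit that the fit is treated as a plain list.
sizes <- sort(sapply(unclass(fit),
                     function(comp) as.numeric(object.size(comp))),
              decreasing = TRUE)
print(sizes)

# Share of the summed component sizes taken up by qr.
qr_share <- sizes[["qr"]] / sum(sizes)
print(round(100 * qr_share, 1))
```

The summed component sizes need not match object.size(fit) exactly, so treat the share as approximate.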
>
> My hope is to save and deploy the models in a reporting system to
> generate predictions on a daily basis as new data comes in, while the
> model itself would change only infrequently. Hence, I am more concerned
> with being able to retain the predictive portion of the models in a
> concise format, and less concerned with keeping the supporting
> analytical detail around for this application.
>
> The answer may be that what I'm seeking to do isn't possible with the
> currently available R+packages, although I'd be mildly surprised if
> others haven't run into this situation before. I just wanted to make
> sure I wasn't missing something obvious.
>
> Many thanks,
> Keith
>
Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901