[R] predict.lm How to introduce new data?

Wed Mar 23 15:26:01 CET 2011

On Mar 23, 2011, at 10:05 AM, agent dunham wrote:

> Dear all,
>
> I've fitted an lm using 61 observations (training data), and I've left 10 as
> test data.
>
> Training data and test data are stored in an Excel file.
>
> training <- read.xls("C:/...../training.xls") , the same for test.  
> That is:
> v1
> v2
> ...
> v15
>
> When I type str(training) and str(test), both sets have the same names
>
> The resulting model is lms <- lm(vd ~ log(v1) + fv2 + fv5 + fv7), where fvi
> means that vi was turned into a factor.
>
> plms<- predict(lms, new=test ,interval="prediction")
>
> Error at model.frame.default(Terms, newdata, na.action = na.action,  
> xlev =
> object$xlevels) :
>  length of the variables are different (found for 'fv2')
>  More: Warning messages lost
> 'newdata' had 10 rows but variable(s) found have 61 rows
>
> q1: What does it mean?

In the absence of a reproducible example it is difficult to say.

> q2: Do I have to change test data names, so they have the same as the
> resulting lm?

At a minimum, that would be required. Read help(predict.lm) and pay  
particular attention to _all_ the details mentioned in the newdata  
section.
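
For readers without the original data, here is a minimal sketch of one common way to get this kind of message, and one way around it. It is a guess at the cause, not a diagnosis. It assumes the factors (fv2 and so on) were created as separate objects in the workspace, so predict() cannot find them in the new data frame. The data are simulated, and only the names v1, v2, vd, training and test come from the post.

```r
# Simulated stand-in for the poster's spreadsheets (hypothetical data).
set.seed(1)
n <- 71
dat <- data.frame(
  v1 = runif(n, 1, 10),
  v2 = sample(c("a", "b", "c"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
dat$vd <- 2 + 3 * log(dat$v1) + (dat$v2 == "b") + rnorm(n, sd = 0.5)
training <- dat[1:61, ]
test     <- dat[62:71, ]

# Assumed problem pattern: the factor is a workspace object of length 61,
# not a column of the data frames.
fv2 <- factor(training$v2)
bad <- lm(vd ~ log(v1) + fv2, data = training)

# 'v1' is taken from 'test' (10 rows) but 'fv2' is taken from the workspace
# (61 rows), so building the model frame fails. Capture the error message.
bad_msg <- tryCatch(predict(bad, newdata = test),
                    error = function(e) conditionMessage(e))

# One way around it: do the conversion inside the formula and pass 'data',
# so every term is looked up in whichever data frame is supplied.
good <- lm(vd ~ log(v1) + factor(v2), data = training)
plms <- predict(good, newdata = test, interval = "prediction")
dim(plms)   # one row per test case; columns fit, lwr, upr
```

With the transformations written in the formula, newdata only needs columns with the original names (here v1 and v2). The same reasoning covers log(v1): predict() applies it to test$v1 itself.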

> q3: Do I have to do anything special because of the log  
> transformation?

No.

> q4: Afterwards I'd like to plot it. Is this the way?: plot(plms)

Specify what kind of plot. You are likely to be surprised and perhaps  
disappointed if you use the default plot method for lm objects.
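
As one illustration of "specifying the plot", here is a sketch that puts the observed test responses against the predictions and draws the prediction intervals as vertical bars. Note that predict(..., interval = "prediction") returns a matrix with columns fit, lwr and upr, so plot(plms) would simply plot the first two columns against each other. The data are simulated and the one-predictor model is a simplification chosen for the example, not the poster's model.

```r
# Simulated training and test sets (hypothetical data).
set.seed(2)
training <- data.frame(v1 = runif(61, 1, 10))
training$vd <- 2 + 3 * log(training$v1) + rnorm(61, sd = 0.5)
test <- data.frame(v1 = runif(10, 1, 10))
test$vd <- 2 + 3 * log(test$v1) + rnorm(10, sd = 0.5)

lms  <- lm(vd ~ log(v1), data = training)
plms <- predict(lms, newdata = test, interval = "prediction")

# Observed versus predicted values for the test cases.
plot(plms[, "fit"], test$vd,
     xlab = "Predicted vd", ylab = "Observed vd",
     ylim = range(plms[, c("lwr", "upr")], test$vd))

# Vertical bar from lwr to upr at each predicted value.
segments(plms[, "fit"], plms[, "lwr"], plms[, "fit"], plms[, "upr"],
         col = "grey50")

# Reference line: observed equals predicted.
abline(0, 1, lty = 2)
```

Points whose observed value falls outside its bar are test cases that the prediction interval did not cover.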

As the standard postscript says:
> and provide commented, minimal, self-contained, reproducible code.

-- 

David Winsemius, MD
West Hartford, CT