[R] Estimate of variance and prediction for multiple linear regression
Gavin Simpson
gavin.simpson at ucl.ac.uk
Thu Jun 24 09:16:16 CEST 2010
On Wed, 2010-06-23 at 16:30 -0700, cc super wrote:
> What if the size of the newdata is different from the previous one
> used to generate the regression model?
>
> Let's say
>
> pdat <- data.frame(x1 = rnorm(5, 2), x2 = rnorm(5))
> predict(lin, pdat)
>
> It comes up with warning and the result is not correct.
No it doesn't (not on my system at least) and you give no indication
what the warning/error is or why you think the result is incorrect.
set.seed(1234)
y <- rnorm(10, mean = 5)
dat <- data.frame(x1 = rnorm(10, mean = 2),
x2 = rnorm(10))
lin <- lm(y~x1+x2, data = dat)
pdat <- data.frame(x1 = rnorm(5, 2), x2 = rnorm(5))
predict(lin, pdat)
> predict(lin, pdat)
1 2 3 4 5
4.971021 6.482085 5.742130 4.652898 5.194980
The size of newdata shouldn't matter at all:
> pdat2 <- data.frame(x1 = rnorm(1, 2), x2 = rnorm(1))
> predict(lin, pdat2)
1
4.185984
> pdat3 <- data.frame(x1 = rnorm(1000, 2), x2 = rnorm(1000))
> head((p3 <- predict(lin, pdat3)))
1 2 3 4 5 6
4.076389 4.118338 4.054443 3.976399 4.055590 4.126018
> length(p3)
[1] 1000
So can you show the exact code you used plus the errors/warnings you
report and why you think the results are incorrect?
For example, automating your by-hand computation a bit shows that
predict() produces the correct result. Compare:
> ## The cbind(1, ....) bit is for the intercept...
> data.matrix(cbind(1, pdat)) %*% coef(lin)
[,1]
[1,] 4.178516
[2,] 4.101112
[3,] 4.081370
[4,] 4.146819
[5,] 4.186405
> predict(lin, pdat)
1 2 3 4 5
4.178516 4.101112 4.081370 4.146819 4.186405
HTH
G
>
> Thanks!
>
>
> 2010/6/23 Gavin Simpson <gavin.simpson at ucl.ac.uk>
> On Tue, 2010-06-22 at 23:11 -0700, cc super wrote:
> > Hi, everyone,
> >
> > Night. I have three questions about multiple linear
> regression in R.
> >
> > Q1:
> >
> > y=rnorm(10,mean=5)
> > x1=rnorm(10,mean=2)
> > x2=rnorm(10)
> > lin=lm(y~x1+x2)
> > summary(lin)
> >
> > ## In the summary, 'Residual standard error: 1.017 on 7
> degrees of freedom',
> > 1.017 is the estimate of the constance variance?
>
>
> Yes, it is sigma.
>
> Just a note, in order for the above code to yield the same
> results as
> you quote, you need a call to set.seed() to fix the pseudo
> random number
> generator.
>
> > Q2:
> >
> > beta0=lin$coefficients[1]
> > beta1=lin$coefficients[2]
> > beta2=lin$coefficients[3]
> >
> > y_hat=beta0+beta1*x1+beta2*x2
> >
> > ## Is there any built-in function in R to obtain y_hat
> directly?
>
>
> fitted(lin)
>
> Note that there are quite a few standard extractor functions
> like fitted
> available for modelling functions in R. coef() for example
> should be
> used to extract the coefficients, resid() will extract
> residuals etc.
>
> > Q3:
> >
> > If I want to apply this regression result to another
> dataset, that is, new
> > x1 and x2. Is the built-in function in 2 still available?
>
>
> It is called predict() (although if you called predict(lin)
> above
> instead of fitted(lin) it would have produced the same answer;
> the
> fitted values for the observations).
>
> One gotcha that catches people out is that in the new dataset,
> the
> variables (used in the model) must have the same names as the
> data frame
> used to fit it. So we could do:
>
> pdat <- data.frame(x1 = rnorm(10, 2), x2 = rnorm(10))
> predict(lin, pdat)
>
> to get predictions at the new values of x1 an x2.
>
> > Thank you in advance!
>
> HTH
>
> G
>
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
> %~%~%~%
> Dr. Gavin Simpson [t] +44 (0)20 7679 0522
> ECRC, UCL Geography, [f] +44 (0)20 7679 0565
> Pearson Building, [e]
> gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London [w]
> http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT. [w]
> http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
> %~%~%~%
>
>
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
More information about the R-help
mailing list