[R] predict.lm(...,type="terms") question
John Thaden
jjthaden at flash.net
Sun Sep 2 19:07:01 CEST 2012
Thank you all. My muddle about predict.lm(..., type = "terms") was evident even in the first sentence of my original posting:
> How can I actually use the output of
> predict.lm(..., type="terms") to predict
> new term values from new response values?
the answer being that I cannot; that new response values, if included in newdata, will simply be ignored by predict.lm, as well they should.
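For anyone following along, here is a minimal sketch of that behavior. The data are simulated and the names d, fit, nd and nd_with are made up for illustration; none of this comes from my original example. It checks that a response column in newdata makes no difference, and that type = "terms" only splits the usual prediction into per-term pieces.

```r
# Minimal sketch; simulated data and object names are illustrative assumptions.
set.seed(1)
d <- data.frame(x = 1:10)
d$y <- 2 + 3 * d$x + rnorm(10)
fit <- lm(y ~ x, data = d)

# newdata supplies the predictor x; an extra response column y is not used.
nd      <- data.frame(x = c(2.5, 7))
nd_with <- data.frame(x = c(2.5, 7), y = c(100, -100))

p1 <- predict(fit, newdata = nd)
p2 <- predict(fit, newdata = nd_with)
print(isTRUE(all.equal(p1, p2)))   # TRUE: the y values in nd_with are ignored

# type = "terms" returns one centred column per model term. Adding the
# "constant" attribute to the row sums recovers the ordinary prediction.
tt <- predict(fit, newdata = nd, type = "terms")
recombined <- rowSums(tt) + attr(tt, "constant")
print(isTRUE(all.equal(unname(recombined), unname(p1))))   # TRUE
```

So the "terms" output is a decomposition of the fitted linear predictor, not a way to solve for x given y.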
As for the calibration issue, I am reviewing literature now as suggested.
Though predict.lm performed to spec (no bug), may I suggest a minor
change to the ?predict.lm text?
Existing:
  newdata   An optional data frame in which to look for variables with
            which to predict. If omitted, the fitted values are used.
Proposed:
  newdata   An optional data frame in which to look for new values of
            terms with which to predict. If omitted, the fitted values
            are used.
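To illustrate why the wording matters, here is a small sketch (again with simulated data and made-up names such as nd_bad) of what I understand happens when newdata does not contain the predictor. As far as I can tell, the variable lookup then continues in the formula's environment, so the original x is found and predictions come back for the original 10 rows, normally with a warning about the row-count mismatch.

```r
# Minimal sketch; simulated data and object names are illustrative assumptions.
set.seed(2)
x <- rnorm(10)
y <- 1 + 2 * x + rnorm(10)
fit <- lm(y ~ x)

# nd_bad has 3 rows but no column named x, so x is not found in newdata
# and is picked up from the formula's environment instead.
nd_bad <- data.frame(z = 1:3)

# A warning about the differing row counts is expected here; it is
# suppressed only to keep the sketch quiet.
p <- suppressWarnings(predict(fit, newdata = nd_bad))

print(length(p))   # 10, not 3
print(isTRUE(all.equal(unname(p), unname(fitted(fit)))))
```

In other words, newdata is only useful if its column names match the predictors on the right-hand side of the formula.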
-John Thaden, Ph.D.
College Station, TX
--- On Sun, 9/2/12, peter dalgaard <pdalgd at gmail.com> wrote:
> From: peter dalgaard <pdalgd at gmail.com>
> Subject: Re: [R] predict.lm(...,type="terms") question
> To: "David Winsemius" <dwinsemius at comcast.net>
> Cc: "Rui Barradas" <ruipbarradas at sapo.pt>, r-help at r-project.org, "jjthaden" <jjthaden at flash.net>
> Date: Sunday, September 2, 2012, 1:35 AM
>
> On Sep 2, 2012, at 03:38 , David Winsemius wrote:
>
> >
> > Why should predict not complain when it is offered a
> newdata argument that does not contain a vector of values for
> "x"? The whole point of the terms method of prediction is to
> offer estimates for specific values of items on the RHS of
> the formula. The OP seems to have trouble understanding that
> point. Putting in a vector with the name of the LHS item
> makes no sense to me. I certainly cannot see that any
> particular behavior for this pathological input is described
> for predict.lm in its help page, but throwing an error seems
> perfectly reasonable to me.
>
> Yes. Lots of confusion going on here.
>
> First, data= is _always_ used as the _first_ place to look
> for variables, if things are not in it, search continues
> into the formula's environment. To be slightly perverse,
> notice that even this works:
>
> > y <- rnorm(10)
> > x <- rnorm(10)
> > d <- data.frame(z=rnorm(9))
> > lm(y ~ x, d)
>
> Call:
> lm(formula = y ~ x, data = d)
>
> Coefficients:
> (Intercept)            x
>     -0.2760       0.2328
>
> Secondly, what is predict(..., type="terms") supposed to
> have to do with inverting a regression equation? That's just
> not what it does, it only splits the prediction formula into
> its constituent terms.
>
> Thirdly; no, you do not invert a regression equation by
> regressing y on x. That only works if you can be sure that
> your new (x, y) are sampled from the same population as the
> data, which is not going to be the case if you are fitting
> to data with, say, selected equispaced x values. There's a
> whole literature on how to do this properly, Google e.g.
> "inverse calibration" for enlightenment.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk
> Priv: PDalgd at gmail.com