[R] glm predict on new data
Brian Diggs
diggsb at ohsu.edu
Thu Apr 7 00:28:01 CEST 2011
On 4/6/2011 2:17 PM, dirknbr wrote:
> I am aware this has been asked before but I could not find a resolution.
>
> I am doing a logit
>
> lg<- glm(y[1:200] ~ x[1:200,1],family=binomial)
glm (and most modeling functions) are designed to work with data frames,
not raw vectors.
> Then I want to predict a new set
>
> pred<- predict(lg,x[201:250,1],type="response")
>
> But I get varying error messages or warnings about the different number of
> rows. I have tried data/newdata and also to wrap in data.frame() but cannot
> get to work.
I'll make up some data, show the way you approached it, show where it
went wrong, and then show an easier way that works.
# data like what I think you had:
y <- rbinom(200, 1, prob=.8)
x <- data.frame(x=rnorm(250))
# your glm call:
lg <- glm(y[1:200]~x[1:200,1],family=binomial)
# Take a look at print(lg). Notice that your independent variable
# is named "x[1:200, 1]", which is the name predict() would need to
# find in newdata. That is why wrapping the new values in
# data.frame() alone did not help.
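To make that concrete, here is a minimal sketch under the same made-up data (the seed and the names lg_raw, p_bad and z are only illustrative assumptions, not anything from your code). It prints the coefficient names of the indexed fit, then calls predict() with a newdata frame that does not contain the variable the formula refers to. In that case R should fall back to the x it finds in the workspace, which would explain the warning about differing numbers of rows.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
y <- rbinom(200, 1, prob = 0.8)
x <- data.frame(x = rnorm(250))

# Fit with indexing inside the formula, as in the original call.
lg_raw <- glm(y[1:200] ~ x[1:200, 1], family = binomial)

# The predictor is stored under the deparsed expression, not under "x".
print(names(coef(lg_raw)))

# newdata has a column z, so the term x[1:200, 1] is looked up in the
# workspace instead and evaluated on the original 200 training rows.
p_bad <- suppressWarnings(
  predict(lg_raw, newdata = data.frame(z = x[201:250, 1]), type = "response")
)
print(length(p_bad))                          # compare with the 50 rows requested
```

The number of predictions follows the rows the formula's variables evaluate to, not the rows of newdata, so the fix is to fit on plain column names.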
# Make data.frames of the given and testing data.
DF <- data.frame(y=y, x=x[1:200,1])
DF.new <- data.frame(x=x[201:250,1])
# Notice DF.new uses the same column name (x) as DF.
lg <- glm(y~x, data=DF, family=binomial)
pred <- predict(lg, newdata=DF.new, type="response")
summary(pred)
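As a quick sanity check on the data-frame approach, the sketch below (again on made-up data, with hypothetical names dat, train, test and fit) confirms that predict() returns one value per row of newdata and that type="response" gives the inverse-logit of the linear predictor, i.e. probabilities between 0 and 1.

```r
set.seed(42)                                  # arbitrary seed, for reproducibility only
dat <- data.frame(x = rnorm(250))
dat$y <- c(rbinom(200, 1, prob = 0.8), rep(NA, 50))

train <- dat[1:200, ]                         # rows with an observed outcome
test  <- dat[201:250, "x", drop = FALSE]      # new rows, same column name x

fit  <- glm(y ~ x, data = train, family = binomial)
pred <- predict(fit, newdata = test, type = "response")

# One prediction per new row.
stopifnot(length(pred) == nrow(test))

# type = "response" is the inverse logit of intercept + slope * x.
b <- coef(fit)
manual <- plogis(b[[1]] + b[[2]] * test$x)
stopifnot(isTRUE(all.equal(unname(pred), manual)))
stopifnot(all(pred >= 0 & pred <= 1))
```

Keeping the training and new rows in data frames that share column names is what lets predict() match the variables up.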
> Help would be appreciated.
>
> Dirk.
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University