[R] Using Predict and GLM

Bill.Venables at csiro.au Bill.Venables at csiro.au
Tue Jan 29 05:32:50 CET 2008

Your problem is that your newdata data frame only specifies what the
value for A is in the predictions.  The values for W1 and W2 are
unspecified.  To predict from fitted models, you need to specify the
values you wish to use for *all* the predictor variables, not just the
one (I presume) is different from the estimation set.

Assuming that is what you want to do, here is a version of your script
that you may care to look at fairly closely.  I have omitted all the
unnecessary bits, (I could find), put the variables in a data frame
instead of leaving them strewn around the global environment, and done a
couple of predictions just to show it works.

n0 <- 200 # number of observations

myData <- data.frame(W1 = rnorm(n0, mean = 2, sd = 2),
  W2 = rnorm(n0, mean = 3, sd = 8))
myData <- transform(myData, # add A
  A = rbinom(n0, 1, ifelse(W1 > 1.5, 0.4, 0.2)))
myData <- transform(myData, # add Y
  Y = rbinom(n0, 1, 1/(1+exp(-(10*A-5*(W1)^2+2*W2)))))

Q <- glm(Y ~ A + W1 + W2, family = binomial, data = myData)

pData <- transform(myData, A = 0) # keep W1 and W2 as they were

QA <- predict(Q, newdata = pData) # linear predictors
QR <- predict(Q, newdata = pData, type = "response") # probs

plot(QA, QR)

Bill Venables. 

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:                         +61 4 8819 4402
Home Phone:                     +61 7 3286 7700
mailto:Bill.Venables at csiro.au

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Sherri Rose
Sent: Tuesday, 29 January 2008 1:27 PM
To: r-help at r-project.org
Subject: [R] Using Predict and GLM

Dear R Help,

I read through the archives pretty extensively before sending this  
email, as it seemed there were several threads on using predict with  
GLM.  However, while my issue is similar to previous posts (cannot  
get it to predict using new data), none of the suggested fixes are  

The important bits of my code:

n0=200 #number of observations
W1=rnorm(n0,mean=2,sd=2) #Use rnorm to generate W1
W2=rnorm(n0,mean=3,sd=8) #Use rnorm to generate W1
Aprob=matrix(.2, nrow=n0, ncol=1) #generating the probability of A
#generating probability of A dependant on W1
for(i in 1:n0){
if (W1[i]>1.5) {Aprob[i]=0.4}
A=matrix(rbinom(n0, 1, Aprob), nrow=n0, ncol=1) #generating the 0/1  
Y=matrix(rbinom(n0, 1, Yprob), nrow=n0, ncol=1) #generating the 0/1  
zero=data.frame(rep(0, n0))

Q=glm(cbind(Y, 1-Y) ~ A + W1 + W2, family='binomial')
QA=predict(Q, newdata=as.data.frame(A))

I've tried many variations of the last line (Q0) to get the predicted  
values when A=0 with no luck.  With this code,  I get errors that my  
A=zero is a list even though I made it into a data frame.  This is  
the version of the code (after my reading) that *should* work for  
predict once I can get it to accept that it is not a list.

With other variants of the line that will run but are not  
syntactically correct, my QA and Q0 are the same.

Any guidance would be appreciated!


	[[alternative HTML version deleted]]

R-help at r-project.org mailing list
PLEASE do read the posting guide
and provide commented, minimal, self-contained, reproducible code.

More information about the R-help mailing list