[R] glm: getting the confidence interval for an Odds Ratio, when using predict()

Mon Mar 19 23:11:13 CET 2012

On Mar 19, 2012, at 22:32 , Dominic Comtois wrote:

> Thanks for your answer, much appreciated.
> 
> This ain't trivial indeed. I worked my way through it, until I got a "non conformable arguments" error when trying to calculate the new standard error. Since I'm not following 100% what's happening, it's hard for me to figure out what I should do next.
> 
> Here's an example with simulated data: 
> 

Had helped if you had at least got it working to the same level as you had from the outset... 

> x1 <- rbinom(100,1,.5)
> x2 <- factor(round(runif(100,1,5)),labels=c("cat1","cat2","cat3","cat4","cat5"))
> outcome <- rbinom(100,1,.2)
> 
> model <- glm(outcome~x1+x2,family=binomial(logit))
> 

Needs to be model1, I suppose

> newd <- data.frame(x1=factor(c(0,1)),x2=factor(c("cat1","cat2")),outcome=c(0,0))

Two problems here: x1 is not a factor in your model, and x2 has a different level set

> M <- model.matrix(model1, data=newd)

There's a problem with my own suggestion: Apparently this construct drops unused levels, even if the level set in newd is right. This seems to do the trick for me:

M <- model.matrix(formula(model1), data=newd)

> V <- vcov(model1)
> contr <- c(-1,1) %*% M
> se <- contr %*% V %*% contr

Need t(contr) on the rightmost term 

> OR.ci <- exp(pred2 - pred1 + qnorm(c(.025,.50,.975))*se)

This would be OK, had you actually computed pred1 and pred2. Might substitute

OR.ci <- exp(contr %*% coef(model1) + qnorm(c(.025,.50,.975))*se)

> 
> Thanks for any additional hints,
> 
> Dominic C.
> 
> 
> 2012/3/19 peter dalgaard <pdalgd at gmail.com>
> 
> There's no trivial way since you need the covariance of pred2 and pred1 to calculate the variance of the difference.
> 
> I think you can proceed somewhat like as follows (I can't be bothered to test it without a reproducible example to start from. You may need to throw in a few explicit t() and as.vector() here and there.)
> 
> newd   <- data.frame(age.cat=c(1,2),male=c(1,0),lowed=c(1,0))
> M <- model.matrix(model, data=newd)
> V <- vcov(model)
> contr <- c(-1,1) %*% M
> se <- contr %*% V %*% contr
> 
> OR.ci <- exp(pred2 - pred1 + qnorm(c(.025,.50,.975))*se)
> 
> (Sanity check:  contr %*% coef(model)  should be same as  pred2 - pred1 )
> 
> I'm not sure how general the model.matrix trick is. It works in cases like
> 
> > mm <- glm(ff, data=trees)
> > model.matrix(mm, data=trees[1,])
>  (Intercept) log(Height) log(Girth)
> 1           1    4.248495   2.116256
> attr(,"assign")
> [1] 0 1 2
> 
> but I see that there are cases where a "data" argument may be ignored. If that is the case, then you may have to construct the "contr" vector by hand.
> 
> --
> Peter Dalgaard, Professor
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
> 
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com