[R] Odds Ratio and Logistic Regression

Lorenzo Isella lorenzo.isella at gmail.com
Sun Dec 30 19:14:55 CET 2012


Dear All,
I am learning the ropes of logistic regression in R.
I found some interesting examples

http://bit.ly/Vq4GgX
http://bit.ly/W9fUTg
http://bit.ly/UfK73e

but I am a bit lost.
I have several questions.
1) For instance, what is the difference between

glm.out = glm(response ~ poverty + gender, family=binomial(logit),
   data=mydata)

and

glm.out = glm(response ~ poverty * gender, family=binomial(logit),
   data=mydata)
? This raises the question of when I should use the "*" or the "+" sign
when doing a logistic regression on several explanatory variables. I think
that in the latter case I am allowing for an interaction between poverty
and gender, but I would like to be sure about it.
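
A minimal sketch of how the two formulas can be compared without fitting
anything: terms() expands a formula into its term labels, so it shows which
terms each operator generates. The names f_add and f_int are made up for
this illustration; the variable names are the ones from the calls above.

```r
# Hypothetical helper objects (f_add, f_int); no data are needed,
# because terms() only parses the formula.
f_add <- response ~ poverty + gender   # main effects only
f_int <- response ~ poverty * gender   # main effects plus interaction

labels_add <- attr(terms(f_add), "term.labels")
labels_int <- attr(terms(f_int), "term.labels")

print(labels_add)   # "poverty" "gender"
print(labels_int)   # "poverty" "gender" "poverty:gender"
```

So, as far as the formula syntax goes, "poverty * gender" is shorthand for
"poverty + gender + poverty:gender", and the "poverty:gender" term is the
interaction.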

2) Consider the following snippet


glm.out = glm(response ~ poverty + gender, family=binomial(logit),
   data=mydata)

where "response" is a dichotomous variable, poverty assumes only two  
values (Above poverty line and Below poverty line) and gender assumes only  
the Male or Female values.
The command above leads to the following output
#######################################
print(summary(glm.out))
Call:
glm(formula = response ~ poverty + gender, family = binomial(logit),
     data = mydata)

Deviance Residuals:
     Min       1Q   Median       3Q      Max
-2.2094   0.4269   0.4269   0.8033   1.1911

Coefficients:
                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                 0.9656     0.1477   6.538 6.25e-11 ***
povertyBelow poverty line  -0.9978     0.3246  -3.074  0.00211 **
genderFEMALE                1.3840     0.2549   5.429 5.68e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

     Null deviance: 494.81  on 499  degrees of freedom
Residual deviance: 457.13  on 497  degrees of freedom
AIC: 463.13

Number of Fisher Scoring iterations: 4
##############################################

To calculate the odds ratios, I then do the following

exp(coef(glm.out))
              (Intercept) povertyBelow poverty line              genderFEMALE
                2.6263831                 0.3687033                 3.9909627

but here I am lost about the interpretation. For instance, what are the  
odds of a positive response for those above versus below the poverty line  
in males? In females?
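
To make the question concrete, here is a rough sketch of how the odds for
each group could be rebuilt from the coefficients printed above. It assumes
that the model is purely additive on the log-odds scale and that "Above
poverty line" and male are the reference levels (which the coefficient names
suggest, but which is an assumption here). The names b and groups are made
up for the illustration.

```r
# Coefficients copied from the summary output above (log-odds scale).
b <- c(intercept = 0.9656, below = -0.9978, female = 1.3840)

# One row per combination; 0 = reference level (assumed: above line, male).
groups <- expand.grid(below = c(0, 1), female = c(0, 1))

# Additive model: log-odds = intercept + below effect + female effect.
groups$log_odds <- b[["intercept"]] +
  b[["below"]]  * groups$below +
  b[["female"]] * groups$female

groups$odds <- exp(groups$log_odds)     # odds of a positive response
groups$prob <- plogis(groups$log_odds)  # corresponding probability

print(groups)

# Odds ratio, above versus below the poverty line; without an interaction
# term this ratio is the same for males and for females.
or_above_vs_below <- exp(-b[["below"]])
print(or_above_vs_below)
```

Under these assumptions the above/below odds ratio would not depend on
gender; only a model with the interaction term would let it differ between
males and females.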

I think that everything is there, but I cannot extract/interpret the info  
provided by R correctly.
Any help is appreciated.
Cheers

Lorenzo


