`odds.n.ends` was created to take the results from a binary logistic regression model estimated using the `glm()` function and compute model significance, model fit, and the odds ratios and 95% confidence intervals typically reported from binary logistic regression analyses.

The small demonstration data set includes three variables. The first is a binary outcome variable (`sick`) with two values, 1 and 0, where 1 represents sick and 0 represents not sick. The two predictors are an integer representing age in years (`age`) and a three-category nominal variable showing smoking status (`smoke`).

```
# enter demo data
sick <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1,
          0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0)
age <- c(23, 25, 26, 34, 54, 46, 48, 95, 81, 42, 62, 25, 31, 49, 57, 52, 54, 63, 61, 50,
         43, 35, 26, 74, 34, 46, 43, 65, 81, 42, 62, 25, 21, 47, 51, 22, 34, 59, 26, 55)
smoke <- c('Former', 'Former', 'Former', 'Never', 'Current',
           'Current', 'Current', 'Current', 'Never', 'Former',
           'Never', 'Former', 'Current', 'Former', 'Never',
           'Current', 'Current', 'Current', 'Former', 'Never',
           'Former', 'Former', 'Former', 'Never', 'Current',
           'Current', 'Current', 'Current', 'Never', 'Former',
           'Never', 'Former', 'Current', 'Former', 'Never',
           'Current', 'Current', 'Current', 'Former', 'Never')
# create data frame
smokeData <- data.frame(sick, age, smoke)
```

The `glm()` function will be used to estimate a binary logistic regression model predicting the `sick` outcome based on `age` and `smoke`.

```
# estimate the logistic regression model object
logisticModel <- glm(formula = sick ~ age + smoke, data = smokeData,
                     na.action = na.exclude, family = binomial(logit))
# print model summary for the logistic model object
summary(object = logisticModel)
```

```
##
## Call:
## glm(formula = sick ~ age + smoke, family = binomial(logit), data = smokeData,
## na.action = na.exclude)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.0490 -0.6251 0.3009 0.6955 1.9315
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.28649 1.58753 -2.070 0.0384 *
## age 0.10442 0.03711 2.814 0.0049 **
## smokeFormer -1.12544 0.94693 -1.189 0.2346
## smokeNever -2.47194 1.25103 -1.976 0.0482 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 54.548 on 39 degrees of freedom
## Residual deviance: 37.896 on 36 degrees of freedom
## AIC: 45.896
##
## Number of Fisher Scoring iterations: 5
```

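The summary contains model coefficients, coefficient significance, and the deviance and AIC, which are measures of lack of fit of the model. This information is useful in determining which of the predictors are significant and whether the deviance (lack of fit) was reduced between a null model with no predictors and the estimated model. It does not, however, include the overall model significance, other measures of model fit, or the odds ratios and 95% confidence intervals typically reported; the `odds.n.ends()` function computes these from the model object.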

```
# open odds.n.ends package
library(package = "odds.n.ends")
# get the basics
odds.n.ends(mod = logisticModel)
```

`## Waiting for profiling to be done...`

```
## $`Logistic regression model significance`
## Chi-squared d.f. p
## 16.652 3.000 0.001
##
## $`Contingency tables (model fit): frequency predicted`
## Number observed
## Number predicted 1 0 Sum
## 1 19 4 23
## 0 4 13 17
## Sum 23 17 40
##
## $`Count R-squared (model fit): percent correctly predicted`
## [1] 80
##
## $`Model sensitivity`
## [1] 0.826087
##
## $`Model specificity`
## [1] 0.7647059
##
## $`Predictor odds ratios and 95% CI`
## OR 2.5 % 97.5 %
## (Intercept) 0.03738466 0.001102466 0.6610966
## age 1.11006273 1.041062741 1.2081565
## smokeFormer 0.32450861 0.045942281 2.0537937
## smokeNever 0.08442065 0.005379054 0.8158007
```

The results show that the model was statistically significantly better than a baseline model at explaining the outcome [\(\chi^2\)(3) = 16.652; p = .001]. The model correctly predicted 19 of those who were sick (`sick = 1`) and 13 of those who were not sick (`sick = 0`), for a total of 32 correctly predicted out of 40 (Count-\(R^2\) = .80 or 80% correctly predicted). The model was more sensitive, with 82.6% of those who were sick (the cases) correctly predicted, and less specific, with 76.5% of the members of the reference group correctly predicted. Age was a statistically significant predictor of the outcome; for every one-year increase in age, the odds of being sick increased by 11% (OR = 1.11; 95% CI: 1.04 - 1.21). There was no statistically significant difference in odds of being sick for former smokers compared to current smokers. Never smokers had 92% lower odds of being sick compared to current smokers; this decrease was statistically significant (OR = .08; 95% CI: .005 - .82).
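The fit measures and percent changes in this interpretation follow from simple arithmetic on the printed output. The sketch below reproduces them from the contingency table counts and the rounded coefficients in the model summary; all variable names are illustrative, and the values are copied by hand rather than extracted from the model object.

```r
# Cell counts copied from the contingency table in the output above
truePositives <- 19   # predicted sick, observed sick
falsePositives <- 4   # predicted sick, observed not sick
falseNegatives <- 4   # predicted not sick, observed sick
trueNegatives <- 13   # predicted not sick, observed not sick
total <- truePositives + falsePositives + falseNegatives + trueNegatives

# Proportion correctly predicted overall, among cases, and among non-cases
countR2 <- (truePositives + trueNegatives) / total
sensitivity <- truePositives / (truePositives + falseNegatives)
specificity <- trueNegatives / (trueNegatives + falsePositives)

# Rounded log-odds coefficients copied from the model summary
logOdds <- c(age = 0.10442, smokeFormer = -1.12544, smokeNever = -2.47194)

# Exponentiate to get odds ratios, then express as percent change in odds
oddsRatios <- exp(logOdds)
percentChange <- 100 * (oddsRatios - 1)

round(x = c(countR2 = countR2, sensitivity = sensitivity, specificity = specificity), digits = 3)
round(x = percentChange, digits = 1)
```

An odds ratio above 1 translates to a percent increase in odds and one below 1 to a percent decrease, which is where the 11% increase for age and the 92% decrease for never smokers come from.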

The `odds.n.ends` package has several additional options, including the ability to get an ROC curve (use option `rocPlot = TRUE`) and histograms of predicted probabilities (use option `predProbPlot = TRUE`). Colors for these plots can be set with options `color1 =` and `color2 =`. Finally, the threshold for a predicted probability being counted as a case (outcome = 1) has a default value of .5, so any predicted probability that is .5 or higher will be counted as a case, and any predicted probability below .5 will be counted as a reference group member (outcome = 0). This threshold can be adjusted using the `thresh =` argument.
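To make the threshold rule concrete, here is a small base R sketch of the classification step described above. The `classify()` helper and the probabilities are hypothetical and for illustration only; they are not part of the package and do not come from the demonstration model.

```r
# Hypothetical predicted probabilities, for illustration only
predictedProbs <- c(0.15, 0.42, 0.50, 0.63, 0.91)

# Hypothetical helper: count a probability at or above the threshold as a case (1),
# and anything below it as a reference group member (0)
classify <- function(probs, thresh = 0.5) {
  as.integer(probs >= thresh)
}

# Default threshold of .5
defaultClasses <- classify(probs = predictedProbs)

# Lower threshold of .4, which counts more observations as cases
lowerClasses <- classify(probs = predictedProbs, thresh = 0.4)

defaultClasses
lowerClasses
```

Lowering the threshold can only move observations from the reference group to the case group, so sensitivity tends to rise and specificity tends to fall. With the package loaded, the same adjustment would be requested through the `thresh =` argument, for example `odds.n.ends(mod = logisticModel, thresh = 0.4)`.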