# [R] Regression on stratified count data

Achim Zeileis Achim.Zeileis at uibk.ac.at
Wed Apr 24 13:37:10 CEST 2013

```On Wed, 24 Apr 2013, meng wrote:

> Hi all:
> For stratified count data,how to perform regression analysis?
>
> My data:
> age case oc count
> 1       1     1    21
> 1       1     2    26
> 1       2     1    17
> 1       2     2    59
> 2       1     1    18
> 2       1     2    88
> 2       2     1     7
> 2       2     2   95
>
> age:
> 1:<40y
> 2:>40y
>
> case:
> 1:patient
> 2:health
>
> oc:
> 1:use drug
> 2:not use drug
>
> My purpose:
> Anaysis whether case and oc are correlated, and age is a stratified variable.
>
> My solution:
> 1,Mantel-Haenszel test by using function "mantelhaen.test"
> 2,loglinear regression by using function glm(count~case*oc,family=poisson).But I don't know how to handle variable "age",which is the stratified variable.

Instead of using glm(family = poisson) it is also convenient to use
loglm() from package MASS for the associated convenience table.

The code below shows how to set up the contingency table, visualize it
using package vcd, and then fit two models using loglm. The models
considered are conditional independence of case and drug given age and the
"no three-way interaction" already suggested by Peter. Both models are
also accompanied by visualizations of the residuals.

## contingency table with nice labels
tab <- expand.grid(drug = 1:2, case = 1:2, age = 1:2)
tab\$count <- c(21, 26, 17, 59, 18, 88, 7, 95)
tab\$age <- factor(tab\$age, levels = 1:2, labels = c("<40", ">40"))
tab\$case <- factor(tab\$case, levels = 1:2, labels = c("patient",
"healthy"))
tab\$drug <- factor(tab\$drug, levels = 1:2, labels = c("yes", "no"))
tab <- xtabs(count ~ age + drug + case, data = tab)

## visualize case explained by drug given age
library("vcd")
mosaic(case ~ drug | age, data = tab,
split_vertical = c(TRUE, TRUE, FALSE))

## test wheter drug and case are independent given age
m1 <- loglm(~ age/(drug + case), data = tab)
m1

## visualize corresponding residuals from independence model
mosaic(case ~ drug | age, data = tab,
split_vertical = c(TRUE, TRUE, FALSE),
residuals_type = "deviance",
gp = shading_hcl, gp_args = list(interpolate = 1.2))
mosaic(case ~ drug | age, data = tab,
split_vertical = c(TRUE, TRUE, FALSE),
residuals_type = "pearson",
gp = shading_hcl, gp_args = list(interpolate = 1.2))

## test whether there is no three-way interaction
## (i.e., dependence of case on drug is the same for both age groups)
m2 <- loglm(~ (age + drug + case)^2, data = tab)
m2

## visualization (with default pearson residuals)
mosaic(case ~ drug | age, data = tab,
expected = ~ (age + drug + case)^2,
split_vertical = c(TRUE, TRUE, FALSE),
gp = shading_hcl, gp_args = list(interpolate = 1.2))

> Many thanks for your help.
>
> My best.
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

```

More information about the R-help mailing list