geeCRT package

Hengshi Yu, Fan Li, Paul Rathouz, Elizabeth L. Turner, John Preisser

2024-02-18

Overview

geeCRT is an R package for implementing the bias-corrected generalized estimating equations in analyzing cluster randomized trials.

Population-averaged models have been increasingly used in the design and analysis of cluster randomized trials (CRTs). To facilitate the applications of population-averaged models in CRTs, we implement the generalized estimating equations (GEE) and matrix-adjusted estimating equations (MAEE) approaches to jointly estimate the marginal mean models correlation models both for general CRTs and stepped wedge CRTs.

Despite the general GEE/MAEE approach, we also implement a fast cluster-period GEE method specifically for stepped wedge CRTs with large and variable cluster-period sizes. The individual-level GEE/MAEE approach becomes computationally infeasible in this setting due to inversion of high-dimensional covariance matrices and the enumeration of a high-dimensional design matrix for the correlation estimation equations. The package gives a simple and efficient estimating equations approach based on the cluster-period means to estimate the intervention effects as well as correlation parameters.

In addition, the package also provides functions for generating correlated binary data with specific mean vector and correlation matrix based on the multivariate probit method (Emrich and Piedmonte 1991) or the conditional linear family method (Qaqish 2003). These two functions facilitate generating correlated binary data in future simulation studies.

geemaee() examples: matrix-adjusted GEE for estimating the mean and correlation parameters in SW-CRTs

The geemaee() function implements the matrix-adjusted GEE or regular GEE developed for analyzing cluster randomized trials (CRTs). It provides valid estimation and inference for the treatment effect and intraclass correlation parameters within the population-averaged modeling framework. The program allows for flexible marginal mean model specifications. The program also offers bias-corrected intraclass correlation coefficient (ICC) estimates as well as bias-corrected sandwich variances for both the treatment effect parameter and the ICC parameters. The technical details of the matrix-adjusted GEE approach are provided in (Preisser, Lu, and Qaqish 2008) and (Li, Turner, and Preisser 2018).

For the individual-level data, we use the geemaee() function to estimate the marginal mean and correlation parameters in CRTs. We use two simulated stepped wedge CRT datasets with true nested exchangeable correlation structure to illustrate the geemaee() function examples.

Simulated dataset with 12 clusters and 4 periods

period1 period2 period3 period4 treatment id period y_bin y_con
1 0 0 0 0 1 1 0 -0.5728721
1 0 0 0 0 1 1 0 1.0462372
1 0 0 0 0 1 1 1 0.6033255
1 0 0 0 0 1 1 0 -0.8680703
1 0 0 0 0 1 1 0 -1.8845704
0 1 0 0 1 1 2 0 -0.5572751

We first create an auxiliary function createzCrossSec() to help create the design matrix for the estimating equations of the correlation parameters assuming that the correlations follow a nested exchangeable structure. We then collect design matrix X for the mean parameters with five period indicators and the treatment indicator.

# Z matrix for nested exchangeable correlation structure
createzCrossSec <- function(m) {
  Z <- NULL
  n <- dim(m)[1]

  for (i in 1:n) {
    alpha_0 <- 1
    alpha_1 <- 2
    n_i <- c(m[i, ])
    n_length <- length(n_i)
    POS <- matrix(alpha_1, sum(n_i), sum(n_i))
    loc1 <- 0
    loc2 <- 0

    for (s in 1:n_length) {
      n_t <- n_i[s]
      loc1 <- loc2 + 1
      loc2 <- loc1 + n_t - 1

      for (k in loc1:loc2) {
        for (j in loc1:loc2) {
          if (k != j) {
            POS[k, j] <- alpha_0
          } else {
            POS[k, j] <- 0
          }
        }
      }
    }

    zrow <- diag(2)
    z_c <- NULL

    for (j in 1:(sum(n_i) - 1)) {
      for (k in (j + 1):sum(n_i)) {
        z_c <- rbind(z_c, zrow[POS[j, k], ])
      }
    }

    Z <- rbind(Z, z_c)
  }

  return(Z)
}

We implement the geemaee() function on continuous, binary and count outcomes using the nested exchangeable correlation structure, and consider both matrix-adjusted estimating equations (MAEE) with alpadj = TRUE and uncorrected generalized estimating equations (GEE) with alpadj = FALSE. We use the binary outcome for the count outcome type of the geemaee() function, and consider both poisson and quasipoisson distributions for the count outcome. For the shrink argument, we use the "ALPHA" method to tune step sizes and focus on using estimated variances in the correlation estimating equations rather than using unit variances by specifying makevone = FALSE.

########################################################################
### Example 1): simulated SW-CRT with small cluster-period sizes (5~10)
########################################################################
sampleSWCRT <- sampleSWCRTSmall

### Individual-level id, period, outcome, and design matrix
id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period1", "period2", "period3", "period4", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]

### design matrix for correlation parameters
Z <- createzCrossSec(m)

### (1) Matrix-adjusted estimating equations and GEE
### on continuous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)

### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)


### (3) Matrix-adjusted estimating equations and GEE
### on count outcome with nested exchangeable correlation structure
### using Poisson distribution

### MAEE
est_maee_ind_cnt_poisson <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "poisson",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_cnt_poisson <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "poisson",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)


### (4) Matrix-adjusted estimating equations and GEE
### on count outcome with nested exchangeable correlation structure
### using Quasi-Poisson distribution

### MAEE
est_maee_ind_cnt_quasipoisson <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "quasipoisson",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_cnt_quasipoisson <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "quasipoisson",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)

Then we have the following output for continuous and binary outcomes:

# MAEE for continuous outcome
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.1563400 0.1233724 0.11343910  0.1185200  0.1238356  0.1185676
## [2,]    1  0.2598417 0.1396584 0.15564306  0.1665805  0.1783248  0.1655108
## [3,]    2  0.2750434 0.1744216 0.09769357  0.1056224  0.1143643  0.1037000
## [4,]    3  0.3265847 0.2269849 0.20527795  0.2202021  0.2363784  0.2191884
## [5,]    4 -0.1846654 0.1916254 0.14636555  0.1592412  0.1733169  0.1587060
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.07335831  0.0385460 0.04035981 0.04226387 0.04042324
## [2,]     1 0.03210581  0.0160329 0.01683282 0.01767569 0.01684501
# GEE for continuous outcome
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.1558990 0.1174052 0.11335326  0.1184565  0.1237977  0.1185069
## [2,]    1  0.2600341 0.1331291 0.15660456  0.1676591  0.1795345  0.1666850
## [3,]    2  0.2744039 0.1658703 0.09732366  0.1052335  0.1139553  0.1035272
## [4,]    3  0.3294253 0.2163095 0.20723038  0.2223924  0.2388282  0.2219057
## [5,]    4 -0.1857484 0.1828737 0.14726062  0.1602631  0.1744810  0.1599123
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.05304519 0.03055184 0.03200002 0.03352079 0.03204680
## [2,]     1 0.03019984 0.01459571 0.01532844 0.01610073 0.01533637
# MAEE for binary outcome
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 7 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7731496 0.2861811  0.2397046  0.2507701  0.2623524  0.2528516
## [2,]    1 -0.5067925 0.3163229  0.3125495  0.3364462  0.3626966  0.3340832
## [3,]    2 -0.6159656 0.4107850  0.5793391  0.6398733  0.7076154  0.6341025
## [4,]    3 -0.5932682 0.5956314  0.5498687  0.6043812  0.6650312  0.6087172
## [5,]    4 -0.9959081 0.4904630  0.5373051  0.5918909  0.6525862  0.5889413
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.09615638 0.03212815 0.03375024 0.03546144 0.03373575
## [2,]     1 0.03369215 0.03077853 0.03223000 0.03375486 0.03224127
# GEE for binary outcome
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7716253 0.2693959  0.2404336  0.2516128  0.2633193  0.2537330
## [2,]    1 -0.5040824 0.2983658  0.3111895  0.3349180  0.3609889  0.3328046
## [3,]    2 -0.6329007 0.3872258  0.5796856  0.6406367  0.7089294  0.6350176
## [4,]    3 -0.5905097 0.5625779  0.5551809  0.6102428  0.6715582  0.6149354
## [5,]    4 -0.9942493 0.4649278  0.5404357  0.5954360  0.6566581  0.5926395
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.06909291 0.02771682 0.02909958 0.03055802 0.02908517
## [2,]     1 0.03017185 0.02749985 0.02879600 0.03015777 0.02880640

In addition, we have the following output for count outcomes using Poisson and Quasi-Poisson distributions:

# MAEE for count outcome using Poisson distribution
print(est_maee_ind_cnt_poisson)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -1.1521619 0.3489214  0.1673871  0.1753755  0.1837532  0.1758285
## [2,]    1 -0.9604021 0.3518871  0.1762732  0.1890095  0.2028242  0.1881040
## [3,]    2 -1.1052655 0.5223114  0.4286854  0.4823239  0.5434271  0.4806275
## [4,]    3 -1.0109748 1.0115124  0.4402988  0.4845222  0.5343777  0.4839782
## [5,]    4 -0.7592537 0.8089485  0.4209105  0.4648193  0.5143975  0.4635516
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.06817046 0.02179020 0.02284202 0.02395004 0.02281778
## [2,]     1 0.02099757 0.02218181 0.02321486 0.02429979 0.02321545
# GEE for count outcome using Poisson distribution
print(est_uee_ind_cnt_poisson)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -1.1520612 0.3422800  0.1677249  0.1757617  0.1841923  0.1761798
## [2,]    1 -0.9599799 0.3443619  0.1757498  0.1884550  0.2022358  0.1875591
## [3,]    2 -1.1098392 0.5139263  0.4302872  0.4843235  0.5459324  0.4826701
## [4,]    3 -1.0107860 1.0016898  0.4414726  0.4858718  0.5359647  0.4852324
## [5,]    4 -0.7590982 0.8016988  0.4222038  0.4663158  0.5161605  0.4650724
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.04900710 0.01927730 0.02019439 0.02116040 0.02017570
## [2,]     1 0.01868904 0.01986653 0.02078916 0.02175808 0.02078932
# MAEE for count outcome using Quasi-Poisson distribution
print(est_maee_ind_cnt_quasipoisson)
## GEE for correlated data 
##  Number of Iterations: 8 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -1.1519918 0.3131953  0.1667100  0.1746142  0.1829002  0.1752801
## [2,]    1 -0.9589155 0.3160741  0.1767755  0.1895442  0.2033914  0.1887524
## [3,]    2 -1.1016110 0.4675363  0.4245096  0.4772228  0.5371973  0.4754787
## [4,]    3 -1.0117241 0.8967909  0.4369452  0.4806150  0.5297870  0.4802468
## [5,]    4 -0.7575391 0.7177545  0.4163786  0.4595912  0.5083364  0.4578985
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.09147181 0.03137139 0.03294759 0.03461028 0.03292827
## [2,]     1 0.03231153 0.02918904 0.03056169 0.03200342 0.03056347
# GEE for count outcome using Quasi-Poisson distribution
print(est_uee_ind_cnt_quasipoisson)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -1.1518131 0.3055318  0.1671095  0.1750714  0.1834202  0.1756915
## [2,]    1 -0.9582190 0.3073944  0.1760662  0.1887880  0.2025830  0.1880073
## [3,]    2 -1.1073986 0.4578675  0.4265375  0.4797512  0.5403581  0.4780556
## [4,]    3 -1.0112083 0.8856530  0.4384607  0.4823527  0.5318224  0.4818385
## [5,]    4 -0.7575221 0.7095470  0.4179946  0.4614577  0.5105308  0.4597734
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.06509981 0.02703596 0.02837863 0.02979514 0.02836241
## [2,]     1 0.02868138 0.02603577 0.02725574 0.02853701 0.02725674

We also consider the simple exchangeable correlation structure for the cluster-period responses, where we specify the Z matrix as a single column of 1’s. We implement the geemaee() function on both the continuous outcome and binary outcome, and consider both matrix-adjusted estimating equations (MAEE) with alpadj = TRUE and uncorrected generalized estimating equations (GEE) with alpadj = FALSE. For the shrink argument, we use the "ALPHA" method to tune step sizes and focus on using estimated variances in the correlation estimating equations rather than using unit variances by specifying makevone = FALSE.

sampleSWCRT <- sampleSWCRTSmall

### Individual-level id, period, outcome, and design matrix
id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period1", "period2", "period3", "period4", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]

### design matrix for correlation parameters (simple exchangeable correlation structure)
Z <- createzCrossSec(m)
Z <- matrix(1, dim(Z)[1], 1)

### (1) Matrix-adjusted estimating equations and GEE
### on continuous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)

### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)


### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)

Then we have the following output:

# MAEE for continuous outcome
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.1546654 0.1138036  0.1132300  0.1183400  0.1236901  0.1183735
## [2,]    1  0.2575634 0.1286791  0.1577178  0.1688563  0.1808451  0.1686030
## [3,]    2  0.2655184 0.1588560  0.1005199  0.1085032  0.1172738  0.1091556
## [4,]    3  0.3254756 0.2066790  0.2188951  0.2353741  0.2532413  0.2385154
## [5,]    4 -0.1803555 0.1738138  0.1555551  0.1694989  0.1847540  0.1700911
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.04161855 0.01626681 0.01707934   0.017935 0.01710478
# GEE for continuous outcome
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.1548400 0.1119522  0.1132388  0.1183621  0.1237270  0.1184054
## [2,]    1  0.2586067 0.1269062  0.1578532  0.1690319  0.1810577  0.1685156
## [3,]    2  0.2685238 0.1570543  0.0989460  0.1068758  0.1155986  0.1066848
## [4,]    3  0.3288218 0.2048429  0.2153506  0.2314405  0.2488849  0.2334599
## [5,]    4 -0.1832029 0.1728612  0.1526576  0.1662888  0.1812008  0.1665749
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.03563676  0.0145224 0.01524472 0.01600543 0.01526096
# MAEE for binary outcome
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7680520 0.2551979  0.2402408  0.2515042  0.2633056  0.2560060
## [2,]    1 -0.4922456 0.2801734  0.3139606  0.3377667  0.3639014  0.3395136
## [3,]    2 -0.6703178 0.3598533  0.5664151  0.6258843  0.6926011  0.6241206
## [4,]    3 -0.5918583 0.5167021  0.5636072  0.6189357  0.6805468  0.6320340
## [5,]    4 -0.9783470 0.4282247  0.5386478  0.5930714  0.6537034  0.5930284
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.04820458 0.02878404 0.03018486  0.0316581 0.03021869
# GEE for binary outcome
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7682900 0.2492772  0.2407543  0.2520694  0.2639264  0.2557018
## [2,]    1 -0.4935666 0.2750558  0.3121951  0.3358868  0.3619068  0.3364044
## [3,]    2 -0.6710571 0.3546461  0.5707372  0.6310024  0.6986599  0.6278109
## [4,]    3 -0.5894580 0.5122764  0.5645206  0.6201814  0.6822145  0.6305004
## [5,]    4 -0.9820253 0.4257111  0.5412118  0.5961231  0.6573418  0.5951026
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.03937419 0.02544412 0.02668884 0.02799821 0.02670961

Simulated dataset with 12 clusters and 5 periods

period1 period2 period3 period4 period5 treatment id period y_bin y_con
1 0 0 0 0 0 1 1 1 0.6789756
1 0 0 0 0 0 1 1 1 0.3776961
1 0 0 0 0 0 1 1 1 -0.5519475
1 0 0 0 0 0 1 1 1 -1.3102658
1 0 0 0 0 0 1 1 1 0.2654200
1 0 0 0 0 0 1 1 1 -0.3327958

Using the nested exchangeable correlation structure, we have the following results.

########################################################################
### Example 2): simulated SW-CRT with larger cluster-period sizes (20~30)
########################################################################

## This will elapse longer.
sampleSWCRT <- sampleSWCRTLarge

### Individual-level id, period, outcome, and design matrix
id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period1", "period2", "period3", "period4", "period5", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]
### design matrix for correlation parameters
Z <- createzCrossSec(m)

### (1) Matrix-adjusted estimating equations and GEE
### on continous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.19231933 0.1048518 0.13134291 0.13722549 0.14337189 0.13681137
## [2,]    1  0.13855318 0.1089971 0.08559214 0.09070281 0.09631631 0.08746208
## [3,]    2  0.14135330 0.1246032 0.09471802 0.10092099 0.10769442 0.10114962
## [4,]    3  0.20421027 0.1452583 0.10146597 0.11068698 0.12096977 0.11160100
## [5,]    4  0.24716971 0.1684305 0.17177784 0.18728285 0.20433337 0.18746260
## [6,]    5 -0.08406266 0.1322349 0.13858326 0.15310151 0.16917367 0.15161045
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.09465688 0.02442148 0.02594067 0.02764751 0.02572805
## [2,]     1 0.03469533 0.01939519 0.02047886 0.02163615 0.02033902
### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.19190139 0.0996732 0.13145806  0.1373518  0.1435101  0.1369554
## [2,]    1  0.13814149 0.1035213 0.08569554  0.0908225  0.0964508  0.0875894
## [3,]    2  0.14064776 0.1185910 0.09464091  0.1008293  0.1075822  0.1010819
## [4,]    3  0.20346170 0.1383136 0.10102922  0.1101788  0.1203834  0.1111059
## [5,]    4  0.24633487 0.1602727 0.17211627  0.1876575  0.2047488  0.1878631
## [6,]    5 -0.08318282 0.1259570 0.13859217  0.1531209  0.1692063  0.1516836
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.08146407 0.02277749 0.02412259 0.02563971 0.02397541
## [2,]     1 0.03076420 0.01819528 0.01919418 0.02026185 0.01907859
### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 7 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03457213 0.2241538  0.2392075  0.2498653  0.2610001  0.2473707
## [2,]    1 -0.34208716 0.2377506  0.2099319  0.2248003  0.2408185  0.2234759
## [3,]    2 -0.67675480 0.2823888  0.2743190  0.2961203  0.3199988  0.2912959
## [4,]    3  0.02997130 0.3119529  0.2000433  0.2183246  0.2399443  0.2115277
## [5,]    4 -0.42930737 0.3863814  0.1803455  0.1973792  0.2176510  0.1991261
## [6,]    5 -0.81816822 0.2919980  0.2800864  0.3069737  0.3365978  0.3032212
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.11496936 0.04117568 0.04294094 0.04478275 0.04281833
## [2,]     1 0.06242691 0.03481106 0.03630878 0.03787122 0.03624330
### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03532021 0.2130112  0.2391542  0.2498171  0.2609580  0.2472393
## [2,]    1 -0.34154512 0.2256323  0.2100181  0.2248984  0.2409307  0.2236170
## [3,]    2 -0.67644339 0.2683572  0.2745740  0.2964326  0.3203797  0.2916198
## [4,]    3  0.03090417 0.2963087  0.2002704  0.2186117  0.2403231  0.2117340
## [5,]    4 -0.43011798 0.3665889  0.1798608  0.1969091  0.2172283  0.1986538
## [6,]    5 -0.81865546 0.2768705  0.2801134  0.3070590  0.3367553  0.3032859
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.09974917 0.03709677 0.03868651 0.04034505 0.03857485
## [2,]     1 0.05686766 0.03128090 0.03262829 0.03403398 0.03256722

Using the simple exchangeable correlation structure, we have the following results.

sampleSWCRT <- sampleSWCRTLarge

### Individual-level id, period, outcome, and design matrix
id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period1", "period2", "period3", "period4", "period5", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]
### design matrix for correlation parameters
Z <- createzCrossSec(m)
Z <- matrix(1, dim(Z)[1], 1)

### (1) Matrix-adjusted estimating equations and GEE
### on continous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta    Estimate  MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.18627032 0.08414604 0.13279197 0.13883272 0.14515097 0.13737503
## [2,]    1  0.13815534 0.08568755 0.08728350 0.09253851 0.09827924 0.08831436
## [3,]    2  0.14554680 0.09591310 0.09561107 0.10175525 0.10851515 0.10198254
## [4,]    3  0.20689929 0.10841530 0.09868192 0.10735199 0.11704766 0.10988201
## [5,]    4  0.25610889 0.12165483 0.17022286 0.18525460 0.20174120 0.19006405
## [6,]    5 -0.09300219 0.08850272 0.13636987 0.15009052 0.16522319 0.14914568
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.04611516  0.0213304 0.02261646  0.0239952  0.0222351
### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta    Estimate  MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.18623550 0.08157593 0.13280807 0.13885009 0.14516968 0.13758490
## [2,]    1  0.13775315 0.08314994 0.08726205 0.09252299 0.09826906 0.08851174
## [3,]    2  0.14459742 0.09363651 0.09541116 0.10153431 0.10826320 0.10169514
## [4,]    3  0.20603548 0.10636160 0.09830339 0.10692622 0.11656789 0.10913963
## [5,]    4  0.25474380 0.11977319 0.17041119 0.18547805 0.20200601 0.18958752
## [6,]    5 -0.09160689 0.08834620 0.13623821 0.14998489 0.16515087 0.14894388
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.0408165 0.02379864 0.02536501 0.02705457  0.0247663
### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03973392 0.1901734  0.2375264  0.2482506  0.2594763  0.2402294
## [2,]    1 -0.33441396 0.1969079  0.2191302  0.2343650  0.2508029  0.2379983
## [3,]    2 -0.67365821 0.2291302  0.2926954  0.3160152  0.3416383  0.3157696
## [4,]    3  0.04551938 0.2377877  0.2306274  0.2508583  0.2745651  0.2450061
## [5,]    4 -0.43406855 0.2864244  0.1708418  0.1874952  0.2077659  0.1954336
## [6,]    5 -0.82474970 0.1978452  0.2759903  0.3032872  0.3334597  0.3001580
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.07132474 0.03467243 0.03613621 0.03766221 0.03596639
### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.04019172 0.1849222  0.2377030  0.2484296  0.2596563  0.2411474
## [2,]    1 -0.33482763 0.1916991  0.2182129  0.2334232  0.2498335  0.2363416
## [3,]    2 -0.67397738 0.2240146  0.2909303  0.3141572  0.3396799  0.3131966
## [4,]    3  0.04476860 0.2342217  0.2278055  0.2479068  0.2715079  0.2418171
## [5,]    4 -0.43477643 0.2828052  0.1707430  0.1874088  0.2076964  0.1944779
## [6,]    5 -0.82448172 0.1978906  0.2761830  0.3034841  0.3336617  0.3003341
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.06511568 0.03162665 0.03296549 0.03436137 0.03282115

geemaee() examples: matrix-adjusted GEE for estimating the mean and correlation parameters in SW-CRTs when some clusters had only 1 observation

We can use the geemaee() function for SW-CRTs when some clusters had only one observation.

Simulated dataset with 12 clusters and small cluster sizes

## let the cluster 5 and cluster 10 be with only 1 observation
sampleSWCRT <- sampleSWCRTSmall[sampleSWCRTSmall$id != 5 & sampleSWCRTSmall$id != 10, ]
sampleSWCRT5 <- sampleSWCRTSmall[sampleSWCRTSmall$id == 5, ][1, ]
sampleSWCRT10 <- sampleSWCRTSmall[sampleSWCRTSmall$id == 10, ][1, ]
sampleSWCRT <- rbind(sampleSWCRT, sampleSWCRT5)
sampleSWCRT <- rbind(sampleSWCRT, sampleSWCRT10)
sampleSWCRT <- sampleSWCRT[order(sampleSWCRT$id), ]

id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period1", "period2", "period3", "period4", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]

## function to generate Z matrix (ignore the clusters with only 1 observation)
createzCrossSecIgnoreOneObsCount <- function(m) {
  Z <- NULL
  n <- dim(m)[1]

  for (i in 1:n) {
    alpha_0 <- 1
    alpha_1 <- 2
    n_i <- c(m[i, ])
    n_i <- c(n_i[n_i > 0])
    n_length <- length(n_i)
    POS <- matrix(alpha_1, sum(n_i), sum(n_i))
    loc1 <- 0
    loc2 <- 0

    for (s in 1:n_length) {
      n_t <- n_i[s]
      loc1 <- loc2 + 1
      loc2 <- loc1 + n_t - 1

      for (k in loc1:loc2) {
        for (j in loc1:loc2) {
          if (k != j) {
            POS[k, j] <- alpha_0
          } else {
            POS[k, j] <- 0
          }
        }
      }
    }

    zrow <- diag(2)
    z_c <- NULL

    if (sum(n_i) > 1) {
      for (j in 1:(sum(n_i) - 1)) {
        for (k in (j + 1):sum(n_i)) {
          z_c <- rbind(z_c, zrow[POS[j, k], ])
        }
      }
    }

    Z <- rbind(Z, z_c)
  }

  return(Z)
}
Z <- createzCrossSecIgnoreOneObsCount(m)

### (1) Matrix-adjusted estimating equations and GEE
### on continuous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)

### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)

Then we have the following output:

# MAEE for continuous outcome
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 7 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.1348986 0.1291859  0.1697842  0.1826380  0.1965769  0.1798168
## [2,]    1  0.2713850 0.1746512  0.2001488  0.2195470  0.2408760  0.2172481
## [3,]    2  0.2242936 0.2143205  0.1243187  0.1373906  0.1523610  0.1347270
## [4,]    3  0.3421334 0.2696461  0.2414682  0.2634955  0.2879108  0.2629327
## [5,]    4 -0.1824207 0.2269258  0.1775156  0.1973096  0.2194501  0.1954393
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.11814099  2.1459619  2.3092102  2.4863313  2.2717905
## [2,]     1 0.04415164  0.6888222  0.7410199  0.7976414  0.7290764
# GEE for continuous outcome
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.1377223 0.1237580  0.1615739  0.1722903  0.1837689  0.1702791
## [2,]    1  0.2723025 0.1649076  0.2018461  0.2215163  0.2431605  0.2194440
## [3,]    2  0.2235232 0.2018022  0.1238738  0.1368429  0.1516911  0.1346751
## [4,]    3  0.3454827 0.2542602  0.2435906  0.2659183  0.2906703  0.2662424
## [5,]    4 -0.1829646 0.2143018  0.1784264  0.1984112  0.2207730  0.1968959
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.08866912  0.7162528  0.7860288  0.8628203  0.7786015
## [2,]     1 0.04087645  0.3630319  0.3979786  0.4364026  0.3943828
# MAEE for binary outcome
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 9 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7177697 0.3243579  0.2504724  0.2626788  0.2755610  0.2670747
## [2,]    1 -0.5057164 0.3866767  0.4184022  0.4634919  0.5148205  0.4598584
## [3,]    2 -0.7507158 0.5041838  0.7566394  0.8607488  0.9814278  0.8526778
## [4,]    3 -0.4215410 0.6903277  0.6457526  0.7317282  0.8307258  0.7387263
## [5,]    4 -1.1000030 0.5788407  0.6358341  0.7220682  0.8212615  0.7152954
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.12872351 0.04214012 0.04501392 0.04811026 0.04504987
## [2,]     1 0.04998133 0.03776135 0.04003308 0.04245568 0.04005637
# GEE for binary outcome
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7188014 0.3023277  0.2496582  0.2622189  0.2754855  0.2667267
## [2,]    1 -0.5039354 0.3602724  0.4176725  0.4625818  0.5137026  0.4594518
## [3,]    2 -0.7763118 0.4698037  0.7586941  0.8634678  0.9850709  0.8557903
## [4,]    3 -0.4189338 0.6440289  0.6529553  0.7396359  0.8395257  0.7474897
## [5,]    4 -1.0977160 0.5424980  0.6395300  0.7262417  0.8260927  0.7200144
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.09112146 0.03407652 0.03633943 0.03877574 0.03634703
## [2,]     1 0.04312784 0.03280064 0.03476699 0.03686398 0.03478916

Simulated dataset with 12 clusters and large cluster sizes

## let the cluster 2, 3 be with only 1 observation
sampleSWCRT <- sampleSWCRTLarge[sampleSWCRTLarge$id != 2 & sampleSWCRTLarge$id != 3, ]
sampleSWCRT2 <- sampleSWCRTLarge[sampleSWCRTLarge$id == 2 & sampleSWCRTLarge$period == 4, ][1, ]
sampleSWCRT3 <- sampleSWCRTLarge[sampleSWCRTLarge$id == 3 & sampleSWCRTLarge$period == 5, ][1, ]
sampleSWCRT <- rbind(sampleSWCRT, sampleSWCRT2)
sampleSWCRT <- rbind(sampleSWCRT, sampleSWCRT3)
sampleSWCRT <- sampleSWCRT[order(sampleSWCRT$id, sampleSWCRT$period), ]

id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period1", "period2", "period3", "period4", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]
Z <- createzCrossSecIgnoreOneObsCount(m)

### (1) Matrix-adjusted estimating equations and GEE
### on continous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta    Estimate  MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.13025902 0.10414063 0.14548511 0.15353994 0.16204526 0.15345862
## [2,]    1  0.12125392 0.10194268 0.08757724 0.09236968 0.09743277 0.09233367
## [3,]    2 -0.01152763 0.10473717 0.07926497 0.08413526 0.08937669 0.08354342
## [4,]    3  0.05478654 0.10981242 0.08594124 0.09157935 0.09765350 0.09060846
## [5,]    4  0.12204008 0.08165944 0.06041418 0.06465424 0.06933952 0.06347192
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.07157622 0.01991177 0.02113983 0.02245462 0.02103462
## [2,]     1 0.02153243 0.01752621 0.01849796 0.01952503 0.01850074
### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta    Estimate  MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.12957321 0.09939525 0.14566793 0.15374768 0.16228061 0.15366905
## [2,]    1  0.12114456 0.09713322 0.08772912 0.09254067 0.09762440 0.09250431
## [3,]    2 -0.01223224 0.09990992 0.07909036 0.08393932 0.08915773 0.08334936
## [4,]    3  0.05415555 0.10491006 0.08545354 0.09102402 0.09702424 0.09006645
## [5,]    4  0.12265071 0.07812056 0.06039500 0.06464649 0.06934418 0.06347013
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.06145065 0.04021457 0.04271976 0.04541452 0.04241073
## [2,]     1 0.01997608 0.02157226 0.02280360 0.02411449 0.02274367
### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 7 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.1332629 0.2420923  0.2798308  0.2962875  0.3137376  0.2943275
## [2,]    1 -0.1804980 0.2392136  0.2110722  0.2241744  0.2381715  0.2225689
## [3,]    2 -0.5076878 0.2584549  0.2920199  0.3125656  0.3347155  0.3087142
## [4,]    3  0.3840753 0.2504374  0.1895220  0.2003438  0.2121205  0.1992634
## [5,]    4 -1.0262224 0.2207822  0.2506467  0.2692117  0.2893322  0.2650143
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1169820 0.04023116 0.04231185 0.04450122 0.04223456
## [2,]     1 0.0568873 0.03534790 0.03723443 0.03922241 0.03719636
### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.1342372 0.2291780  0.2799810  0.2964751  0.3139675  0.2944929
## [2,]    1 -0.1786338 0.2261284  0.2107070  0.2237640  0.2377126  0.2221467
## [3,]    2 -0.5060056 0.2446274  0.2918120  0.3123570  0.3345099  0.3084656
## [4,]    3  0.3879678 0.2371757  0.1886817  0.1994177  0.2111038  0.1983400
## [5,]    4 -1.0247733 0.2089581  0.2503983  0.2689825  0.2891230  0.2647720
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.10039349 0.03525647 0.03708828 0.03901615 0.03702046
## [2,]     1 0.05101663 0.03088627 0.03254034 0.03428364 0.03250674

geemaee() examples: matrix-adjusted GEE for estimating the mean and correlation parameters in general CRTs

We can also use the geemaee() function for general CRTs with only one time period. We need to specify a single column of 1’s for the Z matrix.

Simulated dataset with 12 clusters and small cluster sizes

### use only the second period data to get a general CRT data
sampleSWCRT <- sampleSWCRTSmall[sampleSWCRTSmall$period == 2, ]

### Individual-level id, period, outcome, and design matrix
id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period2", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]

### design matrix for correlation parameters
Z <- createzCrossSec(m)
Z <- matrix(1, dim(Z)[1], 1)

### (1) Matrix-adjusted estimating equations and GEE
### on continuous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)

### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)

### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)

Then we have the following output:

# MAEE for continuous outcome
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 8 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.2528124 0.1764069  0.1665441  0.1786009  0.1915347  0.1788929
## [2,]    1 -0.1552832 0.3252919  0.2292350  0.2550014  0.2844740  0.2623835
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1431597  0.1247514  0.1349427  0.1462297  0.1371536
# GEE for continuous outcome
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.2531152 0.1630992  0.1679072  0.1801949  0.1933876  0.1804643
## [2,]    1 -0.1541036 0.3048830  0.2302462  0.2561750  0.2858361  0.2632618
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1008046  0.1028606  0.1116841  0.1215172  0.1130576
# MAEE for binary outcome
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 7 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.4629654 0.3016316  0.2796365  0.2988104  0.3193427  0.2994913
## [2,]    1 -1.3155222 0.7460472  0.7225963  0.8252954  0.9440864  0.8440973
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.04877651  0.0773374 0.08170456 0.08638036 0.08171905
# GEE for binary outcome
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.4615411 0.2795648  0.2783017  0.2974810  0.3180402  0.2981008
## [2,]    1 -1.3237328 0.7077150  0.7244316  0.8276765  0.9471870  0.8455774
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.02148542 0.06797361 0.07165351   0.075578  0.0716616

Simulated dataset with 12 clusters and large cluster sizes

### use only the second period data to get a general CRT data
sampleSWCRT <- sampleSWCRTLarge[sampleSWCRTLarge$period == 2, ]

### Individual-level id, period, outcome, and design matrix
id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period2", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]
### design matrix for correlation parameters
Z <- createzCrossSec(m)
Z <- matrix(1, dim(Z)[1], 1)

### (1) Matrix-adjusted estimating equations and GEE
### on continous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 12 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.12860277 0.1447396 0.09617971  0.1020767  0.1083354  0.1048829
## [2,]    1 -0.04282951 0.2910552 0.33801997  0.4102953  0.4992432  0.4336775
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1379972 0.09222832  0.1022084  0.1135292    0.10296
### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.12846953 0.1260780 0.09652287  0.1024661  0.1087756  0.1052561
## [2,]    1 -0.04426251 0.2540746 0.33870940  0.4113183  0.5007251  0.4346729
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.09523824 0.05805399 0.06430917 0.07142729 0.06487122
### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 8 
## Results for marginal mean parameters 
##      Beta  Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.334771 0.2118428  0.1862207  0.1973557  0.2091581  0.1990438
## [2,]    1 -0.875169 0.4848450  0.5284308  0.6355811  0.7671642  0.6602194
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.06293992 0.03076799 0.03204556 0.03341447 0.03206196
### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.3336916 0.1938905  0.1855933  0.1966610  0.2083911  0.1983170
## [2,]    1 -0.8762363 0.4449063  0.5272264  0.6338465  0.7647076  0.6582073
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.0464299 0.02526121 0.02634733  0.0275056 0.02634871

geemaee() examples: matrix-adjusted GEE for estimating the mean and correlation parameters in SW-CRTs with all cluster-period sizes being 1

We consider an edge case when all cluster period sizes are 1, the geemaee() function can still estimate the intra-period correlation parameter.

### Simulated dataset with 12 clusters and 5 periods
# use the first observation for each cluster-period combination
sampleSWCRT <- NULL
for (i in 1:12) {
  for (j in 1:5) {
    row <- sampleSWCRTLarge[sampleSWCRTLarge$id == i &
      sampleSWCRTLarge$period == j, ][1, , drop = FALSE]
    sampleSWCRT <- rbind(sampleSWCRT, row)
  }
}

### Individual-level id, period, outcome, and design matrix
id <- sampleSWCRT$id
period <- sampleSWCRT$period
X <- as.matrix(sampleSWCRT[, c("period2", "treatment")])
m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]
### design matrix for correlation parameters
Z <- createzCrossSec(m)
Z <- matrix(1, dim(Z)[1], 1)

### (1) Matrix-adjusted estimating equations and GEE
### on continous outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 5 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.5226064 0.2664221  0.2943763  0.3079198  0.3221030  0.3074785
## [2,]    1 -0.1387930 0.1738662  0.1039812  0.1084762  0.1133085  0.1082068
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.03343471 0.05337304 0.05574596 0.05822439 0.05574296
### GEE
est_uee_ind_con <- geemaee(
  y = sampleSWCRT$y_con,
  X = X, id = id, Z = Z,
  family = "continuous",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_con)
## GEE for correlated data 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0  0.5214368 0.2666982  0.2946670  0.3082204  0.3224140  0.3077816
## [2,]    1 -0.1374308 0.1734077  0.1040098  0.1085099  0.1133464  0.1082537
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.02928666 0.05043404 0.05267604 0.05501772 0.05267319
### (2) Matrix-adjusted estimating equations and GEE
### on binary outcome with nested exchangeable correlation structure

### MAEE
est_maee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = TRUE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_maee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta  Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -1.264931 0.7292152  0.6558115  0.6889423  0.7239012  0.6888862
## [2,]    1 -1.030340 0.4602527  0.4253526  0.4480213  0.4719959  0.4479266
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1306334 0.08925503 0.09331933 0.09757006 0.09326206
### GEE
est_uee_ind_bin <- geemaee(
  y = sampleSWCRT$y_bin,
  X = X, id = id, Z = Z,
  family = "binomial",
  maxiter = 500, epsilon = 0.001,
  printrange = TRUE, alpadj = FALSE,
  shrink = "ALPHA", makevone = FALSE
)
print(est_uee_ind_bin)
## GEE for correlated data 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta  Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -1.273321 0.7333238  0.6616293  0.6950886  0.7303935  0.6950624
## [2,]    1 -1.034997 0.4593608  0.4262109  0.4490046  0.4731156  0.4489339
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1206901 0.08954629 0.09361527 0.09787038 0.09356217

cpgeeSWD() examples: cluster-period GEE for estimating the marginal mean and correlation parameters in cross-sectional SW-CRTs

The cpgeeSWD() function implements the cluster-period GEE developed for cross-sectional stepped wedge cluster randomized trials (SW-CRTs). It provides valid estimation and inference for the treatment effect and intraclass correlation parameters within the GEE framework, and is computationally efficient for analyzing SW-CRTs with large cluster sizes. The program currently only allows for a marginal mean model with discrete period effects and the intervention indicator without additional covariates. The program offers bias-corrected ICC estimates as well as bias-corrected sandwich variances for both the treatment effect parameter and the ICC parameters. The technical details of the cluster-period GEE approach are provided in (Li et al. 2022).

Simulated dataset with 12 clusters and 4 periods

We summarize the individual-level simulated SW-CRT data to cluster-period data and use the cpgeeSWD() function to estimate the marginal mean and correlation parameters on cluster-period means of binary outcome. We first transform the variables to get the cluster-period mean outcome y_cp, mean parameters’ design matrix X_cp as well as other arguments.

########################################################################
### Example 1): simulated SW-CRT with smaller cluster-period sizes (5~10)
########################################################################

sampleSWCRT <- sampleSWCRTSmall

### cluster-period id, period, outcome, and design matrix
### id, period, outcome
id <- sampleSWCRT$id
period <- sampleSWCRT$period
y <- sampleSWCRT$y_bin
X <- as.matrix(sampleSWCRT[, c("period1", "period2", "period3", "period4", "treatment")])

m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]
clp_mu <- tapply(y, list(id, period), FUN = mean)
y_cp <- c(t(clp_mu))

### design matrix for correlation parameters
trt <- tapply(X[, t + 1], list(id, period), FUN = mean)
trt <- c(t(trt))

time <- tapply(period, list(id, period), FUN = mean)
time <- c(t(time))
X_cp <- matrix(0, n * t, t)

s <- 1
for (i in 1:n) {
  for (j in 1:t) {
    X_cp[s, time[s]] <- 1
    s <- s + 1
  }
}
X_cp <- cbind(X_cp, trt)
id_cp <- rep(1:n, each = t)
m_cp <- c(t(m))

We implement the cpgeeSWD() function on all the three choices of the correlation structure including "exchangeable", "nest_exch" and "exp_decay". We consider both matrix-adjusted estimating equations (MAEE) with alpadj = TRUE and uncorrected generalized estimating equations (GEE) with alpadj = FALSE.

### cluster-period matrix-adjusted estimating equations (MAEE)
### with exchangeable, nested exchangeable and exponential decay correlation structures
# exponential
est_maee_exc <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "exchangeable",
  alpadj = TRUE
)
print(est_maee_exc)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7678066 0.2703014  0.2394340  0.2506194  0.2623375  0.2575649
## [2,]    1 -0.4899672 0.2929374  0.3175202  0.3415573  0.3679292  0.3465760
## [3,]    2 -0.6692544 0.3719364  0.5582006  0.6161580  0.6810928  0.6184389
## [4,]    3 -0.5967261 0.5257133  0.5630605  0.6178896  0.6788478  0.6378908
## [5,]    4 -0.9712041 0.4316492  0.5346929  0.5883089  0.6479608  0.5905832
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.07171243 0.03869076 0.04059179  0.0425914 0.04058122
# nested exchangeable
est_maee_nex <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "nest_exch",
  alpadj = TRUE
)
print(est_maee_nex)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 6 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7721492 0.2931061  0.2388699  0.2498629  0.2613680  0.2533655
## [2,]    1 -0.5036505 0.3216055  0.3154306  0.3394624  0.3658386  0.3392457
## [3,]    2 -0.6217770 0.4152335  0.5684728  0.6271700  0.6927849  0.6233328
## [4,]    3 -0.5994803 0.5967744  0.5467135  0.6003143  0.6598640  0.6091679
## [5,]    4 -0.9861662 0.4905999  0.5305528  0.5839117  0.6431778  0.5821589
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.10820010 0.04714541 0.04951304 0.05200312 0.04952131
## [2,]     1 0.05385588 0.03727153 0.03907559 0.04097119 0.03907206
# exponential decay
est_maee_ed <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "exp_decay",
  alpadj = TRUE
)
print(est_maee_ed)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 7 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7710180 0.2950154  0.2375293  0.2483319  0.2596306  0.2513006
## [2,]    1 -0.4988642 0.3224197  0.3145635  0.3388380  0.3654862  0.3397012
## [3,]    2 -0.6284311 0.4156078  0.5720337  0.6309674  0.6968561  0.6305171
## [4,]    3 -0.6140801 0.5986182  0.5461283  0.5997932  0.6594254  0.6124552
## [5,]    4 -0.9866725 0.4904297  0.5309814  0.5843712  0.6436174  0.5841529
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1115235  0.0471122 0.04949153 0.05199521 0.04946812
## [2,]     1 0.6158576  0.1759633 0.18398863 0.19240603 0.18381541
### cluster-period GEE
### with exchangeable, nested exchangeable and exponential decay correlation structures

# exchangeable
est_uee_exc <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "exchangeable",
  alpadj = FALSE
)
print(est_uee_exc)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7679011 0.2611220  0.2398530  0.2510783  0.2628390  0.2564986
## [2,]    1 -0.4911553 0.2852177  0.3154908  0.3393963  0.3656321  0.3424027
## [3,]    2 -0.6697603 0.3647602  0.5627720  0.6215708  0.6874968  0.6213395
## [4,]    3 -0.5940272 0.5205574  0.5631388  0.6182225  0.6795178  0.6340687
## [5,]    4 -0.9751028 0.4299957  0.5367262  0.5907718  0.6509467  0.5916886
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.05725857 0.03375491 0.03539291 0.03711574 0.03538186
# nested exchangeable
est_uee_nex <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "nest_exch",
  alpadj = FALSE
)
print(est_uee_nex)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7701899 0.2728928  0.2396011  0.2507209  0.2623647  0.2544085
## [2,]    1 -0.4993131 0.2999219  0.3142455  0.3381255  0.3643392  0.3384140
## [3,]    2 -0.6424236 0.3867725  0.5686219  0.6278135  0.6940831  0.6244480
## [4,]    3 -0.5956930 0.5566382  0.5540211  0.6083847  0.6688445  0.6182629
## [5,]    4 -0.9836857 0.4595437  0.5345098  0.5883958  0.6483261  0.5870942
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.07500427 0.03992967 0.04189842 0.04396787 0.04190479
## [2,]     1 0.04814669 0.03293754 0.03450776 0.03615783 0.03449733
# exponential decay
est_uee_ed <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "exp_decay",
  alpadj = FALSE
)
print(est_uee_ed)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta   Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.7691313 0.2760920  0.2385661  0.2495298  0.2610036  0.2526378
## [2,]    1 -0.4960302 0.3027753  0.3135021  0.3375362  0.3639236  0.3382822
## [3,]    2 -0.6436889 0.3904936  0.5720324  0.6315468  0.6981797  0.6302718
## [4,]    3 -0.6064791 0.5635697  0.5533253  0.6077753  0.6683419  0.6196995
## [5,]    4 -0.9866876 0.4640753  0.5348939  0.5888701  0.6488594  0.5884695
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.0800789 0.04026846 0.04227077 0.04437711 0.04228027
## [2,]     1 0.7046177 0.17583729 0.18376072 0.19206554 0.18353049

Simulated dataset with 12 clusters and 5 periods

########################################################################
### Example 2): simulated SW-CRT with larger cluster-period sizes (20~30)
########################################################################

sampleSWCRT <- sampleSWCRTLarge

### cluster-period id, period, outcome, and design matrix
### id, period, outcome
id <- sampleSWCRT$id
period <- sampleSWCRT$period
y <- sampleSWCRT$y_bin
X <- as.matrix(sampleSWCRT[, c("period1", "period2", "period3", "period4", "period5", "treatment")])

m <- as.matrix(table(id, period))
n <- dim(m)[1]
t <- dim(m)[2]
clp_mu <- tapply(y, list(id, period), FUN = mean)
y_cp <- c(t(clp_mu))

### design matrix for correlation parameters
trt <- tapply(X[, t + 1], list(id, period), FUN = mean)
trt <- c(t(trt))

time <- tapply(period, list(id, period), FUN = mean)
time <- c(t(time))
X_cp <- matrix(0, n * t, t)

s <- 1
for (i in 1:n) {
  for (j in 1:t) {
    X_cp[s, time[s]] <- 1
    s <- s + 1
  }
}
X_cp <- cbind(X_cp, trt)
id_cp <- rep(1:n, each = t)
m_cp <- c(t(m))

### cluster-period matrix-adjusted estimating equations (MAEE)
### with exchangeable, nested exchangeable and exponential decay correlation structures
# exponential
est_maee_exc <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "exchangeable",
  alpadj = TRUE
)
print(est_maee_exc)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03937150 0.1955148  0.2373442  0.2480663  0.2592915  0.2392651
## [2,]    1 -0.33407908 0.2022096  0.2200174  0.2352782  0.2517452  0.2396738
## [3,]    2 -0.67338220 0.2343484  0.2944134  0.3178251  0.3435471  0.3183472
## [4,]    3  0.04608714 0.2414272  0.2332480  0.2536047  0.2774156  0.2480103
## [5,]    4 -0.43352358 0.2901120  0.1710562  0.1877125  0.2079820  0.1965489
## [6,]    5 -0.82493433 0.1977208  0.2758767  0.3031771  0.3333531  0.3000391
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.07782417 0.03688003 0.03829676 0.03976975 0.03827983
# nested exchangeable
est_maee_nex <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "nest_exch",
  alpadj = TRUE
)
print(est_maee_nex)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03432555 0.2278441  0.2393377  0.2499941  0.2611267  0.2479004
## [2,]    1 -0.34293068 0.2424034  0.2082967  0.2231294  0.2391131  0.2212194
## [3,]    2 -0.67730319 0.2887517  0.2713904  0.2930314  0.3167289  0.2876482
## [4,]    3  0.02825126 0.3212402  0.1938752  0.2119627  0.2334399  0.2046802
## [5,]    4 -0.42990720 0.3990553  0.1849263  0.2025603  0.2234539  0.2035330
## [6,]    5 -0.81687942 0.3038572  0.2835181  0.3107636  0.3407871  0.3069380
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1201529 0.04650374 0.04831496 0.05019968 0.04830353
## [2,]     1 0.0584004 0.03424640 0.03557674 0.03696081 0.03558240
# exponential decay
est_maee_ed <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "exp_decay",
  alpadj = TRUE
)
print(est_maee_ed)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 7 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03340439 0.2236591  0.2415963  0.2524580  0.2638106  0.2513261
## [2,]    1 -0.32963269 0.2361087  0.2096366  0.2232709  0.2378969  0.2223343
## [3,]    2 -0.66086972 0.2793325  0.2679038  0.2864419  0.3065960  0.2820858
## [4,]    3  0.05215247 0.3068537  0.2126799  0.2276517  0.2447418  0.2231794
## [5,]    4 -0.38679195 0.3815072  0.1392210  0.1482120  0.1597143  0.1537140
## [6,]    5 -0.85944259 0.2843752  0.2312996  0.2529042  0.2766009  0.2467741
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.1143454 0.04222592 0.04384679 0.04553347 0.04380728
## [2,]     1 0.7000112 0.14390913 0.15014145 0.15665442 0.15034373
### cluster-period GEE
### with exchangeable, nested exchangeable and exponential decay correlation structures

# exchangeable
est_uee_exc <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "exchangeable",
  alpadj = FALSE
)
print(est_uee_exc)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 3 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03987191 0.1891780  0.2375607  0.2482853  0.2595112  0.2404060
## [2,]    1 -0.33454114 0.1959210  0.2189623  0.2341927  0.2506256  0.2376880
## [3,]    2 -0.67375572 0.2281604  0.2923708  0.3156737  0.3412787  0.3152896
## [4,]    3  0.04529970 0.2371119  0.2301168  0.2503246  0.2740128  0.2444258
## [5,]    4 -0.43427752 0.2857409  0.1708179  0.1874729  0.2077462  0.1952451
## [6,]    5 -0.82467526 0.1978609  0.2760276  0.3033256  0.3334993  0.3001943
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.07013385 0.03368912 0.03498342 0.03632918 0.03496968
# nested exchangeable
est_uee_nex <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "nest_exch",
  alpadj = FALSE
)
print(est_uee_nex)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 3 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03499108 0.2169350  0.2392903  0.2499507  0.2610881  0.2477876
## [2,]    1 -0.34245090 0.2305101  0.2084051  0.2232474  0.2392423  0.2213878
## [3,]    2 -0.67698816 0.2749173  0.2716631  0.2933535  0.3171098  0.2879913
## [4,]    3  0.02904193 0.3057033  0.1942266  0.2123667  0.2339216  0.2050350
## [5,]    4 -0.43058289 0.3793522  0.1842784  0.2018893  0.2227843  0.2028661
## [6,]    5 -0.81733511 0.2886977  0.2833499  0.3106202  0.3406770  0.3067671
## 
## Results for correlation parameters 
##      Alpha   Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.10499225 0.04215716 0.04379742 0.04550421 0.04378621
## [2,]     1 0.05350042 0.03080632 0.03200335 0.03324877 0.03200919
# exponential decay
est_uee_ed <- cpgeeSWD(
  y = y_cp, X = X_cp, id = id_cp,
  m = m_cp, corstr = "exp_decay",
  alpadj = FALSE
)
print(est_uee_ed)
## GEE and MAEE for Cluster-Period Summaries 
##  Number of Iterations: 4 
## Results for marginal mean parameters 
##      Beta    Estimate MB-stderr BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]    0 -0.03459151 0.2123822  0.2414782  0.2523416  0.2636966  0.2510466
## [2,]    1 -0.33001054 0.2238965  0.2100290  0.2237648  0.2385050  0.2228663
## [3,]    2 -0.66158744 0.2652147  0.2697183  0.2886630  0.3092792  0.2844552
## [4,]    3  0.05115744 0.2911203  0.2127361  0.2279738  0.2454327  0.2231941
## [5,]    4 -0.38956628 0.3612911  0.1403145  0.1497434  0.1617831  0.1547516
## [6,]    5 -0.85722995 0.2692278  0.2336074  0.2555548  0.2796454  0.2493893
## 
## Results for correlation parameters 
##      Alpha  Estimate BC0-stderr BC1-stderr BC2-stderr BC3-stderr
## [1,]     0 0.0989716 0.03850721 0.03997936 0.04151109 0.03993102
## [2,]     1 0.7279137 0.13726650 0.14328210 0.14956972 0.14351625

simbinPROBIT() example: generating correlated binary data using the multivariate probit method

The simbinPROBIT() function generates correlated binary data using the multivariate Probit method (Emrich and Piedmonte 1991). It simulates a vector of binary outcomes according the specified marginal mean vector and correlation structure. Constraints and compatibility between the marginal mean and correlation matrix are checked.

We use the simbinPROBIT() function to generate correlated binary data with different correlation structures. We consider simulating a cross-sectional SW-CRT dataset with 2 clusters, 3 periods with the same cluster-period size of 5. We use two mean vectors for the two clusters and specify the mu argument.

For the exchangeable correlation structure, we specify both the within-period and inter-period correlation parameters to be 0.015. We use 0.03 and 0.015 for the within-period and inter-period correlations, respectively. The exponential decay correlation structure has an decay parameter 0.8 with the within-period correlation parameter 0.03.

#### Simulate 2 clusters, 3 periods and cluster-period size of 5

t <- 3
n <- 2
m <- 5
# means of cluster 1
u_c1 <- c(0.4, 0.3, 0.2)
u1 <- rep(u_c1, c(rep(m, t)))
# means of cluster 2
u_c2 <- c(0.35, 0.25, 0.2)
u2 <- rep(u_c2, c(rep(m, t)))

# List of mean vectors
mu <- list()
mu[[1]] <- u1
mu[[2]] <- u2
# List of correlation matrices

## correlation parameters
alpha0 <- 0.03
alpha1 <- 0.015
rho <- 0.8

## (1) exchangeable
Sigma <- list()
Sigma[[1]] <- diag(m * t) * (1 - alpha1) + matrix(alpha1, m * t, m * t)

Sigma[[2]] <- diag(m * t) * (1 - alpha1) + matrix(alpha1, m * t, m * t)

y_exc <- simbinPROBIT(mu = mu, Sigma = Sigma, n = n)

## (2) nested exchangeable
Sigma <- list()
cor_matrix <- matrix(alpha1, m * t, m * t)
loc1 <- 0
loc2 <- 0
for (t in 1:t) {
  loc1 <- loc2 + 1
  loc2 <- loc1 + m - 1
  for (i in loc1:loc2) {
    for (j in loc1:loc2) {
      if (i != j) {
        cor_matrix[i, j] <- alpha0
      } else {
        cor_matrix[i, j] <- 1
      }
    }
  }
}

Sigma[[1]] <- cor_matrix
Sigma[[2]] <- cor_matrix

y_nex <- simbinPROBIT(mu = mu, Sigma = Sigma, n = n)

## (3) exponential decay

Sigma <- list()

### function to find the period of the ith index
region_ij <- function(points, i) {
  diff <- i - points
  for (h in 1:(length(diff) - 1)) {
    if (diff[h] > 0 & diff[h + 1] <= 0) {
      find <- h
    }
  }
  return(find)
}

cor_matrix <- matrix(0, m * t, m * t)
useage_m <- cumsum(m * t)
useage_m <- c(0, useage_m)

for (i in 1:(m * t)) {
  i_reg <- region_ij(useage_m, i)
  for (j in 1:(m * t)) {
    j_reg <- region_ij(useage_m, j)
    if (i_reg == j_reg & i != j) {
      cor_matrix[i, j] <- alpha0
    } else if (i == j) {
      cor_matrix[i, j] <- 1
    } else if (i_reg != j_reg) {
      cor_matrix[i, j] <- alpha0 * (rho^(abs(i_reg - j_reg)))
    }
  }
}

Sigma[[1]] <- cor_matrix
Sigma[[2]] <- cor_matrix

y_ed <- simbinPROBIT(mu = mu, Sigma = Sigma, n = n)

simbinCLF() example: generating correlated binary data using the conditional linear family method

The simbinCLF() function generates correlated binary data using the conditional linear family method (Qaqish 2003). It simulates a vector of binary outcomes according the specified marginal mean vector and correlation structure. Natural constraints and compatibility between the marginal mean and correlation matrix are checked.

We use the simbinCLF() function to generate correlated binary data with different correlation structures. We consider simulating a cross-sectional SW-CRT dataset with 2 clusters, 3 periods with the same cluster-period size of 5. We use two mean vectors for the two clusters and specify the mu argument.

For the exchangeable correlation structure, we specify both the within-period and inter-period correlation parameters to be 0.015. We use 0.03 and 0.015 for the within-period and inter-period correlations, respectively. The exponential decay correlation structure has an decay parameter 0.8 with the within-period correlation parameter 0.03.

##### Simulate 2 clusters, 3 periods and cluster-period size of 5

t <- 3
n <- 2
m <- 5

# means of cluster 1
u_c1 <- c(0.4, 0.3, 0.2)
u1 <- rep(u_c1, c(rep(m, t)))
# means of cluster 2
u_c2 <- c(0.35, 0.25, 0.2)
u2 <- rep(u_c2, c(rep(m, t)))

# List of mean vectors
mu <- list()
mu[[1]] <- u1
mu[[2]] <- u2
# List of correlation matrices

## correlation parameters
alpha0 <- 0.03
alpha1 <- 0.015
rho <- 0.8

## (1) exchangeable
Sigma <- list()
Sigma[[1]] <- diag(m * t) * (1 - alpha1) + matrix(alpha1, m * t, m * t)
Sigma[[2]] <- diag(m * t) * (1 - alpha1) + matrix(alpha1, m * t, m * t)
y_exc <- simbinCLF(mu = mu, Sigma = Sigma, n = n)

## (2) nested exchangeable
Sigma <- list()
cor_matrix <- matrix(alpha1, m * t, m * t)
loc1 <- 0
loc2 <- 0
for (t in 1:t) {
  loc1 <- loc2 + 1
  loc2 <- loc1 + m - 1
  for (i in loc1:loc2) {
    for (j in loc1:loc2) {
      if (i != j) {
        cor_matrix[i, j] <- alpha0
      } else {
        cor_matrix[i, j] <- 1
      }
    }
  }
}

Sigma[[1]] <- cor_matrix
Sigma[[2]] <- cor_matrix
y_nex <- simbinCLF(mu = mu, Sigma = Sigma, n = n)

## (3) exponential decay

Sigma <- list()

### function to find the period of the ith index
region_ij <- function(points, i) {
  diff <- i - points
  for (h in 1:(length(diff) - 1)) {
    if (diff[h] > 0 & diff[h + 1] <= 0) {
      find <- h
    }
  }
  return(find)
}

cor_matrix <- matrix(0, m * t, m * t)
useage_m <- cumsum(m * t)
useage_m <- c(0, useage_m)

for (i in 1:(m * t)) {
  i_reg <- region_ij(useage_m, i)
  for (j in 1:(m * t)) {
    j_reg <- region_ij(useage_m, j)
    if (i_reg == j_reg & i != j) {
      cor_matrix[i, j] <- alpha0
    } else if (i == j) {
      cor_matrix[i, j] <- 1
    } else if (i_reg != j_reg) {
      cor_matrix[i, j] <- alpha0 * (rho^(abs(i_reg - j_reg)))
    }
  }
}

Sigma[[1]] <- cor_matrix
Sigma[[2]] <- cor_matrix

y_ed <- simbinCLF(mu = mu, Sigma = Sigma, n = n)

Session Information

## R version 4.2.2 (2022-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS 14.2.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] geeCRT_1.1.3
## 
## loaded via a namespace (and not attached):
##  [1] mvtnorm_1.1-3     digest_0.6.31     MASS_7.3-58.1     R6_2.5.1          jsonlite_1.8.4   
##  [6] rootSolve_1.8.2.3 evaluate_0.20     cachem_1.0.7      rlang_1.0.6       cli_3.6.0        
## [11] rstudioapi_0.14   jquerylib_0.1.4   bslib_0.4.2       rmarkdown_2.20    tools_4.2.2      
## [16] xfun_0.37         yaml_2.3.7        fastmap_1.1.1     compiler_4.2.2    htmltools_0.5.4  
## [21] knitr_1.42        sass_0.4.5

References

Emrich, Lawrence J, and Marion R Piedmonte. 1991. “A Method for Generating High-Dimensional Multivariate Binary Variates.” The American Statistician 45 (4): 302–4.
Li, Fan, Elizabeth L Turner, and John S Preisser. 2018. “Sample Size Determination for GEE Analyses of Stepped Wedge Cluster Randomized Trials.” Biometrics 74 (4): 1450–58.
Li, Fan, Hengshi Yu, Paul J Rathouz, Elizabeth L Turner, and John S Preisser. 2022. “Marginal Modeling of Cluster-Period Means and Intraclass Correlations in Stepped Wedge Designs with Binary Outcomes.” Biostatistics 23 (3): 772–88.
Preisser, John S, Bing Lu, and Bahjat F Qaqish. 2008. “Finite Sample Adjustments in Estimating Equations and Covariance Estimators for Intracluster Correlations.” Statistics in Medicine 27 (27): 5764–85.
Qaqish, Bahjat F. 2003. “A Family of Multivariate Binary Distributions for Simulating Correlated Binary Variables with Specified Marginal Means and Correlations.” Biometrika 90 (2): 455–63.