# [R] Permutations in matched-pair study where combinations of pairs change

Luke Gaylor luke.gaylor at live.com.au
Mon Oct 31 13:54:27 CET 2016

```Friends,

Matched pairs studies are well documented, but what happens if we were to alter the manner in which events were paired to one another. Say we change the order of our data and pair without replacement, so that event 1 may pair with event 23 in one instance, but also event 36 or 102 in other instances.

Matched pair study can then be used to determine the effectiveness of, for example a treatment program, on a given outcome. The outcome may be continuous or variable. Therefore, a conditional logistic regression model can output an odds ratio (and confidence intervals for the treatment coefficient). Or we may use a paired t-test to assess if the difference in means between the pairs differ.

I am more favored to the logistic regression and output of the odds ratio, however it changes for each permutation. As too does its confidence interval. So my question is how can we report on a string of given odds ratios - how can be arrive at a statistical sound conclusion? Or accounting for the study design, where we achieve different combinations of matched pairs, how can be comment on the effectiveness of a given treatment program?

Below I have attached a code, which simulates a 100-fold permutation.

Any advice would be greatly appreciated,
Luke

Set.seed(123)
# prepare the data for the simulation
#1.
library(Matching)
library(survival)
library(dplyr)
#2.
require(doParallel)
cl<-makeCluster(2)
registerDoParallel(cl)
#3.
clusterEvalQ(cl,library(Matching))
clusterEvalQ(cl,library(survival))
clusterEvalQ(cl,library(dplyr))

# number of permutations
m <- 100

Result = foreach(i=1:m,.combine=cbind) %do%{

# taking from the example from the Matching package
#attach the data
data(lalonde)

# we want to assess if treatment shows an influence on a given outcome of interest
# lets create our hypothetical example
lalonde\$success <- with(lalonde, ifelse(re78 > 8125, 1, 0))

# adjust the order of the data
lalonde <- lalonde[sample(1:nrow(lalonde)), ]

# Define the covariates we want to match on
Xmat = cbind(lalonde\$age, lalonde\$educ, lalonde\$married, lalonde\$nodegr)

# define crude matching
mgen1 <- Match(Y=NULL, Tr=lalonde\$treat,
X=Xmat,
exact=c(0,1,1,1),
replace=FALSE, ties=F)

summary(mgen1)

# obtain initial pair-combinations
matched <- rbind(lalonde[mgen1\$index.treated,], lalonde[mgen1\$index.control,])

# generate a dummy variable for our each pair of people
matched\$Pair_ID <- rep(1:length(mgen1\$index.treated),2)

# crude filtering
# set hard limits of age
# must be within +-2 years
Matched2 <- matched %>%
group_by(Pair_ID) %>%
filter(abs( age[treat == 1] - age[treat == 0]) <= 2)

# summary table
table(Matched2\$treat, Matched2\$success)

# so now we have a cohort of similar events
# each pair contains one treat and one non-treat event

#######################################################
#                                                     #
#     H E R E   W E   A R E   T R Y I N G    T O      #
#  A S S E S S   T H E    E F F E C T I V E N E S S   #
#   O F   T R E A T M E N T   T O    A C H I E V E    #
#               s U C C E S S                         #
#                                                     #
#######################################################

# One could implement a parametric difference of means test
# t-test - (as the outcome of interest is binary)
# to assess if differences between pairs change

diff_means<- t.test(Matched2\$success[Matched2\$treat==1], Matched2\$success[Matched2\$treat==0],
paired = T, alternative = "two.sided", conf.level = 0.95)
diff_means\$p.value

# output the series of p-values for each permutation

# ===========================================
# Or
# ===========================================

# obtain an estimated effectiveness value from a conditional logistic regression
# namely, the Odds Ratio coefficient for 'treat'
model_1 <- clogit(success ~ treat + strata(Pair_ID), matched)
summary(model_1)

OR_M1 <- exp(model_1\$coefficients)

# we can save the confidence intervals for the 'treat' effectiveness
CI_U1 <- exp(confint(model_1))[1,2]
CI_L1 <- exp(confint(model_1))[1,1]

# then we could output any of the following
#Result <- diff_means\$p.value
#Result <- OR_M1
#Result <- rbind(OR_M1, CI_U1, CI_L1)

Result <- OR_M1

}

summary(Result[1,])

[[alternative HTML version deleted]]

```