[R] Permutations in matched-pair study where combinations of pairs change

Mon Oct 31 23:22:25 CET 2016

You may get a reply here, but this post belongs on a statistics list
like stats.stackexchange.com, not r-help, which is concerned about R
programming issues rather than statistical methodology.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Oct 31, 2016 at 5:54 AM, Luke Gaylor <luke.gaylor at live.com.au> wrote:
> Friends,
>
>
> Matched pairs studies are well documented, but what happens if we were to alter the manner in which events were paired to one another. Say we change the order of our data and pair without replacement, so that event 1 may pair with event 23 in one instance, but also event 36 or 102 in other instances.
>
> Matched pair study can then be used to determine the effectiveness of, for example a treatment program, on a given outcome. The outcome may be continuous or variable. Therefore, a conditional logistic regression model can output an odds ratio (and confidence intervals for the treatment coefficient). Or we may use a paired t-test to assess if the difference in means between the pairs differ.
>
> I am more favored to the logistic regression and output of the odds ratio, however it changes for each permutation. As too does its confidence interval. So my question is how can we report on a string of given odds ratios - how can be arrive at a statistical sound conclusion? Or accounting for the study design, where we achieve different combinations of matched pairs, how can be comment on the effectiveness of a given treatment program?
>
> Below I have attached a code, which simulates a 100-fold permutation.
>
>
> Any advice would be greatly appreciated,
> Luke
>
>
> Set.seed(123)
> # prepare the data for the simulation
> #1.
> library(Matching)
> library(survival)
> library(dplyr)
> #2.
> require(doParallel)
> cl<-makeCluster(2)
> registerDoParallel(cl)
> #3.
> clusterEvalQ(cl,library(Matching))
> clusterEvalQ(cl,library(survival))
> clusterEvalQ(cl,library(dplyr))
>
> # number of permutations
> m <- 100
>
>
> Result = foreach(i=1:m,.combine=cbind) %do%{
>
>   # taking from the example from the Matching package
>   #attach the data
>   data(lalonde)
>
>   # we want to assess if treatment shows an influence on a given outcome of interest
>   # lets create our hypothetical example
>   lalonde$success <- with(lalonde, ifelse(re78 > 8125, 1, 0))
>
>   # adjust the order of the data
>   lalonde <- lalonde[sample(1:nrow(lalonde)), ]
>   head(lalonde$age)
>
>
>   # Define the covariates we want to match on
>   Xmat = cbind(lalonde$age, lalonde$educ, lalonde$married, lalonde$nodegr)
>
>   # define crude matching
>   mgen1 <- Match(Y=NULL, Tr=lalonde$treat,
>                  X=Xmat,
>                  exact=c(0,1,1,1),
>                  replace=FALSE, ties=F)
>
>   summary(mgen1)
>
>   # obtain initial pair-combinations
>   matched <- rbind(lalonde[mgen1$index.treated,], lalonde[mgen1$index.control,])
>
>   # generate a dummy variable for our each pair of people
>   matched$Pair_ID <- rep(1:length(mgen1$index.treated),2)
>
>   # crude filtering
>   # set hard limits of age
>   # must be within +-2 years
>   Matched2 <- matched %>%
>     group_by(Pair_ID) %>%
>     filter(abs( age[treat == 1] - age[treat == 0]) <= 2)
>
>   # summary table
>   table(Matched2$treat, Matched2$success)
>
>   # so now we have a cohort of similar events
>   # each pair contains one treat and one non-treat event
>
>
>   #######################################################
>   #                                                     #
>   #     H E R E   W E   A R E   T R Y I N G    T O      #
>   #  A S S E S S   T H E    E F F E C T I V E N E S S   #
>   #   O F   T R E A T M E N T   T O    A C H I E V E    #
>   #               s U C C E S S                         #
>   #                                                     #
>   #######################################################
>
>
>   # One could implement a parametric difference of means test
>   # t-test - (as the outcome of interest is binary)
>   # to assess if differences between pairs change
>
>   diff_means<- t.test(Matched2$success[Matched2$treat==1], Matched2$success[Matched2$treat==0],
>          paired = T, alternative = "two.sided", conf.level = 0.95)
>   diff_means$p.value
>
>   # output the series of p-values for each permutation
>
>   # ===========================================
>   # Or
>   # ===========================================
>
>   # obtain an estimated effectiveness value from a conditional logistic regression
>   # namely, the Odds Ratio coefficient for 'treat'
>   model_1 <- clogit(success ~ treat + strata(Pair_ID), matched)
>   summary(model_1)
>
>   OR_M1 <- exp(model_1$coefficients[1])
>
>   # we can save the confidence intervals for the 'treat' effectiveness
>   CI_U1 <- exp(confint(model_1))[1,2]
>   CI_L1 <- exp(confint(model_1))[1,1]
>
>   # then we could output any of the following
>   #Result <- diff_means$p.value
>   #Result <- OR_M1
>   #Result <- rbind(OR_M1, CI_U1, CI_L1)
>
>   Result <- OR_M1
>
> }
>
> summary(Result[1,])
>
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.