**Summary:** In a block-randomized controlled trial, the
individuals in a study are divided based on important characteristics
(for example age group, sex, or smoking status). By stratifying the data
in this way prior to randomization, the researchers can reduce the
heterogeneity of their treatment and control groups, thereby reducing
the variance of their estimate of the effect of their treatment. The
`stratamatch` package is designed to extend this approach to
the observational setting by providing functions to allow users to
easily divide an observational data set into strata of observations with
similar baseline characteristics. The treated and control individuals
within each stratum of the data can then be matched (for example, based
on propensity score), in order to recapitulate the block-randomized
trial from observational data. In order to select an effective
stratification scheme for the data, `stratamatch` applies a
*pilot design* approach to estimate a quantity called the
prognostic score, and divides the data into blocks of individuals with
similar scores. The potential benefits of such an approach are twofold.
First, stratifying the data effectively allows for more computationally
efficient matching of large data sets. Second, early studies of the
prognostic score show that using the prognostic score to inform the
matching process may help reduce the variance in the estimated treatment
effect and increase the robustness of an observational study to bias
from unmeasured confounding factors.

This document contains the following sections:

- Introduction and Background
- Generating an Example Data Set
- How to Stratify a Data Set
- Diagnostics for Stratified Data
- Matching Stratified Data

After briefly introducing the goals of stratamatch and the approaches it implements on a statistical level, this document will walk through an example of how stratamatch might be used on a toy data set. We will show how to stratify a data set, how to diagnose the quality of the stratification, and how to match a data set within strata.

In a fully-randomized controlled experiment, treatment assignments
are made independently of each individual’s baseline covariates,
allowing for unbiased estimation of the treatment effect. In
observational studies, researchers may attempt to account for
differences in their subjects’ baseline covariates by matching treated
and control subjects with similar characteristics. In particular, the
popular approach of propensity score matching pairs individuals who had
similar estimated probabilities of receiving the treatment based on
their baseline characteristics. By balancing the treated and control
groups in this way, the propensity score matching approach attempts to
coerce the data set into a form that resembles a randomized controlled
trial. Importantly, however, propensity score matching can only address
bias due to *measured* baseline covariates. Imbalances in
unmeasured baseline covariates may still bias the effect estimate after
propensity score matching. For this reason, researchers often carry out
a sensitivity analysis to address concerns about unmeasured
confounding.

However, the fully-randomized experiment is not the only experimental
study design. In a *block-randomized controlled experiment*,
subjects are stratified based on important covariates (e.g. sex, age
group, smoking status) before randomization into treatment groups. This
helps reduce the heterogeneity between treatment and control groups. In
the experimental context, reducing the heterogeneity between compared
groups helps to reduce the variance of the effect estimator. In
observational settings, however, this reduction in heterogeneity has the
added benefit of reducing the sensitivity of the study results to
unobserved confounding (see Rosenbaum
(2005), for a great discussion on this point). Moreover, if the
sample size of an observational study is very large, stratifying the
sample and matching separately within the strata in this way may be much
faster than matching the entire sample at once. Stratamatch helps to
facilitate this process by supplying tools to stratify an observational
data set and match within each stratum, thereby emulating an
observational version of the block-randomized controlled trial.

Once we have decided to stratify our data set, the question becomes:
how should strata be determined? One option is to select the covariates
on which to stratify by hand based on expert knowledge. Another option
is to use knowledge from previous experiments to select the best
substrata. In the experimental setting, this may be done with a
*pilot study*. Using this approach, researchers set aside some of
their resources before running the main experiment for the purpose of
running a smaller, ‘pilot’ experiment. By examining the outcomes of the
pilot study, they can gather information which they can use to inform
the design of the main experiment. Importantly, after the pilot study is
run, the individuals in the pilot experiment are not reused in the main
experiment, so the observations of the outcomes from this earlier
analysis are not allowed to bias the results of the main study.

Aikens, Greaves, and Baiocchi (2020)
extend the idea of the pilot study to the observational setting. Using
an observational *pilot design*, the researchers may set aside a
‘pilot set’ of their data. Outcome information on this pilot set can be
observed, and the information gained can be used to inform the study
design. Subsequently, in order to preserve the separation of the study
design from the study analysis, the observations from the pilot set are
omitted from the main analysis.

Stratamatch uses a pilot design to estimate a quantity called the prognostic score (Ben B. Hansen (2008)), defined here as an individual’s expected outcome under the control assignment, based on their baseline covariates. Balancing observational data sets based on the prognostic score can reduce the heterogeneity between matched individuals, decreasing variance and diminishing the sensitivity of the study results to unobserved confounding (see Aikens, Greaves, and Baiocchi (2020), Antonelli et al. (2018), and Leacy and Stuart (2014)). Moreover, since the prognostic score is often continuous, strata can be easily determined using prognostic score quantiles to select evenly sized bins for the data. This circumvents common problems with stratification based on expert knowledge, since that process often generates strata which are too large, too small, or too poorly balanced between treatment and control observations.

The stratamatch function, `auto_stratify`, carries out the
prognostic score stratification pilot design described above. Although
there are many additional options available when running this function,
the most basic procedure does the following:

1. Partition the data set into a pilot data set and an analysis data set
2. Fit a model for the prognostic score from the observations in the pilot set
3. Estimate prognostic scores for the analysis set using the prognostic model
4. Stratify the analysis set based on prognostic score quantiles
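
The four steps above can be sketched in plain base R. This is a hypothetical illustration on toy data, not the package's implementation; `auto_stratify` handles these steps, along with many practical details, internally:

```r
# Toy data: one covariate x, treatment z, binary outcome y
set.seed(1)
df <- data.frame(x = rnorm(1000), z = rbinom(1000, 1, 0.2))
df$y <- rbinom(1000, 1, plogis(df$x))

# 1. Partition: set aside a pilot set sampled from the controls
pilot_ids <- sample(which(df$z == 0), size = 100)
pilot_set <- df[pilot_ids, ]
analysis_set <- df[-pilot_ids, ]

# 2. Fit a prognostic model on the pilot set
prog_model <- glm(y ~ x, data = pilot_set, family = binomial)

# 3. Estimate prognostic scores for the analysis set
scores <- predict(prog_model, newdata = analysis_set, type = "response")

# 4. Stratify the analysis set by prognostic score quantiles
n_strata <- 4
breaks <- quantile(scores, probs = seq(0, 1, length.out = n_strata + 1))
analysis_set$stratum <- cut(scores, breaks, include.lowest = TRUE,
                            labels = FALSE)
table(analysis_set$stratum)
```

Because the breaks are quantiles of the estimated scores, the resulting strata are close to evenly sized by construction.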

Once the data set has been stratified, `stratamatch`
suggests several diagnostic plots and tables that can be used to assess
the size and balance of the strata produced. If the strata are
satisfactory, the treatment and control individuals can then be matched.
At this point, the reader may choose to match the data set within each
stratum using the `strata_match` function, or they may select
and implement their own matching scheme.

We demonstrate the functionality of stratamatch with a toy example.
The `stratamatch` package contains its own function,
`make_sample_data`, for just this purpose.

```
library(stratamatch)
set.seed(125)
# make sample data set of 5000 observations
dat <- make_sample_data(n = 5000)
# print the first few rows of the sample_data
head(dat)
```

```
## X1 X2 B1 B2 C1 treat outcome
## 1 0.93332697 0.22256993 1 1 b 0 0
## 2 -0.52503178 1.07623706 1 0 c 1 0
## 3 1.81443979 2.83989785 1 1 b 0 0
## 4 0.08304562 0.04686229 0 0 a 1 0
## 5 0.39571880 0.42609036 0 0 b 0 0
## 6 -2.19366962 1.28452442 1 1 c 0 1
```

Let’s assume that the data in `dat` are an observational
data set, and we would like to estimate the effect of some binary
treatment on a binary outcome. Suppose the column `treat` in
this data gives the treatment assignment (where a `0` means
untreated and a `1` means treated), and the column
`outcome` gives information on who had the outcome of
interest and who didn’t (similarly, let’s call `0` the
negative outcome and `1` the positive outcome). However,
since this is an observational data set, we didn’t get to choose who got
the treatment and who didn’t. Instead, individuals self-selected into
treatment or control groups, perhaps in some way which is influenced by
their background characteristics. Additionally, an individual’s
probability of having the positive outcome may be influenced by the
treatment or by other baseline factors, but we don’t know which. In
addition to treatment assignments and outcomes, we have measured five
baseline covariates for each individual: `X1`, `X2`, `B1`, `B2`, and
`C1`. `X1` and `X2` are continuous, `B1` and `B2` are binary, and `C1`
is a categorical variable which takes on possible values `"a"`,
`"b"`, and `"c"`.

We should also note that `dat` is a fairly large data set
(5000 observations), but only about 1/5 of the individuals in the data
set received the treatment.

Suppose we wanted to stratify the data manually based on covariates
that our experts have selected. Importantly, we can only stratify the
data based on categorical or binary variables. This means that we cannot
use `X1` or `X2` directly (this would cause
`manual_stratify` to throw an error). Below, I stratify the
data set based on `C1`, `B1`, and `B2`.

```
# manually stratify dat based on categorical and binary variables
m.strat <- manual_stratify(data = dat, treat ~ B1 + B2 + C1)
# try printing the result
m.strat
```

```
## manual_strata object from package stratamatch.
##
## Function call:
## manual_stratify(data = dat, strata_formula = treat ~ B1 + B2 +
## C1)
##
## Analysis set dimensions: 5000 X 8
```

`summary(m.strat)`

```
## Number of strata: 12
##
## Min size: 33 Max size: 1533
##
## Strata with Potential Issues: 8
```

Above, we see that `manual_stratify` returns a
`manual_strata` object. The printed summary above shows that
`manual_stratify` has divided our dataset according to our
instructions and produced 12 strata.

We can extract important information from `m.strat` with `$`.

The most important piece of `m.strat` is the analysis set.
This is the stratified data which we can later match and use for effect
estimation.

```
# show the first few rows of the stratified data set
head(m.strat$analysis_set)
```

```
## # A tibble: 6 × 8
## X1 X2 B1 B2 C1 treat outcome stratum
## <dbl> <dbl> <int> <int> <chr> <int> <int> <int>
## 1 0.933 0.223 1 1 b 0 0 11
## 2 -0.525 1.08 1 0 c 1 0 9
## 3 1.81 2.84 1 1 b 0 0 11
## 4 0.0830 0.0469 0 0 a 1 0 1
## 5 0.396 0.426 0 0 b 0 0 2
## 6 -2.19 1.28 1 1 c 0 1 12
```

Since we didn’t use a pilot design for this example, our analysis set
just has all of the same data in `dat` that we put in to
`manual_stratify`. The only difference is that now there is
an additional column, `stratum`, which says which stratum
each individual has been sorted into.
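
Even before matching, a `stratum` column like this one lets us summarize a stratification with base R alone. The columns below are toy stand-ins; with real output you would use `m.strat$analysis_set`:

```r
# Toy stand-in for an analysis set with 'stratum' and 'treat' columns
as_toy <- data.frame(
  stratum = rep(1:3, times = c(5, 8, 7)),
  treat   = c(1, 0, 0, 1, 0,  0, 0, 1, 1, 0, 0, 0, 1,  0, 1, 0, 0, 0, 1, 0)
)
# stratum sizes
table(as_toy$stratum)
# treated vs. control counts within each stratum
with(as_toy, table(stratum, treat))
```

This is essentially the information that the issue table (described below) reports in a more polished form.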

In contrast to `manual_stratify`, the
`auto_stratify` function automatically creates the stratified
data set based on estimated prognostic scores. In this process, we
specify what model we want to use for the prognostic score (the
`prognosis` argument) and what percent of the control reserve
we want to use to do the fitting (the `pilot_fraction`
argument). We also need to tell `auto_stratify` which column
of our data set designates the treatment assignment
(`treat`). The final argument, `size`, specifies
roughly how large we would like our strata to be. I pick
`size = 400` here so that we get roughly the same number of
strata as when we manually stratified our data set.

In practice, `pilot_fraction` and `size` have
defaults, so we don’t need to specify them every time. Instead of
`pilot_fraction`, we could specify a `pilot_size`
to get a pilot set of approximately that number of observations. Run
`help(auto_stratify)` for more information.

```
a.strat <- auto_stratify(dat, treat = "treat", prognosis = outcome ~ X1 + X2,
                         pilot_fraction = 0.1, size = 400)
```

`## Constructing a pilot set by subsampling 10% of controls.`

`## Fitting prognostic model via logistic regression: outcome ~ X1 + X2`

```
# print and summarize the result from running auto_stratify
a.strat
```

```
## auto_strata object from package stratamatch.
##
## Function call:
## auto_stratify(data = dat, treat = "treat", prognosis = outcome ~
## X1 + X2, size = 400, pilot_fraction = 0.1)
##
## Analysis set dimensions: 4619 X 8
##
## Pilot set dimensions: 381 X 7
##
## Prognostic Score Formula:
## outcome ~ X1 + X2
```

`summary(a.strat)`

```
## Number of strata: 12
##
## Min size: 384 Max size: 385
##
## Strata with Potential Issues: 2
```

Our call to `auto_stratify` has created an
`auto_strata` object. An `auto_strata` object is
much like a `manual_strata` object, except that it contains
somewhat more information, since the stratification process was done
differently. Importantly, `auto_stratify` has partitioned 10%
of the individuals in the control group to be in the pilot set. These
individuals were used to build the prognostic score model, and were
removed from the analysis set. In order to prevent overfitting, the
observations in the pilot set should not be included in later analyses,
and should not be reused when making the final effect estimate.

We can access the pilot set and the analysis set from `a.strat`
using `$`. The command `a.strat$analysis_set`
gives the analysis set and `a.strat$pilot_set` gives the
pilot set. We can also view the prognostic score estimates for the
individuals in the analysis set with
`a.strat$prognostic_scores`.

Suppose you’re interested in partitioning your data into a pilot set
and an analysis set, but you’d like to do other things with it after
that - for example, perhaps you’d like to fit a more complicated model
for the prognostic score than a logistic regression. The
`split_pilot_set` function will split your data into a pilot
set and an analysis set based on your specified preferences. For more
information, run `help(split_pilot_set)`.

Above, we showed the default method used by
`auto_stratify`. By providing other arguments, it is possible
to construct the strata using various other methods. For example, one
could fit the prognostic scores for the analysis set separately and
provide them as an argument to `auto_stratify`.
Alternatively, the researcher could specify their own pilot set that
they would like to use to build the prognostic model. Run
`help(auto_stratify)` to see all of the possible inputs to
this function.

Both `manual_strata` and `auto_strata` objects
have a strata table, which shows the rules defining each
stratum. Just like when we extracted the analysis set, we can extract
the `strata_table` with `$`. Below, I show the
strata table for `m.strat`, our manually stratified data.
Each row describes the definitions used to make one stratum. For
example, below we see that stratum 1 contains all individuals who have
`B1 = 0`, `B2 = 0`, and `C1 = "a"`:

```
# get strata table for manually stratified data set
m.strat$strata_table
```

```
## # A tibble: 12 × 5
## # Groups: B1, B2 [4]
## B1 B2 C1 stratum size
## <int> <int> <chr> <int> <int>
## 1 0 0 a 1 359
## 2 0 0 b 2 360
## 3 0 0 c 3 1492
## 4 0 1 a 4 39
## 5 0 1 b 5 51
## 6 0 1 c 6 150
## 7 1 0 a 7 175
## 8 1 0 b 8 40
## 9 1 0 c 9 33
## 10 1 1 a 10 1533
## 11 1 1 b 11 383
## 12 1 1 c 12 385
```

The strata table for an automatically stratified data set is somewhat different:

```
# show strata table for the automatically stratified data
a.strat$strata_table
```

```
## # A tibble: 12 × 3
## stratum quantile_bin size
## <int> <chr> <int>
## 1 1 [0.0461,0.241) 385
## 2 2 [0.2409,0.307) 385
## 3 3 [0.3068,0.352) 385
## 4 4 [0.3524,0.400) 385
## 5 5 [0.4004,0.445) 385
## 6 6 [0.4454,0.482) 385
## 7 7 [0.4825,0.526) 385
## 8 8 [0.5264,0.569) 385
## 9 9 [0.5685,0.615) 385
## 10 10 [0.6149,0.668) 385
## 11 11 [0.6676,0.735) 385
## 12 12 [0.7353,0.946] 384
```

The strata table for `a.strat` shows which prognostic score
quantiles were used to divide the strata. For example, individuals in
stratum 1 have the lowest prognostic scores. This means that, if the
individuals in stratum 1 were in the control group, we would predict
them to have a very low probability of having the positive outcome,
based solely on their values of `X1` and `X2`. A
vector of the cutoff values used to divide the strata can be extracted
from `a.strat` with `extract_cut_points`.

Another important diagnostic for both manually and automatically
stratified data is the `issue_table`.

```
# issue table for manually stratified data
m.strat$issue_table
```

```
## # A tibble: 12 × 6
## Stratum Treat Control Total Control_Proportion Potential_Issues
## <int> <int> <int> <int> <dbl> <chr>
## 1 1 125 234 359 0.652 none
## 2 2 105 255 360 0.708 none
## 3 3 451 1041 1492 0.698 none
## 4 4 2 37 39 0.949 Too few samples; Small treat:…
## 5 5 3 48 51 0.941 Too few samples; Small treat:…
## 6 6 12 138 150 0.92 Small treat:control ratio
## 7 7 104 71 175 0.406 none
## 8 8 25 15 40 0.375 Too few samples
## 9 9 18 15 33 0.455 Too few samples
## 10 10 241 1292 1533 0.843 Small treat:control ratio
## 11 11 51 332 383 0.867 Small treat:control ratio
## 12 12 55 330 385 0.857 Small treat:control ratio
```

This table shows each of the strata that we obtained through manual
stratification and some of the potential issues we may face when trying
to match within these strata. In particular, this table highlights some
strata which are too small and/or have many more control individuals
than treated individuals. The `Total` column shows the total number of
observations in each stratum. Some of the strata here are quite a bit
larger than others. In practice, data sets of about 4000 start to take a
long time to match, so that’s about the maximum size we would like for
our strata. In data sets of fewer than 100 individuals, it can be hard
to find good matches between our treated and control individuals.

One benefit of automatic stratification is that it tends to create
strata of roughly equal size, so that no strata are extremely small or
extremely large. We can see this by comparing the `Total`
column in the issue tables of the manually and automatically stratified
data sets:

```
# issue table for automatically stratified data
a.strat$issue_table
```

```
## # A tibble: 12 × 6
## Stratum Treat Control Total Control_Proportion Potential_Issues
## <int> <int> <int> <int> <dbl> <chr>
## 1 1 133 252 385 0.655 none
## 2 2 131 254 385 0.660 none
## 3 3 110 275 385 0.714 none
## 4 4 112 273 385 0.709 none
## 5 5 106 279 385 0.725 none
## 6 6 124 261 385 0.678 none
## 7 7 78 307 385 0.797 none
## 8 8 75 310 385 0.805 Small treat:control ratio
## 9 9 97 288 385 0.748 none
## 10 10 81 304 385 0.790 none
## 11 11 84 301 385 0.782 none
## 12 12 61 323 384 0.841 Small treat:control ratio
```

Plots are an important way to check how our stratification went.
There are two types of plots that we can make automatically with
either a `manual_strata` object or an
`auto_strata` object. The first is a size-ratio plot. This
plots each stratum in the analysis set based on its size and the
percentage of control observations. If a stratum is too small or is very
imbalanced in its treat:control ratio, finding quality matches may be
difficult (shaded yellow areas). If a stratum is too large, matching may
be very computationally taxing (shaded red areas). We can see here that
a few strata from our manual stratification are quite small, and some
consist almost entirely of control individuals.

```
# size-ratio plot for manually stratified data
plot(m.strat)
```

```
# plot(m.strat, type = "SR") will give the same output
# plot(m.strat, label = TRUE) will allow the user to click points to label them
```

We can compare this with a size-ratio plot from automatic stratification:

```
# size-ratio plot for automatically stratified data
plot(a.strat)
```

This plot clearly shows that the strata generated by auto_stratify are all about the same size. However, some of the strata still have a treat:control ratio much less than 1:1. Some of this is unavoidable because our data set naturally had a treat:control ratio of about 1:5. However, we should try to avoid having very extreme treat:control ratios in any of our strata.

The second plot we can make to diagnose issues in both manually and
automatically stratified data is an overlap histogram. Below, I’ve made
an overlap histogram for all strata, and an overlap histogram for
stratum 3. This shows the propensity score distributions of the treated
and control populations. In order to make this histogram, we must tell
the `plot` function what the propensity scores are for our
data set by specifying the `propensity` argument. We have
three options: `propensity` can be a propensity score formula
(which will then be fit on the analysis set), a `glm`
modeling the propensity score, or a vector of propensity scores. Here, I
just pass in a formula.

```
# propensity score histograms for all strata from manually stratified data
plot(m.strat, type = "hist", propensity = treat ~ X2 + X1 + B1 + B2)
```

```
# propensity score histogram for stratum 3 from manually stratified data
plot(m.strat, type = "hist", propensity = treat ~ X2 + X1 + B1 + B2, stratum = 3)
```

A similar command works for automatically stratified data:

```
# propensity score histograms for stratum 3 from automatically stratified data
plot(a.strat, type = "hist", propensity = treat ~ X2 + X1 + B1 + B2, stratum = 3)
```

In addition to the two plot types above, there are some plots that
can be used to visualize `auto_strata` objects only:
`"residual"` and `"AC"` plots.

Setting `type = "AC"` in the plot command produces an
Assignment-Control plot. This shows each individual in terms of their
estimated propensity score and their estimated prognostic score (for
more details, see Aikens, Greaves, and Baiocchi
(2020)). This allows us to check how well the treated and control
samples overlap not only in their probability of treatment but also in
their expected outcome under the control assignment. Just as with the
overlap histograms, we need to provide information on the propensity
score. Below, I use the same propensity formula as in the histogram
above. The first plot shows all of the strata, with lines indicating the
boundaries between strata, while the second plot looks specifically at
stratum 3 by specifying the `stratum` argument. Apparent
striations or groupings in Assignment-Control plots like the ones below
are common; they appear when some binary or categorical covariate has a
strong association with either the outcome or the treatment assignment.

```
# make an Assignment-Control plot
plot(a.strat, type = "AC", propensity = treat ~ X1 + X2 + B1 + B2)
```

```
# make an Assignment-Control plot
plot(a.strat, type = "AC", propensity = treat ~ X1 + X2 + B1 + B2, stratum = 3)
```

Finally, if we fit a prognostic model, then we can run diagnostics on
that prognostic model with `type = "residual"`. This is
essentially just a wrapper for the `glm` plotting method. It
produces all the familiar diagnostic plots that we can obtain from
calling `plot` on a fitted model from `glm`.

```
# diagnostic plots for prognostic model
plot(a.strat, type = "residual")
```

`# plot(a.strat$prognostic_model) will do the same thing - see below`

The `plot` command with `type = "residual"`
gives us a one-line command that shows us standard diagnostic plots for
the prognostic score model that we fit on the pilot set. If we want to
run extra diagnostics on our prognostic model, we can do that too. Our
`a.strat` stores a copy of the prognostic score model, which
we can extract in the usual way with `$`. By extracting the
prognostic model (`prognostic_model`), we can run any of the
usual diagnostics for our fitted model.

```
# extract prognostic model from a.strat
progmod <- a.strat$prognostic_model
# as an example, summarize model coefficients
summary(progmod)
```

```
##
## Call:
## glm(formula = prognostic_formula, family = "binomial", data = dat)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.7627 -1.0509 -0.6498 1.0808 1.9692
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.17106 0.15821 -1.081 0.280
## X1 -0.77962 0.12341 -6.318 2.66e-10 ***
## X2 0.09660 0.09498 1.017 0.309
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 527.73 on 380 degrees of freedom
## Residual deviance: 480.76 on 378 degrees of freedom
## AIC: 486.76
##
## Number of Fisher Scoring iterations: 4
```

Based on the coefficients from the prognostic model, we can tell that
our prognostic score estimates are primarily determined by each
individual’s `X1` baseline value.

Now that we have stratified the data, we can match the individuals
within each stratum. Note that all of the examples that follow
additionally require the R package `optmatch`
(Ben B. Hansen and Klopfer (2006)) to be installed.

The most flexible strategy - although occasionally more complex - is
for the researcher to do this step by hand, using the strata
designations in the analysis set returned by either one of the
stratification methods. If the researcher plans to do something very
specific or nuanced in the matching process, this may be the best
approach. For example, users proficient with the matching package
`optmatch` (Ben B. Hansen and Klopfer
(2006)) will note that adding `+ strata(stratum)` to
the matching formula supplied to `optmatch` will request that it
perform the matching within each stratum in the analysis set (see the
`optmatch` documentation for further details). Another approach is to
divide the `analysis_set` into separate data frames and match on
those individually, perhaps distributing over several computing clusters
or using a different matching scheme in each stratum (for example, a
different \(k\) in \(1:k\) matching).
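
The split-and-match-separately approach can be sketched with base R’s `split`. The data frame below is a toy stand-in; in practice you would split `a.strat$analysis_set` on its `stratum` column:

```r
# Toy analysis set with a stratum assignment for each row
toy_set <- data.frame(id = 1:20, stratum = rep(1:4, each = 5))
# one data frame per stratum; each element can now be matched
# independently, e.g. distributed across workers or with a
# different 1:k scheme in each stratum
by_stratum <- split(toy_set, toy_set$stratum)
sapply(by_stratum, nrow)
```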

If the goal is something simpler, e.g. a 1-to-1 or 1-to-\(k\) optimal propensity score matching
within strata, the `strata_match` function can do that
automatically. Below, I match one control individual to each treated
individual within strata in the analysis set provided by `a.strat`. I
use the propensity score formula `treat ~ X1 + X2 + B1 + B2`.

```
# match the automatically stratified data
mymatch <- strata_match(a.strat, treat ~ X1 + X2 + B1 + B2, k = 1)
```

`## Fitting propensity model: treat ~ X1 + X2 + B1 + B2`


```
# summarize matching results
summary(mymatch)
```

```
## Structure of matched sets:
## 1:1 0:1
## 1192 2235
## Effective Sample Size: 1192
## (equivalent number of matched pairs).
```

This function in turn calls `optmatch` to do the matching,
stratified by the strata assignments. It returns an `optmatch`
matching. In essence, each individual is given an identification code
for its match. `<NA>` indicates that the individual was not
matched, and an identifier such as `3.15` indicates that this
individual was in stratum 3 and was placed in match 15.

```
# add match information as a column in the data set
matched_data <- a.strat$analysis_set
matched_data$match <- as.character(mymatch)
head(matched_data)
```

```
## X1 X2 B1 B2 C1 treat outcome stratum match
## 1 0.93332697 0.22256993 1 1 b 0 0 2 <NA>
## 2 -0.52503178 1.07623706 1 0 c 1 0 9 9.22
## 3 1.81443979 2.83989785 1 1 b 0 0 1 <NA>
## 4 0.08304562 0.04686229 0 0 a 1 0 5 5.82
## 5 -0.36031653 1.69169297 0 0 c 0 1 8 <NA>
## 6 0.14285392 0.68549042 1 1 a 0 0 6 <NA>
```

Effect estimation can then be done using the matched analysis set via the researcher’s method of choice.

Aikens, Rachael C, Dylan Greaves, and Michael Baiocchi. 2020. “A
Pilot Design for Observational Studies: Using Abundant Data
Thoughtfully.” *Statistics in Medicine*.

Antonelli, Joseph, Matthew Cefalu, Nathan Palmer, and Denis Agniel.
2018. “Doubly Robust Matching Estimators for High Dimensional
Confounding Adjustment.” *Biometrics* 74 (4): 1171–79.

Hansen, Ben B. 2008. “The Prognostic Analogue of the Propensity
Score.” *Biometrika* 95 (2): 481–88.

Hansen, Ben B., and Stephanie Olsen Klopfer. 2006. “Optimal Full
Matching and Related Designs via Network Flows.” *Journal of
Computational and Graphical Statistics* 15 (3): 609–27.

Leacy, Finbarr P, and Elizabeth A Stuart. 2014. “On the Joint Use
of Propensity and Prognostic Scores in Estimation of the Average
Treatment Effect on the Treated: A Simulation Study.”
*Statistics in Medicine* 33 (20): 3488–508.

Rosenbaum, Paul R. 2005. “Heterogeneity and Causality: Unit
Heterogeneity and Design Sensitivity in Observational Studies.”
*The American Statistician* 59 (2): 147–52.