Sampling weights (also known as survey weights) frequently appear when using large, representative datasets. They are required to ensure any estimated quantities generalize to a target population defined by the weights. Evidence suggests that sampling weights need to be incorporated into a propensity score matching analysis to obtain valid and unbiased estimates of the treatment effect in the sampling weighted population (DuGoff, Schuler, and Stuart 2014; Austin, Jembere, and Chiu 2016; Lenis et al. 2019). In this guide, we demonstrate how to use sampling weights with `MatchIt`

for propensity score estimation, balance assessment, and effect estimation. Fortunately, doing so is not complicated, but some care must be taken to ensure sampling weights are incorporated correctly. It is assumed one has read the other vignettes explaining matching (`vignette("matching-methods")`

), balance assessment (`vignette("assessing-balance")`

), and effect estimation (`vignette("estimating-effects")`

.

We will use the same simulated toy dataset used in `vignette("estimating-effects")`

except with the addition of a sampling weights variable, `SW`

, which is used to generalize the sample to a specific target population with a distribution of covariates different from that of the sample. Code to generate the covariates, treatment, and outcome is at the bottom of `vignette("estimating-effects")`

and code to generate the sampling weights is at the end of this document. We will consider the effect of binary treatment `A`

on continuous outcome `Y_C`

, adjusting for confounders `X1`

-`X9`

.

`head(d)`

```
## A X1 X2 X3 X4 X5 X6 X7 X8 X9 Y_C SW
## 1 0 0.1725 -1.4283 -0.4103 -2.36059 1 -1.1199 0.6398 -0.4840 -0.59385 -3.5907 1.675
## 2 0 -1.0959 0.8463 0.2456 -0.12333 1 -2.2687 -1.4491 -0.5514 -0.31439 -1.5481 1.411
## 3 0 0.1768 0.7905 -0.8436 0.82366 1 -0.2221 0.2971 -0.6966 -0.69516 6.0714 2.332
## 4 0 -0.4595 0.1726 1.9542 -0.62661 1 -0.4019 -0.8294 -0.5384 0.20729 2.4906 1.644
## 5 1 0.3563 -1.8121 0.8135 -0.67189 1 -0.8297 1.7297 -0.6439 -0.02648 -0.6687 2.722
## 6 0 -2.4313 -1.7984 -1.2940 0.04609 1 -1.2419 -1.1252 -1.8659 -0.56513 -9.8504 14.773
```

```
library("MatchIt")
library("lmtest")
library("sandwich")
```

When using sampling weights with propensity score matching, one has the option of including the sampling weights in the model used to estimate the propensity scores. Although evidence is mixed on whether this is required (Austin, Jembere, and Chiu 2016; Lenis et al. 2019), it can be a good idea. The choice should depend on whether including the sampling weights improves the quality of the matches. Specifications including and excluding sampling weights should be tried to determine which is preferred.

To supply sampling weights to the propensity score-estimating function in `matchit()`

, the sampling weights variable should be supplied to the `s.weights`

argument. It can be supplied either as a numerical vector containing the sampling weights, or a string or one-sided formula with the name of the sampling weights variable in the supplied dataset. Below we demonstrate including sampling weights into propensity scores estimated using logistic regression for optimal full matching for the average treatment effect in the population (ATE) (note that all methods and steps apply the same way to all forms of matching and all estimands).

```
<- matchit(A ~ X1 + X2 + X3 + X4 + X5 +
mF_s + X7 + X8 + X9, data = d,
X6 method = "full", distance = "glm",
estimand = "ATE", s.weights = ~SW)
mF_s
```

```
## A matchit object
## - method: Optimal full matching
## - distance: Propensity score
## - estimated with logistic regression
## - sampling weights included in estimation
## - number of obs.: 2000 (original), 2000 (matched)
## - sampling weights: present
## - target estimand: ATE
## - covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9
```

Notice that the description of the matching specification when the `matchit`

object is printed includes lines indicating that the sampling weights were included in the estimation of the propensity score and that they are present in the `matchit`

object. It is stored in the `s.weights`

component of the `matchit`

object. Note that at this stage, the matching weights (stored in the `weights`

component of the `matchit`

object) do not incorporate the sampling weights; they are calculated simply as a result of the matching.

Now let’s perform full matching on a propensity score that does not include the sampling weights in its estimation. Here we use the same specification as was used in `vignette("estimating-effects")`

.

```
<- matchit(A ~ X1 + X2 + X3 + X4 + X5 +
mF + X7 + X8 + X9, data = d,
X6 method = "full", distance = "glm",
estimand = "ATE")
mF
```

```
## A matchit object
## - method: Optimal full matching
## - distance: Propensity score
## - estimated with logistic regression
## - number of obs.: 2000 (original), 2000 (matched)
## - target estimand: ATE
## - covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9
```

Notice that there is no mention of sampling weights in the description of the matching specification. However, to properly assess balance and estimate effects, we need the sampling weights to be included in the `matchit`

object, even if they were not used at all in the matching. To do so, we use the function `add_s.weights()`

, which adds sampling weights to the supplied `matchit`

objects.

```
<- add_s.weights(mF, ~SW)
mF
mF
```

```
## A matchit object
## - method: Optimal full matching
## - distance: Propensity score
## - estimated with logistic regression
## - sampling weights not included in estimation
## - number of obs.: 2000 (original), 2000 (matched)
## - sampling weights: present
## - target estimand: ATE
## - covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9
```

Now when we print the `matchit`

object, we can see lines have been added identifying that sampling weights are present but they were not used in the estimation of the propensity score used in the matching.

Note that not all methods can involve sampling weights in the estimation. Only methods that use the propensity score will be affected by sampling weights; coarsened exact matching or Mahalanobis distance optimal pair matching, for example, ignore the sampling weights, and some propensity score estimation methods, like `randomForest`

and `bart`

(as presently implemented), cannot incorporate sampling weights. Sampling weights should still be supplied to `matchit()`

even when using these methods to avoid having to use `add_s.weights()`

and remembering which methods do or do not involve sampling weights.

Now we need to decide which matching specification is the best to use for effect estimation. We do this by selecting the one that yields the best balance without sacrificing remaining effective sample size. Because the sampling weights are incorporated into the `matchit`

object, the balance assessment tools in `plot.matchit()`

and `summary.matchit()`

incorporate them into their output.

We’ll use `summary()`

to examine balance on the two matching specifications. With sampling weights included, the balance statistics for the unmatched data are weighted by the sampling weights. The balance statistics for the matched data are weighted by the product of the sampling weights and the matching weights. It is the product of these weights that will be used in estimating the treatment effect. Below we use `summary()`

to display balance for the two matching specifications. No additional arguments to `summary()`

are required for it to use the sampling weights; as long as they are in the `matchit`

object (either due to being supplied with the `s.weights`

argument in the call to `matchit()`

or to being added afterward by `add_s.weights()`

), they will be correctly incorporated into the balance statistics.

```
#Balance before matching and for the SW propensity score full matching
summary(mF_s, improvement = FALSE)
```

```
##
## Call:
## matchit(formula = A ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 +
## X9, data = d, method = "full", distance = "glm", estimand = "ATE",
## s.weights = ~SW)
##
## Summary of Balance for All Data:
## Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max
## distance 0.470 0.225 1.200 1.533 0.281 0.467
## X1 0.185 -0.120 0.287 1.309 0.105 0.171
## X2 -0.565 -0.202 -0.368 0.854 0.108 0.194
## X3 -0.059 -0.057 -0.003 0.739 0.041 0.083
## X4 0.852 0.150 0.680 1.016 0.180 0.293
## X5 0.567 0.713 -0.307 . 0.146 0.146
## X6 0.169 -0.008 0.171 1.014 0.048 0.103
## X7 0.378 -0.089 0.474 1.205 0.130 0.210
## X8 -0.376 -0.142 -0.216 1.190 0.071 0.129
## X9 0.088 -0.001 0.090 0.945 0.027 0.066
##
##
## Summary of Balance for Matched Data:
## Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max Std. Pair Dist.
## distance 0.316 0.292 0.117 1.042 0.030 0.073 0.007
## X1 0.025 -0.035 0.057 1.211 0.038 0.086 0.974
## X2 -0.386 -0.274 -0.113 1.058 0.031 0.091 1.079
## X3 -0.101 -0.015 -0.092 0.656 0.046 0.109 1.133
## X4 0.353 0.338 0.014 1.186 0.023 0.068 0.910
## X5 0.669 0.667 0.003 . 0.002 0.002 0.830
## X6 0.043 0.061 -0.017 1.148 0.028 0.083 1.177
## X7 0.156 0.030 0.128 1.185 0.029 0.074 1.068
## X8 -0.171 -0.224 0.048 1.242 0.017 0.046 0.979
## X9 0.000 0.036 -0.037 1.139 0.026 0.086 1.146
##
## Sample Sizes:
## Control Treated
## All (ESS) 987.1 177.1
## All 1559. 441.
## Matched (ESS) 517.4 152.
## Matched 1559. 441.
## Unmatched 0. 0.
## Discarded 0. 0.
```

```
#Balance for the non-SW propensity score full matching
summary(mF, un = FALSE)
```

```
##
## Call:
## matchit(formula = A ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 +
## X9, data = d, method = "full", distance = "glm", estimand = "ATE")
##
## Summary of Balance for Matched Data:
## Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max Std. Pair Dist.
## distance 0.285 0.275 0.051 0.995 0.021 0.057 0.009
## X1 -0.101 0.016 -0.110 1.082 0.030 0.066 0.932
## X2 -0.317 -0.250 -0.067 0.973 0.029 0.090 1.117
## X3 -0.037 -0.067 0.032 0.699 0.039 0.080 1.137
## X4 0.462 0.286 0.171 1.136 0.047 0.128 0.987
## X5 0.656 0.669 -0.026 . 0.012 0.012 0.806
## X6 0.036 -0.025 0.059 1.077 0.025 0.059 1.123
## X7 0.054 0.031 0.023 1.042 0.012 0.056 0.986
## X8 -0.200 -0.213 0.012 1.077 0.022 0.078 0.977
## X9 -0.026 0.042 -0.069 0.894 0.034 0.105 1.098
##
## Sample Sizes:
## Control Treated
## All (ESS) 987.1 177.1
## All 1559. 441.
## Matched (ESS) 589.1 152.7
## Matched 1559. 441.
## Unmatched 0. 0.
## Discarded 0. 0.
```

The results of the two matching specifications are similar. Balance appears to be slightly better when using the sampling weight-estimated propensity scores than when using the unweighted propensity scores. However, the effective sample size for the control group is larger when using the unweighted propensity scores. Neither propensity score specification achieves excellent balance, and more fiddling with the matching specification (e.g., by changing the method of estimating propensity scores, the type of matching, or the options used with the matching) might yield a better matched set. For the purposes of this analysis, we will move forward with the matching that used the sampling weight-estimated propensity scores (`mF_s`

) because of its superior balance. Some of the remaining imbalance may be eliminated by adjusting for the covariates in the outcome model.

Note that had we not added sampling weights to `mF`

, the matching specification that did not include the sampling weights, our balance assessment would be inaccurate because the balance statistics would not include the sampling weights. In this case, in fact, assessing balance on `mF`

without incorporated the sampling weights would have yielded radically different results and a different conclusion. It is critical to incorporate sampling weights into the `matchit`

object using `add_s.weights()`

even if they are not included in the propensity score estimation.

Estimating the treatment effect after matching is straightforward when using sampling weights. Effects are estimated in the same way as when sampling weights are excluded, except that the matching weights must be multiplied by the sampling weights to yield accurate, generalizable estimates. `match.data()`

and `get_matches()`

do this automatically, so the weights produced by these functions already are a product of the matching weights and the sampling weights. Note this will only be true if sampling weights are incorporated into the `matchit`

object.

Below we estimate the effect of `A`

on `Y_C`

in the matched and sampling weighted sample, adjusting for the covariates to improve precision and decrease bias.

```
<- match.data(mF_s)
md_F_s
<- lm(Y_C ~ A + X1 + X2 + X3 + X4 + X5 +
fit + X7 + X8 + X9, data = md_F_s,
X6 weights = weights)
coeftest(fit, vcov. = vcovCL, cluster = ~subclass)["A",,drop=FALSE]
```

```
## Estimate Std. Error t value Pr(>|t|)
## A 1.74 0.3815 4.56 5.43e-06
```

Note that `match.data()`

and `get_weights()`

have the option `include.s.weights`

, which, when set to `FALSE`

, makes it so the returned weights do not incorporate the sampling weights and are simply the matching weights. To incorporate sampling weights into the effect estimation with this option, one must multiply the returned weights by the sampling weights when including them in the outcome model estimation. We demonstrate this below:

```
<- match.data(mF_s, include.s.weights = FALSE)
md_F_s
<- lm(Y_C ~ A + X1 + X2 + X3 + X4 + X5 +
fit + X7 + X8 + X9, data = md_F_s,
X6 weights = weights * SW)
coeftest(fit, vcov. = vcovCL, cluster = ~subclass)["A",,drop=FALSE]
```

```
## Estimate Std. Error t value Pr(>|t|)
## A 1.74 0.3815 4.56 5.43e-06
```

We get the same estimates using this syntax as we did above. Because one might to forget to multiply the two sets of weights together, it is easier to just use the default of `include.s.weights = TRUE`

and ignore the sampling weights in the rest of the analysis (because they are already included in the returned weights).

```
#Generatng data similar to Austin (2009) for demonstrating
#treatment effect estimation with sampling weights
<- function(n) {
gen_X <- matrix(rnorm(9 * n), nrow = n, ncol = 9)
X 5] <- as.numeric(X[,5] < .5)
X[,
X
}
#~20% treated
<- function(X) {
gen_A <- - 1.2 + log(2)*X[,1] - log(1.5)*X[,2] + log(2)*X[,4] - log(2.4)*X[,5] +
LP_A log(2)*X[,7] - log(1.5)*X[,8]
<- plogis(LP_A)
P_A rbinom(nrow(X), 1, P_A)
}
# Continuous outcome
<- function(A, X) {
gen_Y_C 2*A + 2*X[,1] + 2*X[,2] + 2*X[,3] + 1*X[,4] + 2*X[,5] + 1*X[,6] + rnorm(length(A), 0, 5)
}#Conditional:
# MD: 2
#Marginal:
# MD: 2
<- function(X) {
gen_SW <- rbinom(nrow(X), 1, .3)
e 1/plogis(log(1.4)*X[,2] + log(.7)*X[,4] + log(.9)*X[,6] + log(1.5)*X[,8] + log(.9)*e +
-log(.5)*e*X[,2] + log(.6)*e*X[,4])
}
set.seed(19599)
<- 2000
n <- gen_X(n)
X <- gen_A(X)
A <- gen_SW(X)
SW
<- gen_Y_C(A, X)
Y_C
<- data.frame(A, X, Y_C, SW) d
```

Austin, Peter C., Nathaniel Jembere, and Maria Chiu. 2016. “Propensity Score Matching and Complex Surveys.” *Statistical Methods in Medical Research* 27 (4): 1240–57. https://doi.org/10.1177/0962280216658920.

DuGoff, Eva H., Megan Schuler, and Elizabeth A. Stuart. 2014. “Generalizing Observational Study Results: Applying Propensity Score Methods to Complex Surveys.” *Health Services Research* 49 (1): 284–303. https://doi.org/10.1111/1475-6773.12090.

Lenis, David, Trang Quynh Nguyen, Nianbo Dong, and Elizabeth A. Stuart. 2019. “It’s All about Balance: Propensity Score Matching in the Context of Complex Survey Data.” *Biostatistics* 20 (1): 147–63. https://doi.org/10.1093/biostatistics/kxx063.