[R] Propensity score matching with MatchIt

Suparna Mitra suparna.mitra.sm at gmail.com
Mon Jan 11 11:00:33 CET 2016


Hello R experts,
I am trying to do propensity score matching on a medical dataset comparing
two types of surgery.
However, the summary of balance for the matched data comes out identical to
the summary for all data, so the Percent Balance Improvement is zero
everywhere.

> surgery.data <- read.csv(file.choose(), header = TRUE)
> surgery.data
   Sample Surgerytype Age ASAgrade  BMI FIGOstage PreviousAbdoSurgery
1       2           1  41        1 22.3         3                   0
2       4           1  49        2 19.5         3                   0
3       5           1  58        2 28.8         3                   0
4       8           1  34        1 29.1         3                   0
5       9           1  49        1 25.1         3                   0
6      13           1  30        2 29.0         3                   0
7      14           1  31        1 23.6         3                   0
8      15           1  29        1 33.7         3                   2
9      20           1  25        1 24.6         3                   0
10     28           1  28        1 21.0         3                   0
11     29           1  29        2 21.4         3                   0
12     30           1  61        1 25.2         3                   3
13     32           1  48        1 22.7         3                   0
14     33           1  24        1 26.1         3                   3
15     34           1  39        1 23.7         3                   0
16     36           1  39        2 34.6         3                   1
17     37           1  68        2 27.0         3                   0
18     49           1  71        2 30.8         3                   3
19     50           1  73        2 25.8         3                   0
20     54           1  30        2 23.1         3                   0
21     65           1  45        2 34.6         3                   0
22     77           1  41        1 29.8         3                   3
23     82           1  41        2 33.8         3                   0
24     86           1  34        1 34.7         3                   0
25     87           1  28        2 21.4         3                   0
26     88           1  35        1 25.5         3                   2
27     89           1  46        1 31.9         3                   1
28     91           1  48        2 20.7         3                   0
29     92           1  28        2 22.4         3                   2
30     96           1  45        1 22.7         3                   1
31     97           1  39        2 19.7         3                   1
32     98           1  34        1 27.6         3                   2
33    101           1  41        1 22.5         3                   0
34    107           1  31        2 31.0         3                   0
35    113           1  51        2 33.2         3                   0
36    114           1  43        2 22.5         3                   2
37      6           0  50        1 22.9         3                   0
38      7           0  43        2 25.6         3                   0
39     11           0  43        1 23.8         3                   2
40     12           0  31        1 22.0         3                   0
41     16           0  31        1 27.2         3                   2
42     17           0  34        1 19.6         3                   0
43     18           0  56        3 25.2         3                   0
44     21           0  39        1 26.6         3                   0
45     25           0  64        2 24.5         3                   0
46     45           0  61        1 21.9         3                   0
47     47           0  64        1 28.5         3                   0
48     53           0  54        2 26.8         5                   0
49     55           0  40        1 23.1         3                   0
50     57           0  46        1 26.2         3                   3
51     59           0  34        1 21.5         3                   0
52     62           0  25        2 23.8         3                   0
53     63           0  56        2 24.6         3                   0
54     64           0  45        1 24.2         3                   0
55     66           0  42        1 30.4         3                   0
56     67           0  49        2 35.8         2                   0
57     69           0  63        1 24.7         3                   0
58     70           0  29        1 29.7         5                   0
59     71           0  39        1 19.9         3                   3
60     73           0  62        1 28.0         3                   0
61     74           0  24        1 26.7         3                   0
62     75           0  70        2 31.2         3                   4
63     76           0  42        2 23.0         3                   0
64     79           0  56        1 34.9         3                   0
65     81           0  40        1 25.0         3                   0
66     83           0  39        2 29.6         3                   4
67     84           0  58        1 22.1         1                   0
68    104           0  36        1 28.6         3                   0
69    105           0  37        1 31.2         3                   0
70    109           0  33        1 25.0         3                   0
71    110           0  37        1 25.8         3                   0
72    111           0  34        1 21.0         3                   2
> m.out1 <- matchit(Surgerytype ~ Age + ASAgrade + BMI + FIGOstage +
PreviousAbdoSurgery, data = surgery.data, method = "nearest", distance =
"logit")
> summary(m.out1) # check balance

Call:
matchit(formula = Surgerytype ~ Age + ASAgrade + BMI + FIGOstage +
    PreviousAbdoSurgery, data = surgery.data, method = "nearest",
    distance = "logit")

Summary of balance for all data:
                    Means Treated Means Control SD Control Mean Diff eQQ Med eQQ Mean eQQ Max
distance                   0.5426        0.4574     0.1429    0.0853  0.0913   0.0867  0.1686
Age                       41.2778       44.6111    12.2528   -3.3333  4.0000   4.1111 10.0000
ASAgrade                   1.5000        1.3056     0.5248    0.1944  0.0000   0.2500  1.0000
BMI                       26.4194       25.8500     3.8345    0.5694  0.8500   1.1472  3.5000
FIGOstage                  3.0000        3.0278     0.6088   -0.0278  0.0000   0.1944  2.0000
PreviousAbdoSurgery        0.7222        0.5556     1.2058    0.1667  0.0000   0.2778  2.0000


Summary of balance for matched data:
                    Means Treated Means Control SD Control Mean Diff eQQ Med eQQ Mean eQQ Max
distance                   0.5426        0.4574     0.1429    0.0853  0.0913   0.0867  0.1686
Age                       41.2778       44.6111    12.2528   -3.3333  4.0000   4.1111 10.0000
ASAgrade                   1.5000        1.3056     0.5248    0.1944  0.0000   0.2500  1.0000
BMI                       26.4194       25.8500     3.8345    0.5694  0.8500   1.1472  3.5000
FIGOstage                  3.0000        3.0278     0.6088   -0.0278  0.0000   0.1944  2.0000
PreviousAbdoSurgery        0.7222        0.5556     1.2058    0.1667  0.0000   0.2778  2.0000

Percent Balance Improvement:
                    Mean Diff. eQQ Med eQQ Mean eQQ Max
distance                     0       0        0       0
Age                          0       0        0       0
ASAgrade                     0       0        0       0
BMI                          0       0        0       0
FIGOstage                    0       0        0       0
PreviousAbdoSurgery          0       0        0       0
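
The zeros in this table follow from the two balance summaries being identical. As a rough sketch, assuming percent balance improvement is computed as 100 * (|before| - |after|) / |before| (my reading of the MatchIt documentation, not checked against this version), the mean differences printed above give the following. The helper name `pct_improvement` is hypothetical.

```r
# Hypothetical helper; the formula is an assumption about how MatchIt
# defines percent balance improvement.
pct_improvement <- function(before, after) {
  100 * (abs(before) - abs(after)) / abs(before)
}

# "Mean Diff" column copied from the summary for all data
before <- c(distance = 0.0853, Age = -3.3333, ASAgrade = 0.1944,
            BMI = 0.5694, FIGOstage = -0.0278, PreviousAbdoSurgery = 0.1667)

# The summary for matched data reports exactly the same values
after <- before

improvement <- pct_improvement(before, after)
print(improvement)
```

So the zeros themselves are not a separate problem; the question is why the matched sample has the same balance as the full sample.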

Sample sizes:
          Control Treated
All            36      36
Matched        36      36
Unmatched       0       0
Discarded       0       0
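
One possible reading of the sample sizes table, offered as an assumption rather than a diagnosis: with 36 treated and 36 control units, 1:1 nearest-neighbour matching without replacement pairs every treated unit with some control, so nothing is dropped and the matched sample is the full sample. The base-R sketch below illustrates this on simulated data. It is a simplified greedy matcher, not MatchIt's implementation (for instance, it processes treated units in data order), and all variable and function names are hypothetical.

```r
# Simulated data with equal group sizes (not the surgery data)
set.seed(1)
n <- 36
d <- data.frame(
  treat = rep(c(1, 0), each = n),
  age   = c(rnorm(n, 41, 12), rnorm(n, 45, 12))
)

# Propensity score from a logistic regression
d$ps <- fitted(glm(treat ~ age, data = d, family = binomial))

# Greedy 1:1 nearest-neighbour matching without replacement
greedy_match <- function(ps, treat) {
  treated  <- which(treat == 1)
  controls <- which(treat == 0)
  pairs <- integer(0)
  for (i in treated) {
    if (length(controls) == 0) break
    j <- controls[which.min(abs(ps[controls] - ps[i]))]
    pairs <- c(pairs, j)
    controls <- setdiff(controls, j)  # each control is used at most once
  }
  pairs
}

matched_controls <- greedy_match(d$ps, d$treat)
keep <- c(which(d$treat == 1), matched_controls)

mean_diff <- function(x, treat) mean(x[treat == 1]) - mean(x[treat == 0])
all_diff     <- mean_diff(d$age, d$treat)
matched_diff <- mean_diff(d$age[keep], d$treat[keep])

# Every control ends up matched, so balance cannot change
print(length(matched_controls))
print(all.equal(all_diff, matched_diff))
```

If that reading is right, balance can only change when some units are left out or reused, for example through the `caliper`, `replace`, or `discard` arguments of `matchit()`.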

But if I test Age or BMI separately, the group means do differ, as the
results below show:
> summary(lm(Age~Surgerytype,data=surgery.data))

Call:
lm(formula = Age ~ Surgerytype, data = surgery.data)

Residuals:
    Min      1Q  Median      3Q     Max
-20.611 -10.361  -2.278   7.722  31.722

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   37.944      4.663   8.138 1.02e-11 ***
Surgerytype    3.333      2.949   1.130    0.262
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.51 on 70 degrees of freedom
Multiple R-squared: 0.01792, Adjusted R-squared: 0.003895
F-statistic: 1.278 on 1 and 70 DF,  p-value: 0.2622


######
> summary(lm(BMI~Surgerytype,data=surgery.data))

Call:
lm(formula = BMI ~ Surgerytype, data = surgery.data)

Residuals:
   Min     1Q Median     3Q    Max
-6.919 -3.719 -0.850  2.698  9.950

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  26.9889     1.6089   16.77   <2e-16 ***
Surgerytype  -0.5694     1.0176   -0.56    0.578
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.317 on 70 degrees of freedom
Multiple R-squared: 0.004454, Adjusted R-squared: -0.009768
F-statistic: 0.3132 on 1 and 70 DF,  p-value: 0.5775

##Or a t-test for Age
> t.test(surgery.data$Age[surgery.data$Surgerytype == 1],
         surgery.data$Age[surgery.data$Surgerytype == 2], paired = FALSE)

Welch Two Sample t-test

data:  surgery.data$Age[surgery.data$Surgerytype == 1] and
surgery.data$Age[surgery.data$Surgerytype == 2]
t = -1.1303, df = 69.883, p-value = 0.2622
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -9.215118  2.548451
sample estimates:
mean of x mean of y
 41.27778  44.61111
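
The regression and t-test output above can be cross-checked with simple arithmetic. The sketch below uses only numbers printed in this message. The fitted intercepts and slopes reproduce the two group means when Surgerytype takes the values 1 and 2, which suggests (an assumption, since the data listing shows 0 and 1) that these models were run on a differently coded copy of the data.

```r
# Coefficients copied from the two lm() summaries above
age_intercept <- 37.944
age_slope     <- 3.333
bmi_intercept <- 26.9889
bmi_slope     <- -0.5694

# Fitted group means if Surgerytype is coded 1 and 2
age_fit <- age_intercept + age_slope * c(1, 2)
bmi_fit <- bmi_intercept + bmi_slope * c(1, 2)

# Group means reported by t.test() and by the MatchIt balance summary
age_reported <- c(41.27778, 44.61111)
bmi_reported <- c(26.4194, 25.8500)

print(round(age_fit - age_reported, 3))
print(round(bmi_fit - bmi_reported, 3))
```

The differences are only rounding error, so the regressions, the t-test, and the balance summary all describe the same two group means.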

======
Maybe I am making a silly mistake. Can anybody please help me?
Thanks a lot,
Mitra
