[R] Matching package - Match function

sunny sunayan at gmail.com
Sat Mar 26 04:39:27 CET 2011


I have figured out the solution to this. It appears that "distance" isn't
defined as the raw difference in propensity scores. Instead, the difference
is divided by the standard deviation of the fitted values (the predicted
probabilities) of the corresponding logistic regression, and the result is
then squared.

Therefore, instead of checking the absolute difference:

> summary(abs(logit.reg$fitted[match$index.treated]-logit.reg$fitted[match$index.control])) 
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
7.453e-13 2.959e-07 5.849e-07 5.842e-07 8.741e-07 1.167e-06 

the correct check is the scaled, squared difference:

> summary(((logit.reg$fitted[match$index.treated]-logit.reg$fitted[match$index.control])/sd(logit.reg$fitted))^2)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
4.078e-18 6.426e-07 2.512e-06 3.337e-06 5.609e-06 1.000e-05 

which then gives the expected maximum value of 1e-5, as set in the
distance.tolerance option.
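
If the scaling really works as described above, a tolerance can be converted
to and from raw propensity-score units. The following is only a minimal
sketch under that assumption: both helper functions are hypothetical (they
are not part of the Matching package), and the propensity scores are made-up
toy values rather than the data from my example.

```r
# Hypothetical helper (not part of the Matching package): the scaled,
# squared distance described above, for paired propensity scores.
scaled_sq_dist <- function(p_treated, p_control, p_all) {
  ((p_treated - p_control) / sd(p_all))^2
}

# Hypothetical helper: the distance.tolerance value that would correspond
# to a maximum raw propensity-score difference of `delta`, assuming the
# distance is scaled and squared as above.
tolerance_for_raw_diff <- function(delta, p_all) {
  (delta / sd(p_all))^2
}

# Toy propensity scores, made up for illustration only.
set.seed(1)
p_all <- plogis(rnorm(1000, mean = -6, sd = 0.05))

tol <- 1e-5

# Largest raw difference that would still fall within the tolerance.
delta <- sqrt(tol) * sd(p_all)

# Round trip: a pair exactly `delta` apart sits right at the tolerance.
round_trip_dist <- scaled_sq_dist(p_all[1] + delta, p_all[1], p_all)
round_trip_tol  <- tolerance_for_raw_diff(delta, p_all)
print(all.equal(round_trip_dist, tol))
print(all.equal(round_trip_tol, tol))
```

So to allow raw differences of up to some delta, one would pass
(delta/sd(logit.reg$fitted))^2 as distance.tolerance rather than delta
itself, again assuming the scaling is as I inferred it.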


sunny wrote:
> 
> Hi.
> 
> I am using the Matching package for propensity score matching. For each
> treated unit, I want to find all control units whose propensity scores lie
> within a certain distance from the treated unit. The sample code is as
> follows:
> 
>> library(Matching)
> 
>> x <- rnorm(100000)
>> y <- rnorm(100000)
>> z <- rbinom(100000,1,0.002)
> 
>> logit.reg <- glm(z~x+y,family=binomial(link='logit'))
> 
>> match <- Match(Y=NULL,Tr=z,X=logit.reg$fitted,version='fast',ties=TRUE,M=1,distance.tolerance=1e-5)
> 
> According to the function definition
> (http://sekhon.berkeley.edu/matching/Match.html):
> 
> "distance.tolerance: This is a scalar which is used to determine if
> distances between two observations are different from zero. Values less
> than distance.tolerance are deemed to be equal to zero. This option can be
> used to perform a type of optimal matching"
> 
> Thus, for each treated unit I should get all control units whose
> difference in propensity scores from the treated unit is less than 1e-5.
> However, the actual difference between the treated units' and the matched
> control units' propensity scores is distributed as follows:
> 
>> summary(abs(logit.reg$fitted[match$index.treated]-logit.reg$fitted[match$index.control]))
>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
> 7.453e-13 2.959e-07 5.849e-07 5.842e-07 8.741e-07 1.167e-06 
> 
> 




