[R] Matching in R
gustaf.rydevik at gmail.com
Mon Apr 27 12:41:51 CEST 2009
On Sun, Apr 26, 2009 at 6:22 PM, <dirk567 at gmx.de> wrote:
> Dear R users,
> I am trying to do exact matching with replacement on a large dataset (500,000 obs) with treatment and control groups of about equal size. At the moment I use the "Match" function of the "Matching" library. I match on 2 covariates, and every observation in the treatment group has at least one exact counterpart in the control group.
> Now I want to introduce observation weights. I set ties=FALSE, as I want exactly one-to-one matching. Is there a way to draw randomly from the individuals in the control group who have the same covariate values as the individual in the treatment group, with the probability of being drawn proportional to each individual's weight in the control group? E.g. I have three individuals who all have the same covariate values as the one observation I want to find a partner for, and the first of the three has a very large weight. When drawing randomly among those three, I want the probability that the first one is drawn to be very large.
> I'd really appreciate any suggestions. The "weights" option does not do the job; it seems to work only when setting "ties=TRUE".
You don't give a sample dataset, and I've not used the Matching
library, so take my comments with a scoop of salt.
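Lacking your data, here is a minimal base-R sketch of the weighted draw
you describe, done outside of Match altogether. Everything in it is made
up for illustration: the data frames "treated" and "controls", the column
names (id, x1, x2, w) and the helpers key() and draw_one() are my
assumptions, not anything from the Matching library. It groups the
controls by their covariate values and, for each treated unit, samples
one control from the matching group with probability proportional to w.

```r
# Made-up data: all object and column names are illustrative assumptions.
set.seed(1)
treated  <- data.frame(id = 1:3, x1 = c(1, 1, 2), x2 = c(0, 1, 0))
controls <- data.frame(id = 101:107,
                       x1 = c(1, 1, 1, 1, 2, 2, 1),
                       x2 = c(0, 0, 0, 1, 0, 0, 1),
                       w  = c(50, 1, 1, 2, 3, 3, 2))

# One key per covariate combination, so exact matches share a key.
key <- function(d) paste(d$x1, d$x2, sep = "_")

# Row indices of the controls, grouped by covariate combination.
pool <- split(seq_len(nrow(controls)), key(controls))

# Draw one control row for a given key, with probability proportional to w.
draw_one <- function(k) {
  rows <- pool[[k]]
  if (is.null(rows)) return(NA_integer_)   # no exact counterpart
  if (length(rows) == 1L) return(rows)     # sample(n, ...) would draw from 1:n
  sample(rows, size = 1, prob = controls$w[rows])
}

# Each treated unit draws independently, i.e. matching with replacement.
matched_row <- vapply(key(treated), draw_one, integer(1))
matches <- data.frame(treated_id = treated$id,
                      control_id = controls$id[matched_row])
```

With 500,000 observations the per-unit loop may be slow, so treat this as
a starting point rather than a tuned solution.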
Looking at the help page for Match, it seems as if the option
"Weight.matrix" is what you're looking for. Creating a "weight" column
in the treatment group with a constant, high value, including "weight"
in the matching, and giving that covariate a high importance might
work. From the help page:
This matrix denotes the weights the matching algorithm uses when
weighting each of the covariates in X—see the Weight option. This
square matrix should have as many columns as the number of columns of
the X matrix. This matrix is usually provided by a call to the
GenMatch function which finds the optimal weight each variable should
be given so as to achieve balance on the covariates.
For most uses, this matrix has zeros in the off-diagonal cells. This
matrix can be used to weight some variables more than others. For
example, if X contains three variables and we want to match as best as
we can on the first, the following would work well:
> Weight.matrix <- diag(3)
> Weight.matrix[1,1] <- 1000/var(X[,1])
> Weight.matrix[2,2] <- 1/var(X[,2])
> Weight.matrix[3,3] <- 1/var(X[,3])
This code changes the weights implied by the inverse of the variances
by multiplying the first variable by 1000 so that it is highly
weighted. In order to enforce exact matching see the exact and caliper
options.
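To get a feel for what such a weight matrix does, here is a small sketch
on simulated data. The matrix X, its column names and the helper wdist()
are made up for illustration. I am also assuming a plain quadratic-form
distance, (a - b)' W (a - b); whether Match computes exactly this
internally is something I have not checked.

```r
# Simulated covariates; X and its column names are made up for illustration.
set.seed(42)
X <- cbind(age    = rnorm(200, 40, 10),
           income = rnorm(200, 30000, 8000),
           educ   = rnorm(200, 12, 3))

# Inverse-variance weights on the diagonal, zeros in the off-diagonal cells.
Weight.matrix <- diag(1 / apply(X, 2, var))

# Up-weight the first covariate by a factor of 1000, as in the help page.
Weight.matrix[1, 1] <- 1000 * Weight.matrix[1, 1]

# Assumed distance (hypothetical helper): quadratic form (a - b)' W (a - b).
wdist <- function(a, b, W) drop(t(a - b) %*% W %*% (a - b))

# Compare a one-standard-deviation shift in the first vs. the second covariate.
base     <- X[1, ]
one_sd   <- apply(X, 2, sd)
d_first  <- wdist(base, base + c(one_sd[1], 0, 0), Weight.matrix)
d_second <- wdist(base, base + c(0, one_sd[2], 0), Weight.matrix)
```

Under that assumption a one-standard-deviation difference in the first
covariate costs 1000 times as much as the same difference in the second,
which is why matches end up as close as possible on the first variable.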
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE