[R] Matched pairs with two data frames
dwinsemius at comcast.net
Fri Apr 18 15:29:16 CEST 2008
Udo <ukoenig at med.uni-marburg.de> wrote in
news:1208462659.4807ad43cea9d at webmail.med.uni-marburg.de:
> thank you!
> I want to perform the simplest way of matching:
> a one-to-one exact match (by age and school):
> for every case in "treat" find ONE case (if there is one) in
> "control". The cases in "control" that could be matched should be
> tagged as not available or taken away (deleted) from the control
> pool (thus, the used ones are not replaced).
> #treatment group
> treat <- data.frame(age=c(1,1,2,2,2,4),
> #control group
> control <- data.frame(age=c(1,1,1,1,3,2),
> #one-to-one exact matching algorithm ????
> matched.data.frame <- ?????
> In my example I matched the cases "by hand" to make things clear.
> Case 1 from "treat" was matched with case 1 from "control",
> 2 with 2 and 3 with 6. Cases 4, 5 and 6 could not be matched,
> because there is no "partner" in "control".
> Thus my matched example data frame has 3 cases.
Is it really the case that SPSS would give the output that you describe
without any warnings about non-uniqueness? How could they live with
themselves after such arbitrary behavior? This link is evidence that
SPSS may not behave as you allege.
If you really want to persist in what cannot possibly be called "one-
to-one exact matching", but instead "arbitrary convenience matching",
then you need to construct a function that sequentially marches through
"treat", grabs the first match (perhaps with something like):
> matched.first <- merge(treat[1,],control, by= c("age","school"))[1,]
  age school out1 out2
1   1     10  9.5  1.1
... except that the "1"'s would be replaced with an index variable.
Then mark that control as "taken", perhaps by using all of the variables
as identifiers, and attempt the match and marking for each successive
case among the controls that are still available (taken == FALSE).
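
The loop described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the age vectors are the ones from the quoted example, but the school, out1 and out2 values are invented here (the original values are not shown in full), and greedy_match() is a hypothetical helper name, not an existing R function. It tracks used controls with a logical "taken" vector instead of merge(), and always takes the first available control, so the result depends on row order.

```r
# Hypothetical data: ages come from the thread; school, out1 and out2
# are made-up values for illustration only.
treat <- data.frame(age    = c(1, 1, 2, 2, 2, 4),
                    school = c(10, 10, 20, 20, 20, 30),
                    out1   = c(9.5, 2.3, 3.4, 4.1, 5.0, 6.2))
control <- data.frame(age    = c(1, 1, 1, 1, 3, 2),
                      school = c(10, 10, 10, 10, 20, 20),
                      out2   = c(1.1, 2.0, 3.5, 4.9, 6.6, 7.1))

# Hypothetical helper: march through 'treat' in row order and, for each
# case, take the first still-unused control that is equal on every key.
greedy_match <- function(treat, control, by = c("age", "school")) {
  taken   <- rep(FALSE, nrow(control))        # controls already used
  partner <- rep(NA_integer_, nrow(treat))    # matched control row, if any
  for (i in seq_len(nrow(treat))) {
    candidate <- !taken
    for (key in by) {
      candidate <- candidate & control[[key]] == treat[[key]][i]
    }
    hit <- which(candidate)
    if (length(hit) > 0) {
      partner[i]    <- hit[1]                 # arbitrary: first available
      taken[hit[1]] <- TRUE                   # no replacement
    }
  }
  ok <- !is.na(partner)
  data.frame(treat.row = which(ok), control.row = partner[ok])
}

pairs <- greedy_match(treat, control)

# Assemble the matched data frame from the row indices.
matched <- cbind(treat[pairs$treat.row, ],
                 out2 = control$out2[pairs$control.row])
```

With this made-up data, pairs holds the row indices of each matched treated case and its control, and unmatched treated cases are simply dropped. Reordering the rows of either data frame can change which controls are picked, which is the arbitrariness objected to above.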