[R] Matched pairs with two data frames
Udo
ukoenig at med.uni-marburg.de
Mon Apr 21 08:32:26 CEST 2008
David,
thanks for your comment, the code and the link.
You are right: "arbitrary" is a better word than "exact" pair matching.
I took the term "one-to-one exact matching" from the paper "MatchIt:
Nonparametric Preprocessing for Parametric Causal Inference" (p. 6):
http://gking.harvard.edu/matchit/docs/matchit.pdf
>Is it really the case that SPSS would give the output that you describe
>without any warnings about non-uniqueness?
The output I described does indeed trigger the SPSS message "Warning # duplicate
key in a file"; however, the result is what I need (discarding the lines
with missing values in V3 and V4). But I will check this again with the
treat/control data from my example here.
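
For the treat/control example quoted below, the sequential approach David
describes could be sketched roughly as follows. This is only a sketch under
my own assumptions: the helper name match_first, the logical vector taken
and the extra column control.row are made up for illustration and do not come
from any package, and "first free control in row order" is an arbitrary choice.

```r
# Example data from the original question
treat <- data.frame(age    = c(1, 1, 2, 2, 2, 4),
                    school = c(10, 10, 20, 20, 20, 11),
                    out1   = c(9.5, 2.3, 3.3, 4.1, 5.9, 4.6))
control <- data.frame(age    = c(1, 1, 1, 1, 3, 2),
                      school = c(10, 10, 10, 10, 33, 20),
                      out2   = c(1.1, 2, 3.5, 4.9, 5.2, 6.5))

# Hypothetical helper: for each row of 'treat', take the first row of
# 'control' with identical key values that has not been used yet
# (matching without replacement).
match_first <- function(treat, control, by = c("age", "school")) {
  taken   <- rep(FALSE, nrow(control))       # controls already used
  partner <- rep(NA_integer_, nrow(treat))   # matched control row per treat row
  for (i in seq_len(nrow(treat))) {
    same <- rep(TRUE, nrow(control))
    for (v in by) same <- same & control[[v]] == treat[[v]][i]
    free <- which(same & !taken)             # which() drops NA comparisons
    if (length(free) > 0) {
      partner[i] <- free[1]
      taken[free[1]] <- TRUE                 # mark this control as "taken"
    }
  }
  ok  <- !is.na(partner)
  out <- cbind(treat[ok, , drop = FALSE],
               control[partner[ok], setdiff(names(control), by), drop = FALSE],
               control.row = partner[ok])
  rownames(out) <- NULL
  out
}

matched.data.frame <- match_first(treat, control)
matched.data.frame
```

With this data the result has three rows: treat cases 1, 2 and 3 paired with
control rows 1, 2 and 6, which is the same pairing as in my example matched
"by hand".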
Kind regards
Udo
Quoting David Winsemius <dwinsemius at comcast.net>:
> Udo <ukoenig at med.uni-marburg.de> wrote in
> news:1208462659.4807ad43cea9d at webmail.med.uni-marburg.de:
>
> > Daniel,
> > thank you!
> >
> > I want to perform the simplest form of matching:
> > a one-to-one exact match (by age and school):
> > for every case in "treat" find ONE case (if there is one) in
> > "control". The cases in "control" that could be matched should be
> > tagged as not available or taken away (deleted) from the control
> > pool (thus, the used ones are not replaced).
> >
> > #treatment group
> > treat <- data.frame(age=c(1,1,2,2,2,4),
> > school=c(10,10,20,20,20,11),
> > out1=c(9.5,2.3,3.3,4.1,5.9,4.6))
> >
> > #control group
> > control <- data.frame(age=c(1,1,1,1,3,2),
> > school=c(10,10,10,10,33,20),
> > out2=c(1.1,2,3.5,4.9,5.2,6.5))
> >
> > #one-to-one exact matching algorithm ????
> >
> > matched.data.frame <- ?????
> >
> > In my example I matched the cases "by hand" to make things clear.
> > Case 1 from "treat" was matched with case 1 from "control",
> > 2 with 2 and 3 with 6. Cases 4, 5 and 6 could not be matched,
> > because there is no "partner" in "control".
> > Thus my matched example data frame has 3 cases.
>
> Is it really the case that SPSS would give the output that you describe
> without any warnings about non-uniqueness? How could they live with
> themselves after such arbitrary behavior? This link is evidence that
> SPSS may not behave as you allege.
> <http://kb.iu.edu/data/afit.html>
>
> If you really want to persist in what cannot possibly be called "one-
> to-one exact matching", but instead "arbitrary convenience matching",
> then you need to construct a function that sequentially marches through
> "treat", grabs the first match (perhaps with something like):
>
> > matched.first <- merge(treat[1,],control, by= c("age","school"))[1,]
> > matched.first
>   age school out1 out2
> 1   1     10  9.5  1.1
>
> ... except that the "1"s would be replaced with an index variable,
> then mark that control as "taken", perhaps by using all of the variables
> as identifiers, and then attempt matching/marking for each successive case
> among ("taken" == FALSE) controls.
>
> --
> David Winsemius
>
--------------------------------------------
Udo KÖNIG
Clinic for Child and Adolescent Psychiatry
Philipps University of Marburg / Germany