[R] Matched pairs with two data frames

Udo ukoenig at med.uni-marburg.de
Wed Apr 16 11:58:46 CEST 2008


Patrick,
my intention was to perform a one-to-one exact match that pairs each treated
unit with ONE control unit (without replacement), using my two confounders
(age, school) for matching.


Patrick Connolly wrote:
On Mon, 14-Apr-2008 at 08:37AM +0200, Udo wrote:

|> Quoting Peter Alspach <PAlspach at hortresearch.co.nz>:
|>
|> > Udo
|> >
|> > Seems you might want merge()
|> >
|> > HTH .......
|> >
|> > Peter Alspach
|>
|> Thank you Peter and Jorge,
|>
|> but as I had written in my last sentence,
|> "Merge doesn't do the job, because it makes
|> all possible matches", but maybe there is a sophisticated
|> solution with "merge", I could not bring light to.

>Maybe it would help if we knew what you mean by 'all' in this context.
>To get the NAs in your example, it is NECESSARY to use the all = TRUE
>argument.  Without the all = TRUE, the NA rows are omitted.

By 'all' I mean that the merged data frame (13 obs.) contains 8 rows
(2*4) with age=1 and school=10, i.e. all possible combinations.
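
The behaviour described here can be reproduced with a small sketch. The two
data frames df1 and df2 are not taken from the original post; they are
reconstructed from the 13-row merged table in this message, so their names
and row order are assumptions.

```r
# Hypothetical inputs, reconstructed from the merged table in this message.
df1 <- data.frame(age    = c(1, 1, 2, 2, 2, 4),
                  school = c(10, 10, 20, 20, 20, 11),
                  out1   = c(9.5, 2.3, 3.3, 4.1, 5.9, 4.6))
df2 <- data.frame(age    = c(1, 1, 1, 1, 2, 3),
                  school = c(10, 10, 10, 10, 20, 33),
                  out2   = c(1.1, 2.0, 3.5, 4.9, 6.5, 5.2))

# merge() pairs every df1 row with every df2 row that shares the same key,
# so the stratum age=1, school=10 yields 2 * 4 = 8 rows.
m <- merge(df1, df2, by = c("age", "school"), all = TRUE)

nrow(m)                            # 13
sum(m$age == 1 & m$school == 10)   # 8
```

With all = TRUE the unmatched strata (age=3 and age=4) are kept and filled
with NA; without it they are dropped, but the 8 cross-combinations remain.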

>What is it that you don't want in this:
I only "need" lines 1, 6 and 9. To show this,
I added the column "needed" by hand.

   age school out1 out2 needed
1    1     10  9.5  1.1 yes
2    1     10  9.5  2.0 no
3    1     10  9.5  3.5 no
4    1     10  9.5  4.9 no
5    1     10  2.3  1.1 no
6    1     10  2.3  2.0 yes
7    1     10  2.3  3.5 no
8    1     10  2.3  4.9 no
9    2     20  3.3  6.5 yes
10   2     20  4.1  6.5 no
11   2     20  5.9  6.5 no
12   3     33   NA  5.2 no
13   4     11  4.6   NA no

>Whatever it is, can't you subset them out?
Yes, and that is exactly the problem. To describe what I mean, I added the
variable "needed" by hand. I don't know how to compute such a variable so that
I can subset on it.


My final data frame should look like this:
   age school out1 out2 needed
1    1     10  9.5  1.1 yes
6    1     10  2.3  2.0 yes
9    2     20  3.3  6.5 yes

I hope I have made clear what the problem is and what I mean.
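
One possible way to get such a one-to-one pairing with merge() is sketched
below. It numbers the rows within each (age, school) stratum in both data
frames and then merges on that counter as well. The data frames df1 and df2
are reconstructed from the merged table in this message, and the helper column
"pair" is a hypothetical name. Pairing the k-th treated unit with the k-th
control unit in a stratum is an assumption about which pairs are wanted; it
happens to select lines 1, 6 and 9.

```r
# Hypothetical inputs, reconstructed from the merged table in this message.
df1 <- data.frame(age    = c(1, 1, 2, 2, 2, 4),
                  school = c(10, 10, 20, 20, 20, 11),
                  out1   = c(9.5, 2.3, 3.3, 4.1, 5.9, 4.6))
df2 <- data.frame(age    = c(1, 1, 1, 1, 2, 3),
                  school = c(10, 10, 10, 10, 20, 33),
                  out2   = c(1.1, 2.0, 3.5, 4.9, 6.5, 5.2))

# Running counter within each (age, school) stratum: 1, 2, ... per group.
df1$pair <- ave(seq_len(nrow(df1)), df1$age, df1$school, FUN = seq_along)
df2$pair <- ave(seq_len(nrow(df2)), df2$age, df2$school, FUN = seq_along)

# Merging on the counter as well means each row can be used at most once,
# i.e. matching without replacement. Surplus rows in either frame are dropped.
matched <- merge(df1, df2, by = c("age", "school", "pair"))
matched
```

Since the pairing depends on row order, shuffling the rows within each data
frame beforehand would give a random pairing instead of a positional one.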

An alternative would be using packages like “Matching” or “MatchIt”, which need
a “long” data structure with one data frame and not a “wide” one with two data
frames.
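
A minimal sketch of how the two "wide" data frames could be stacked into one
"long" data frame with a group indicator follows. The data are reconstructed
from the merged table in this message. The column names "treat" and "out",
and the choice of df1 as the treated group, are assumptions made only for
illustration.

```r
# Hypothetical inputs, reconstructed from the merged table in this message.
df1 <- data.frame(age    = c(1, 1, 2, 2, 2, 4),
                  school = c(10, 10, 20, 20, 20, 11),
                  out1   = c(9.5, 2.3, 3.3, 4.1, 5.9, 4.6))
df2 <- data.frame(age    = c(1, 1, 1, 1, 2, 3),
                  school = c(10, 10, 10, 10, 20, 33),
                  out2   = c(1.1, 2.0, 3.5, 4.9, 6.5, 5.2))

# One row per unit; treat = 1 marks df1 rows, treat = 0 marks df2 rows
# (which frame holds the treated units is an assumption).
long <- rbind(
  data.frame(treat = 1L, age = df1$age, school = df1$school, out = df1$out1),
  data.frame(treat = 0L, age = df2$age, school = df2$school, out = df2$out2)
)
long
```

A data frame of this shape, with one indicator column and the confounders as
ordinary columns, is the kind of input the matching packages expect.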


Many thanks!
Udo


