[R] merge(join) problem

Ista Zahn izahn at psych.rochester.edu
Wed Aug 17 00:31:00 CEST 2011


On Tue, Aug 16, 2011 at 6:29 PM, Ista Zahn <izahn at psych.rochester.edu> wrote:
> Hi Sam,
>
> On Tue, Aug 16, 2011 at 6:00 PM, Sam Steingold <sds at gnu.org> wrote:
>> I have two datasets:
>> A with columns Open and Name (and many others, irrelevant to the merge)
>> B with columns Time and Name (and many others, irrelevant to the merge)
>>
>> I want the dataset AB with all these columns
>> Open from A - a difftime (time of day)
>> Time from B - a difftime (time of day)
>> Name (same in A & B) - a factor, does NOT index rows, i.e., there are
>> _many_ rows in both A & B with the same Name.
>> all the other columns from A & B.
>>
>> Each row in AB must come from exactly one row in A.
>> (i.e., dim(AB)[1] == dim(A)[1]).
>>
>> For each row in AB, Open >= Time, with Open - Time "as small as possible".
>>
>> The above conditions uniquely define AB.
>>
>> The "obvious algorithm" is: for each row in A search B for a row
>> with the same Name and the largest Time <= Open.
>>
>> However, I don't see an easy way to do it in R.
>> The obvious intermediary step is
>>
>> AB1 <- merge(A, B, all.x = TRUE, all.y = FALSE, by = 'Name')
>>
>> Now, AB1 has many rows with the same Name and Open.
>> I need to drop all of them except for the one with the largest Time <= Open.
>> I can do
>>
>> AB2 <- AB1[which(AB1$Time <= AB1$Open),]
>>
>> Now I need to keep just _one_ row with the same Name & Open - and the
>> largest Time.
>
> Untested (your example was not reproducible) but how about
>
> AB3 <- AB2[order(AB$Time, decreasing=TRUE)
> AB4 <- AB3[!duplicated(AB3[c("Name", "Open")]), ]

oops, I mean
AB3 <- AB2[order(AB2$Time, decreasing=TRUE), ]
AB4 <- AB3[!duplicated(AB3[c("Name", "Open")]), ]
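Since the original example was not reproducible, here is a minimal sketch of the whole sequence on made-up data. The values in A and B are hypothetical and chosen only for illustration; the real data sets have many more columns. The sort assumes the object being ordered is AB2, since no object named AB exists at that point.

```r
# Toy data (hypothetical values). Open and Time are difftimes, as in the question.
A <- data.frame(
  Name = factor(c("x", "x", "y", "z")),
  Open = as.difftime(c(10, 20, 15, 5), units = "mins")
)
B <- data.frame(
  Name = factor(c("x", "x", "x", "y", "y")),
  Time = as.difftime(c(5, 12, 25, 1, 30), units = "mins")
)

# Step 1: pair every row of A with every row of B that has the same Name.
AB1 <- merge(A, B, all.x = TRUE, all.y = FALSE, by = "Name")

# Step 2: keep only pairs with Time <= Open (which() also drops NA comparisons).
AB2 <- AB1[which(AB1$Time <= AB1$Open), ]

# Step 3: sort so that the largest Time comes first.
AB3 <- AB2[order(AB2$Time, decreasing = TRUE), ]

# Step 4: keep the first (largest-Time) row for each Name/Open combination.
AB4 <- AB3[!duplicated(AB3[c("Name", "Open")]), ]

print(AB4)
```

Two caveats with this sketch. A row of A with no qualifying row in B (Name "z" here) is dropped in step 2, so nrow(AB4) can be smaller than nrow(A). Also, if A contains several rows with the same Name and Open, step 4 collapses them into one; adding a row identifier column to A before the merge and deduplicating on that instead would be one way around it.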


>
> ?
>
> Best,
> Ista
>>
>> How do I do that?
>>
>> unique() seems to have the right name, but I don't see how it can help me...
>>
>> tia.
>>
>> --
>> Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031
>> http://jihadwatch.org http://honestreporting.com
>> http://ffii.org http://camera.org http://thereligionofpeace.com
>> UNIX is a way of thinking.  Windows is a way of not thinking.
>>
>>
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>





