[R] fuzzy merge
ravi
rv15i at yahoo.se
Wed Apr 9 10:53:00 CEST 2008
Hi,
I would like to merge two data frames. It is just that I want the merging to be done with some kind of a fuzzy criterion. Let me explain.
My first data frame looks like this :
ID1 time1 dt
1 2008-01-02 13:11 10
2 2008-01-02 14:20 20
3 2008-01-02 15:42 30
4 2008-01-02 16:45 40
5 2008-01-02 17:42 50
6 2008-01-02 20:40 60
My second data frame :
ID2 time2 d1
101 2008-01-02 14:29 75
102 2008-01-02 17:55 105
103 2008-02-07 20:01 8
I want the merging to be done such that time2 is in the range between time1 and (time1+15 min).
That is, my merged data frame should be :
ID1 time1 time2
2 2008-01-02 14:20 2008-01-02 14:29
5 2008-01-02 17:42 2008-01-02 17:55
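To make the rule above concrete, here is a minimal sketch in base R. It assumes the two tables are held in data frames called d1 and d2, that the times are parsed as POSIXct, and that the time zone is UTC (the time zone is an assumption, not something given in the data). It pairs every row of d1 with every row of d2 and keeps only the pairs where time2 lies between time1 and time1 + 15 min. The names all_pairs and lag_min are illustrative only.

```r
# Example data copied from the two tables above; tz = "UTC" is an assumption.
d1 <- data.frame(
  ID1   = 1:6,
  time1 = as.POSIXct(c("2008-01-02 13:11", "2008-01-02 14:20", "2008-01-02 15:42",
                       "2008-01-02 16:45", "2008-01-02 17:42", "2008-01-02 20:40"),
                     format = "%Y-%m-%d %H:%M", tz = "UTC"),
  dt    = c(10, 20, 30, 40, 50, 60)
)
d2 <- data.frame(
  ID2   = 101:103,
  time2 = as.POSIXct(c("2008-01-02 14:29", "2008-01-02 17:55", "2008-02-07 20:01"),
                     format = "%Y-%m-%d %H:%M", tz = "UTC"),
  d1    = c(75, 105, 8)
)

# by = NULL makes merge() return the Cartesian product of the two data frames.
all_pairs <- merge(d1, d2, by = NULL)

# Minutes from time1 to time2 for every pair (negative if time2 is earlier).
lag_min <- as.numeric(difftime(all_pairs$time2, all_pairs$time1, units = "mins"))

# Keep the pairs where time2 falls in [time1, time1 + 15 min].
d3 <- all_pairs[lag_min >= 0 & lag_min <= 15, c("ID1", "time1", "time2")]
print(d3)
```

The Cartesian product has nrow(d1) * nrow(d2) rows, so with thousands of records on each side this means millions of intermediate rows. That may be acceptable, but it is the main cost of this approach.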
My data frames have thousands of records. If the two data frames are d1 and d2,
d3 <- merge(d1, d2, by.x = "time1", by.y = "time2")
will work only for exact matches. One possible option is to match on the date and hour only (by dropping the minutes).
But this is only a partial solution, since I am not interested in rows where the time difference is more than 15 minutes.
How can I make the merge work with fuzzy matching?
Would it be easier to convert the times into date classes? Or is it better to treat them as strings and use regular expressions to do the matching?
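As a sketch of what the date-class route might look like: once the times are POSIXct, they are plain numbers of seconds, so the 15-minute window becomes simple arithmetic, which string or regular-expression matching cannot express directly. The example below uses a few of the times from the tables above and base R's findInterval() to locate, for each time2, the most recent time1 at or before it. It assumes time1 is sorted in ascending order and that the time zone is UTC. It also returns at most one match per time2, which differs from a full interval join if several time1 values fall within the same 15 minutes.

```r
# A few values from the tables above, parsed as POSIXct; tz = "UTC" is an assumption.
time1 <- as.POSIXct(c("2008-01-02 13:11", "2008-01-02 14:20", "2008-01-02 17:42"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC")
time2 <- as.POSIXct(c("2008-01-02 14:29", "2008-01-02 17:55", "2008-02-07 20:01"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC")

# For each time2, index of the last time1 that is <= time2 (0 if there is none).
# findInterval() requires its second argument to be sorted in ascending order.
idx <- findInterval(as.numeric(time2), as.numeric(time1))

# Seconds elapsed since that time1; NA where no earlier time1 exists.
lag_sec <- as.numeric(time2) - as.numeric(time1)[replace(idx, idx == 0, NA)]

# Keep only the matches that fall within the 15-minute window.
keep <- !is.na(lag_sec) & lag_sec <= 15 * 60
matched <- data.frame(time1 = time1[idx[keep]], time2 = time2[keep])
print(matched)
```

Because findInterval() works on sorted vectors, this avoids building every pair of rows, which may matter with thousands of records.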
I would appreciate any help that I can get.
Thanking You,
Ravi