[R] Conditional operation on multiple columns from two data frames

jim holtman jholtman at gmail.com
Mon Dec 29 03:43:19 CET 2008


Merge will give you an output like below and you should be able to
find the value you need:

> merge(df1, df2, by='ID', all=TRUE)
   ID              Date.x       y.x       x.x              Date.y
 y.y      x.y
1   1                <NA>        NA        NA 2007-05-31 23:00:00
-20.82907 217.7022
2   1                <NA>        NA        NA 2007-06-01 11:00:00
-20.82500 217.6980
3   1                <NA>        NA        NA 2007-06-01 23:00:00
-20.82236 217.7012
4   2 2007-05-30 20:21:56        NA        NA 2007-02-19 00:00:00
-37.40960 178.3060
5   2 2007-05-30 20:21:56        NA        NA 2007-02-19 12:00:00
-37.37515 178.3360
6   2 2007-05-30 20:21:56        NA        NA 2007-02-20 00:00:00
-37.27700 178.3780
7   2 2007-05-30 20:21:56        NA        NA 2007-02-20 12:00:00
-37.17855 178.3845
8   2 2007-05-30 20:21:56        NA        NA 2007-02-21 00:00:00
-37.07595 178.3890
9   2 2007-05-30 20:21:56        NA        NA 2007-02-21 12:00:00
-36.91965 178.3965
10  2 2007-05-30 20:21:56        NA        NA 2007-02-22 00:00:00
-36.74345 178.3910
11  2 2007-05-30 20:21:56        NA        NA 2007-02-22 12:00:00
-36.49245 178.5090
12  2 2007-05-30 20:21:56        NA        NA 2007-02-23 00:00:00
-36.24150 178.5750
13  2 2007-05-30 20:21:56        NA        NA 2007-02-23 12:00:00
-35.95500 178.6280
14  2 2007-05-30 18:48:20        NA        NA 2007-02-19 00:00:00
-37.40960 178.3060
15  2 2007-05-30 18:48:20        NA        NA 2007-02-19 12:00:00
-37.37515 178.3360
16  2 2007-05-30 18:48:20        NA        NA 2007-02-20 00:00:00
-37.27700 178.3780
17  2 2007-05-30 18:48:20        NA        NA 2007-02-20 12:00:00
-37.17855 178.3845
18  2 2007-05-30 18:48:20        NA        NA 2007-02-21 00:00:00
-37.07595 178.3890
19  2 2007-05-30 18:48:20        NA        NA 2007-02-21 12:00:00
-36.91965 178.3965
20  2 2007-05-30 18:48:20        NA        NA 2007-02-22 00:00:00
-36.74345 178.3910
21  2 2007-05-30 18:48:20        NA        NA 2007-02-22 12:00:00
-36.49245 178.5090
22  2 2007-05-30 18:48:20        NA        NA 2007-02-23 00:00:00
-36.24150 178.5750
23  2 2007-05-30 18:48:20        NA        NA 2007-02-23 12:00:00
-35.95500 178.6280
24  2 2007-06-05 08:34:37        NA        NA 2007-02-19 00:00:00
-37.40960 178.3060
25  2 2007-06-05 08:34:37        NA        NA 2007-02-19 12:00:00
-37.37515 178.3360
26  2 2007-06-05 08:34:37        NA        NA 2007-02-20 00:00:00
-37.27700 178.3780
27  2 2007-06-05 08:34:37        NA        NA 2007-02-20 12:00:00
-37.17855 178.3845
28  2 2007-06-05 08:34:37        NA        NA 2007-02-21 00:00:00
-37.07595 178.3890
29  2 2007-06-05 08:34:37        NA        NA 2007-02-21 12:00:00
-36.91965 178.3965
30  2 2007-06-05 08:34:37        NA        NA 2007-02-22 00:00:00
-36.74345 178.3910
31  2 2007-06-05 08:34:37        NA        NA 2007-02-22 12:00:00
-36.49245 178.5090
32  2 2007-06-05 08:34:37        NA        NA 2007-02-23 00:00:00
-36.24150 178.5750
33  2 2007-06-05 08:34:37        NA        NA 2007-02-23 12:00:00
-35.95500 178.6280
34  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-19 00:00:00
-37.40960 178.3060
35  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-19 12:00:00
-37.37515 178.3360
36  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-20 00:00:00
-37.27700 178.3780
37  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-20 12:00:00
-37.17855 178.3845
38  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-21 00:00:00
-37.07595 178.3890
39  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-21 12:00:00
-36.91965 178.3965
40  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-22 00:00:00
-36.74345 178.3910
41  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-22 12:00:00
-36.49245 178.5090
42  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-23 00:00:00
-36.24150 178.5750
43  2 2007-06-02 09:39:58 -20.10100 -134.4210 2007-02-23 12:00:00
-35.95500 178.6280
44  2 2007-06-02 17:36:20        NA        NA 2007-02-19 00:00:00
-37.40960 178.3060
45  2 2007-06-02 17:36:20        NA        NA 2007-02-19 12:00:00
-37.37515 178.3360
46  2 2007-06-02 17:36:20        NA        NA 2007-02-20 00:00:00
-37.27700 178.3780
47  2 2007-06-02 17:36:20        NA        NA 2007-02-20 12:00:00
-37.17855 178.3845
48  2 2007-06-02 17:36:20        NA        NA 2007-02-21 00:00:00
-37.07595 178.3890
49  2 2007-06-02 17:36:20        NA        NA 2007-02-21 12:00:00
-36.91965 178.3965
50  2 2007-06-02 17:36:20        NA        NA 2007-02-22 00:00:00
-36.74345 178.3910
51  2 2007-06-02 17:36:20        NA        NA 2007-02-22 12:00:00
-36.49245 178.5090
52  2 2007-06-02 17:36:20        NA        NA 2007-02-23 00:00:00
-36.24150 178.5750
53  2 2007-06-02 17:36:20        NA        NA 2007-02-23 12:00:00
-35.95500 178.6280
54  2 2007-06-03 20:31:35        NA        NA 2007-02-19 00:00:00
-37.40960 178.3060
55  2 2007-06-03 20:31:35        NA        NA 2007-02-19 12:00:00
-37.37515 178.3360
56  2 2007-06-03 20:31:35        NA        NA 2007-02-20 00:00:00
-37.27700 178.3780
57  2 2007-06-03 20:31:35        NA        NA 2007-02-20 12:00:00
-37.17855 178.3845
58  2 2007-06-03 20:31:35        NA        NA 2007-02-21 00:00:00
-37.07595 178.3890
59  2 2007-06-03 20:31:35        NA        NA 2007-02-21 12:00:00
-36.91965 178.3965
60  2 2007-06-03 20:31:35        NA        NA 2007-02-22 00:00:00
-36.74345 178.3910
61  2 2007-06-03 20:31:35        NA        NA 2007-02-22 12:00:00
-36.49245 178.5090
62  2 2007-06-03 20:31:35        NA        NA 2007-02-23 00:00:00
-36.24150 178.5750
63  2 2007-06-03 20:31:35        NA        NA 2007-02-23 12:00:00
-35.95500 178.6280
64  3 2007-03-23 06:33:26        NA        NA                <NA>
  NA       NA
65  3 2007-03-23 17:02:21        NA        NA                <NA>
  NA       NA
66  3 2007-02-19 00:00:00 -37.40822  178.3440                <NA>
  NA       NA
67  3 2007-03-10 19:58:18        NA        NA                <NA>
  NA       NA
68  3 2007-03-25 06:41:03 -31.18100  178.7700                <NA>
  NA       NA
69  3 2007-03-24 04:27:44 -31.16400  178.7650                <NA>
  NA       NA
70  3 2007-03-24 11:35:47        NA        NA                <NA>
  NA       NA


On Sat, Dec 27, 2008 at 11:55 PM, Tim Sippel <tsippel at gmail.com> wrote:
> Sorry...I was unaware I needed to 'deput' to deparse data examples into a
> useful format.  The data examples I provided initially are deparsed below:
>
> df1<-
> list(ID = c(2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3), Date = c("2007-05-30
> 18:48:20",
> "2007-05-30 20:21:56", "2007-06-02 09:39:58", "2007-06-02 17:36:20",
> "2007-06-03 20:31:35", "2007-06-05 08:34:37", "2007-02-19 00:00:00",
> "2007-03-10 19:58:18", "2007-03-23 06:33:26", "2007-03-23 17:02:21",
> "2007-03-24 04:27:44", "2007-03-24 11:35:47", "2007-03-25 06:41:03"
> ), y = c(NA, NA, -20.101, NA, NA, NA, -37.4082166666667, NA,
> NA, NA, -31.164, NA, -31.181), x = c(NA, NA, -134.421, NA, NA,
> NA, 178.34395, NA, NA, NA, 178.765, NA, 178.77))
>
> df2<-
> list(ID = c(1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Date = c("2007-05-31
> 23:00:00",
> "2007-06-01 11:00:00", "2007-06-01 23:00:00", "2007-02-19 00:00:00",
> "2007-02-19 12:00:00", "2007-02-20 00:00:00", "2007-02-20 12:00:00",
> "2007-02-21 00:00:00", "2007-02-21 12:00:00", "2007-02-22 00:00:00",
> "2007-02-22 12:00:00", "2007-02-23 00:00:00", "2007-02-23 12:00:00"
> ), y = c(-20.829075, -20.824995, -20.822355, -37.4096, -37.37515,
> -37.277, -37.17855, -37.07595, -36.91965, -36.74345, -36.49245,
> -36.2415, -35.955), x = c(217.7022, 217.698, 217.70115, 178.306,
> 178.336, 178.378, 178.3845, 178.389, 178.3965, 178.391, 178.509,
> 178.575, 178.628))
>
> On Sun, Dec 28, 2008 at 4:36 PM, Tim Sippel <tsippel at gmail.com> wrote:
>
>> Hi-
>> I have two data frames for which I wish to conditionally subtract the
>> values of one dataframe from the other.
>>
>>  I want to subtract df1$x from df2$x when their id is equal and the
>> absolute value of difference between the dates is 12 hours or less.  If
>> there is no match of equal id's and dates less than 12 hours apart I want it
>> to return "NA".  Note that df1 has missing values in x and y columns, but
>> df2 does not have any missing values.  I think I'm close with the code at
>> the end, but would appreciate a bit of help getting over this hurdle.
>>
>> Sample dataframes follow:
>>
>> *df1<-*
>>      ID Date y x STM07.1 30-05-2007 18:48  STM07.1 30-05-2007 20:21
>> STM07.1 02-06-2007 09:39 -20.101 -134.421 STM07.1 02-06-2007 17:36
>> STM07.1 03-06-2007 20:31  STM07.1 05-06-2007 08:34  STM07.2 19-02-2007
>> 00:00 -37.4082 178.34395 STM07.2 10-03-2007 19:58  STM07.2 23-03-2007
>> 06:33  STM07.2 23-03-2007 17:02  STM07.2 24-03-2007 04:27 -31.164 178.765
>> STM07.2 24-03-2007 11:35  STM07.2 25-03-2007 06:41 -31.181 178.77 STM07.2 26-03-2007
>> 05:11
>> **
>> *df2<-*
>>      ID Date y x STM07.1 31/05/2007 23:00 -20.8291 217.7022 STM07.1 1/06/2007
>> 11:00 -20.825 217.698 STM07.1 1/06/2007 23:00 -20.8224 217.70115 STM07.2 19/02/2007
>> 0:00 -37.4096 178.306 STM07.2 19/02/2007 12:00 -37.3752 178.336 STM07.2 20/02/2007
>> 0:00 -37.277 178.378 STM07.2 20/02/2007 12:00 -37.1786 178.3845 STM07.2 21/02/2007
>> 0:00 -37.076 178.389 STM07.2 21/02/2007 12:00 -36.9197 178.3965 STM07.2 22/02/2007
>> 0:00 -36.7435 178.391 STM07.2 22/02/2007 12:00 -36.4925 178.509 STM07.2 23/02/2007
>> 0:00 -36.2415 178.575 STM07.2 23/02/2007 12:00 -35.955 178.628
>>
>> d.x <- ifelse(c(test = df1$ID==df2$ID && abs(difftime(as.POSIXlt(df1$Date),
>> as.POSIXlt(df2$Date),
>>   units="hours") < 12)), yes = delta.lon=df1$x-df2$x, no = "NA")
>>
>> I'm using R 2.7.1.
>>
>> Cheers,
>>
>> Tim
>>
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



More information about the R-help mailing list