[R] insert values based on common ID
Marc Schwartz
marc_schwartz at me.com
Fri May 14 17:24:53 CEST 2010
On May 14, 2010, at 10:15 AM, emorway wrote:
>
> Forum,
>
> with the datasets a and b below, I'm trying to establish a relationship
> based on the common column "week" and insert the value from the column
> weekAvg in b to the column weekAvg in a. The dataset a is several thousand
> lines long. I've tried looking at 'match', writing functions, 'rbind.fill'
> and various search terms in this forum to no avail. Thanks...
>
> a<-read.table(textConnection("row ID date Reading WellID week weekAvg
> 1 Well_80-2 6/12/2000 2:00 -202.034 80-2 0 NA
> 2 Well_80-2 6/12/2000 3:00 -201.018 80-2 0 NA
> 3 Well_80-2 6/12/2000 4:00 -199.494 80-2 0 NA
> 4 Well_80-2 6/12/2000 5:00 -197.716 80-2 0 NA
> 5 Well_80-2 6/12/2000 6:00 -190.858 80-2 0 NA
> 6 Well_80-2 6/12/2000 7:00 -181.460 80-2 0 NA
> 7 Well_80-2 6/19/2000 10:00 -166.728 80-2 1 NA
> 8 Well_80-2 6/19/2000 11:00 -167.490 80-2 1 NA
> 9 Well_80-2 6/19/2000 12:00 -167.490 80-2 1 NA
> 10 Well_80-2 6/19/2000 13:00 -167.490 80-2 1 NA
> 11 Well_80-2 6/19/2000 14:00 -168.506 80-2 1 NA
> 12 Well_80-2 6/19/2000 15:00 -168.506 80-2 1 NA"),header=T)
> closeAllConnections()
>
> b<-read.table(textConnection("week weekAvg
> 0 -147.3726
> 1 -181.3429
> 2 -151.7208
> 3 -188.8653
> 4 -163.7465
> 5 -161.6873
> 6 -158.5168
> 7 -146.6136
> 8 -175.4351
> 9 -100.9450
> 10 -151.3655
> 11 -125.8975
> 12 -162.5993"),header=T)
> closeAllConnections()
See ?merge and ?subset
merge() performs a SQL-like 'join' operation.
> merge(subset(a, select = -weekAvg), b, by = "week", all.x = TRUE)
week row ID date Reading WellID weekAvg
1 0 Well_80-2 6/12/2000 2:00 -202.034 80-2 -147.3726
2 0 Well_80-2 6/12/2000 3:00 -201.018 80-2 -147.3726
3 0 Well_80-2 6/12/2000 4:00 -199.494 80-2 -147.3726
4 0 Well_80-2 6/12/2000 5:00 -197.716 80-2 -147.3726
5 0 Well_80-2 6/12/2000 6:00 -190.858 80-2 -147.3726
6 0 Well_80-2 6/12/2000 7:00 -181.460 80-2 -147.3726
7 1 Well_80-2 6/19/2000 10:00 -166.728 80-2 -181.3429
8 1 Well_80-2 6/19/2000 11:00 -167.490 80-2 -181.3429
9 1 Well_80-2 6/19/2000 12:00 -167.490 80-2 -181.3429
10 1 Well_80-2 6/19/2000 13:00 -167.490 80-2 -181.3429
11 1 Well_80-2 6/19/2000 14:00 -168.506 80-2 -181.3429
12 1 Well_80-2 6/19/2000 15:00 -168.506 80-2 -181.3429
In the above, I am using subset() on 'a' to remove the pre-existing 'weekAvg' column, so that you only end up with one such column post merge. If you don't do this, you will have a 'weekAvg.x' and 'weekAvg.y' in the result.
The two data frames are then merge()d on the common 'week' column.
The 'all.x = TRUE', retains all rows in 'a' that match to the 'week' value in 'b'. Otherwise known as a 'left outer join'.
HTH,
Marc Schwartz
More information about the R-help
mailing list