[R] combining unequal dataframes based on a common grouping factor
Chel Hee Lee
chl948 at mail.usask.ca
Thu Dec 4 03:17:48 CET 2014
> frame1
  ID GROUP PROP_AREA
1  1     A      0.33
2  2     A      0.33
3  3     A      0.33
4  4     B      0.50
5  5     B      0.50
6  6     C      1.00
7  7     D      1.00
> frame2
  GROUP VALUE1 VALUE2
1     A     10      5
2     B     20     10
3     C     30     15
4     D     40     20
>
> obj1 <- merge(x=frame1, y=frame2, by="GROUP")
> obj1$rval1 <- obj1$PROP_AREA * obj1$VALUE1
> obj1$rval2 <- obj1$PROP_AREA * obj1$VALUE2
> obj1
  GROUP ID PROP_AREA VALUE1 VALUE2 rval1 rval2
1     A  1      0.33     10      5   3.3  1.65
2     A  2      0.33     10      5   3.3  1.65
3     A  3      0.33     10      5   3.3  1.65
4     B  4      0.50     20     10  10.0  5.00
5     B  5      0.50     20     10  10.0  5.00
6     C  6      1.00     30     15  30.0 15.00
7     D  7      1.00     40     20  40.0 20.00
>
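With thousands of value columns in frame2 it is not practical to write one line per column. A minimal sketch of the same merge() idea applied to every value column at once follows. It assumes that every column of frame2 other than GROUP is numeric; the names val_cols and frame3 are only chosen for this illustration. Note that merge() with its default arguments drops rows of frame1 whose GROUP does not appear in frame2.

```r
# Toy data from the example above
frame1 <- data.frame(ID = 1:7,
                     GROUP = c("A", "A", "A", "B", "B", "C", "D"),
                     PROP_AREA = c(0.33, 0.33, 0.33, 0.50, 0.50, 1.00, 1.00))
frame2 <- data.frame(GROUP = c("A", "B", "C", "D"),
                     VALUE1 = c(10, 20, 30, 40),
                     VALUE2 = c(5, 10, 15, 20))

# Assumption: every column of frame2 except GROUP holds numeric values
val_cols <- setdiff(names(frame2), "GROUP")

obj <- merge(x = frame1, y = frame2, by = "GROUP")

# Multiply every value column by PROP_AREA; the vector has one entry
# per row and is recycled down each column
obj[val_cols] <- obj[val_cols] * obj$PROP_AREA

# Keep ID plus the scaled values, sorted by ID
frame3 <- obj[order(obj$ID), c("ID", val_cols)]
rownames(frame3) <- NULL
frame3
```

The result has one row per ID and one column per value column, which is the shape asked for in the original message.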
> idx <- match(x=frame1$GROUP, table=frame2$GROUP)
> rval1 <- frame1$PROP_AREA * frame2[idx, "VALUE1"]
> rval2 <- frame1$PROP_AREA * frame2[idx, "VALUE2"]
> data.frame(ID=frame1$ID, VALUE1=rval1, VALUE2=rval2)
  ID VALUE1 VALUE2
1  1    3.3   1.65
2  2    3.3   1.65
3  3    3.3   1.65
4  4   10.0   5.00
5  5   10.0   5.00
6  6   30.0  15.00
7  7   40.0  20.00
>
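The match() approach also extends to all value columns without a merge. The following is only a sketch under the assumption that every column of frame2 other than GROUP is numeric; val_cols, vals, scaled and frame3 are names introduced here for illustration. Converting the looked-up rows to a matrix lets a single multiplication scale every column.

```r
# Toy data from the example above
frame1 <- data.frame(ID = 1:7,
                     GROUP = c("A", "A", "A", "B", "B", "C", "D"),
                     PROP_AREA = c(0.33, 0.33, 0.33, 0.50, 0.50, 1.00, 1.00))
frame2 <- data.frame(GROUP = c("A", "B", "C", "D"),
                     VALUE1 = c(10, 20, 30, 40),
                     VALUE2 = c(5, 10, 15, 20))

# For each row of frame1, the row of frame2 with the same GROUP
idx <- match(x = frame1$GROUP, table = frame2$GROUP)
stopifnot(!anyNA(idx))  # every GROUP in frame1 must exist in frame2

# Assumption: every column of frame2 except GROUP holds numeric values
val_cols <- setdiff(names(frame2), "GROUP")
vals <- as.matrix(frame2[idx, val_cols, drop = FALSE])
rownames(vals) <- NULL

# PROP_AREA has one entry per row, so it is recycled down each column:
# row i of vals is multiplied by PROP_AREA[i]
scaled <- vals * frame1$PROP_AREA

frame3 <- data.frame(ID = frame1$ID, scaled)
frame3
```

Because the row order of frame1 is preserved, no re-sorting by ID is needed afterwards.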
Is this what you are looking for? I hope this helps.
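On the loop in the original message: it compares frame1$GROUP[i] with frame2$GROUP[i], that is, the same row number in both frames, although frame2 has only one row per GROUP. It also reads frame2[i, j+1], which runs past the last column once j reaches ncol(frame2). A sketch of a working loop, for comparison only, could look like the following. It assumes GROUP is the first column of frame2, pre-allocates frame3 (which the quoted snippet does not show), and uses k as a name introduced here for the matching row.

```r
# Toy data from the example above
frame1 <- data.frame(ID = 1:7,
                     GROUP = c("A", "A", "A", "B", "B", "C", "D"),
                     PROP_AREA = c(0.33, 0.33, 0.33, 0.50, 0.50, 1.00, 1.00))
frame2 <- data.frame(GROUP = c("A", "B", "C", "D"),
                     VALUE1 = c(10, 20, 30, 40),
                     VALUE2 = c(5, 10, 15, 20))

# Pre-allocate the result: ID plus one NA column per value column.
# Assumption: GROUP is column 1 of frame2, so value columns are 2..ncol
frame3 <- data.frame(ID = frame1$ID)
frame3[names(frame2)[-1]] <- NA_real_

for (i in seq_len(nrow(frame1))) {
  # Row of frame2 holding the GROUP of row i of frame1
  k <- match(frame1$GROUP[i], frame2$GROUP)
  for (j in 2:ncol(frame2)) {
    # Column j means the same value column in frame2 and frame3
    frame3[i, j] <- frame1$PROP_AREA[i] * frame2[k, j]
  }
}
frame3
```

On data of the size described, element-wise assignment into a data frame like this is likely to be much slower than the vectorised merge() or match() versions.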
Chel Hee Lee
On 12/03/2014 03:14 PM, Brock Huntsman wrote:
> I apologize if this is a relatively easy problem, but I have been stuck on
> this issue for a few days. I am attempting to combine values from 2
> separate dataframes. Each dataframe contains a shared identifier (GROUP).
> Dataframe 1 (3272 rows x 3 columns) further divides this shared grouping
> factor into unique identifiers (ID), and also contains the proportion of
> the GROUP area that each unique identifier makes up (PROP_AREA).
> Dataframe 2 (291 x 14976), in addition to the shared identifier, has
> numerous columns of values (VALUE1, VALUE2). I would like to multiply
> the PROP_AREA in dataframe 1 by each value in dataframe 2 (VALUE1 through
> VALUE14976) based on the GROUP factor, constructing a final dataframe of
> size 3272 x 14976. An example of the data frames is as follows:
>
>
> frame1:
>
> ID  GROUP  PROP_AREA
> 1   A      0.33
> 2   A      0.33
> 3   A      0.33
> 4   B      0.50
> 5   B      0.50
> 6   C      1.00
> 7   D      1.00
>
> frame2:
>
> GROUP  VALUE1  VALUE2
> A      10      5
> B      20      10
> C      30      15
> D      40      20
>
> Desired dataframe
>
> frame3:
>
> ID  VALUE1  VALUE2
> 1   3.3     1.65
> 2   3.3     1.65
> 3   3.3     1.65
> 4   10      5
> 5   10      5
> 6   30      15
> 7   40      20
>
> I assume I would need to use the %in% function or if statements, but am
> unsure how to write the code. I have attempted to construct a for loop with
> an if statement, but have not been successful as of yet.
>
>
> for(i in 1:nrow(frame1)) {
>   for(j in 2:ncol(frame2)) {
>     if (frame1$GROUP[i] == frame2$GROUP[i]) {
>       frame3[i,j+1] <- frame1$PROP_AREA[i]*frame2[i,j+1]
>     }
>   }
> }
>
>
> Any advice on suggested code or packages to read up on would be much
> appreciated.
>
> Brock