[R] Merge data frames but prefer values in one
JiHO
jo.lists at gmail.com
Thu Sep 10 17:21:12 CEST 2009
Hello everyone,
My problem is better explained with an example:
> x=data.frame(a=1:4,b=1:4,c=rnorm(4))
> x
a b c
1 1 1 -0.8821089
2 2 2 -0.7082583
3 3 3 -0.5948835
4 4 4 -1.8571443
> y=data.frame(a=c(1,3),b=3,c=rnorm(2))
> y
a b c
1 1 3 -0.273155973
2 3 3 0.009517862
Now I want to merge x and y by columns a and b, hence creating a
data.frame with all a:b combinations observed in x and y. That's
easily done with merge:
> merge(x,y,by=c("a","b"),all=T)
a b c.x c.y
1 1 1 -0.8821089 NA
2 1 3 NA -0.273155973
3 2 2 -0.7082583 NA
4 3 3 -0.5948835 0.009517862
5 4 4 -1.8571443 NA
But rather than two c columns I would want the merge to:
- keep the value in x if there is no corresponding value in y
- keep the value in y if there is no corresponding value in x
- prefer the value in y when the a:b combination exists in both x and y
So basically I want my result to look like:
a b c
1 1 1 -0.8821089
2 1 3 -0.2731559
3 2 2 -0.7082583
4 3 3 0.0095178
5 4 4 -1.8571443
I can't find a combinations of options for merge that does this. Is
there another fonction that would do that or do I have to resort to
some post-processing after merge? It seems that it might be something
like a "right merge" for data bases but I don't know this world at
all. I would be happy to look into sqldf if that allows to do things
like that.
Thanks in advance. Sincerely,
JiHO
---
http://maururu.net
More information about the R-help
mailing list