[R] Need to compare two columns in two data.frames and return all rows from df where rows values are missing
Bert Gunter
bgunter@4567 @end|ng |rom gm@||@com
Tue Jun 15 04:50:03 CEST 2021
merge(..., all = TRUE) essentially does this for you. rows with NA's are
the non-matches:
merge(df1,df2,by = "name",all = TRUE)
name score.x score.y
1 a 0.69280341 NA
2 b 0.04205953 NA
3 b 0.32792072 NA
4 b 0.89982497 NA
5 b 0.95450365 NA
6 b 0.99426978 NA
7 c 0.67757064 0.4659625
8 c 0.67757064 0.4137243
9 c 0.67757064 0.7584595
10 c 0.24608773 0.4659625
11 c 0.24608773 0.4137243
12 c 0.24608773 0.7584595
13 c 0.10292468 0.4659625
14 c 0.10292468 0.4137243
15 c 0.10292468 0.7584595
16 c 0.88953932 0.4659625
17 c 0.88953932 0.4137243
18 c 0.88953932 0.7584595
19 c 0.57263340 0.4659625
20 c 0.57263340 0.4137243
21 c 0.57263340 0.7584595
22 d 0.64050681 0.2316258
23 d 0.64050681 0.4145463
24 d 0.64050681 0.2330341
25 d 0.64050681 0.3688455
26 e NA 0.1428000
27 e NA 0.2164079
28 e NA 0.1524447
29 f NA 0.3181810
30 f NA 0.1388061
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Mon, Jun 14, 2021 at 7:09 PM Jake Elmstedt <jake.elmstedt using gmail.com>
wrote:
> set.seed(123)
>
> df1 <- data.frame(name = sample(letters[1:4], 12, TRUE), score = runif(12))
> head(df1)
> #> name score
> #> 1 c 0.67757064
> #> 2 c 0.57263340
> #> 3 c 0.10292468
> #> 4 b 0.89982497
> #> 5 c 0.24608773
> #> 6 b 0.04205953
> table(df1[["name"]])
> #>
> #> a b c d
> #> 1 5 5 1
>
> df2 <- data.frame(name = sample(letters[3:6], 12, TRUE), score = runif(12))
> head(df2)
> #> name score
> #> 1 c 0.7584595
> #> 2 e 0.2164079
> #> 3 f 0.3181810
> #> 4 d 0.2316258
> #> 5 e 0.1428000
> #> 6 d 0.4145463
> table(df2[["name"]])
> #>
> #> c d e f
> #> 3 4 3 2
>
> df3 <- rbind(df1[!df1[["name"]] %in% df2[["name"]], ],
> df2[!df2[["name"]] %in% df1[["name"]], ])
> head(df3)
> #> name score
> #> 4 b 0.89982497
> #> 6 b 0.04205953
> #> 7 b 0.32792072
> #> 8 b 0.95450365
> #> 10 a 0.69280341
> #> 12 b 0.99426978
> table(df3[["name"]])
> #>
> #> a b e f
> #> 1 5 3 2
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list