[R] Find "undirected" duplicates in a tibble

Eric Berger
Fri Aug 20 17:54:07 CEST 2021


Nice, Rui.
Here's a version in base R with no apply().

unique(data.frame(V1=pmin(x$Source,x$Target), V2=pmax(x$Source,x$Target)))
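For anyone who wants to check it, here is a minimal sketch run against the example data Greg posted (quoted below). The name `und` is only for illustration, and I have written Greg's Target column as rep(1:4, each = 3), which gives the same values.

```r
# Example data from Greg's message: 12 directed (Source, Target) pairs.
x <- data.frame(Source = rep(1:3, 4),
                Target = rep(1:4, each = 3))

# pmin()/pmax() are vectorised, so every row is put into (low, high) order
# without a per-row function call; unique() then drops the repeated rows.
und <- unique(data.frame(V1 = pmin(x$Source, x$Target),
                         V2 = pmax(x$Source, x$Target)))

nrow(x)    # 12 directed pairs
nrow(und)  # 9 undirected pairs
```

Because each pair is stored smaller value first, (2,1) and (1,2) become the same row, which is what lets unique() treat them as duplicates.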

On Fri, Aug 20, 2021 at 6:43 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:

> Hello,
>
> This seems elegant to me, but it's also the slowest, courtesy of sort().
>
> apply(x, 1, sort) |> t() |> unique()
>
>
> (My tests show that for small inputs Greg's base apply() is fastest; once
> nrow(x) > 700, Eric's dplyr version is fastest.)
>
>
> Hope this helps,
>
> Rui Barradas
>
> At 15:13 on 20/08/21, Greg Minshall wrote:
> > Eric,
> >
> >> x %>% transmute( a=pmin(Source,Target), b=pmax(Source,Target)) %>%
> >>    unique() %>% rename(Source=a, Target=b)
> >
> > ah, very nice.  i have trouble remembering, e.g., unique().
> >
> > fwiw, (hopefully) here's a baser version.
> > ----
> >    x = data.frame(Source=rep(1:3,4),
> >                   Target=c(rep(1,3),rep(2,3),rep(3,3),rep(4,3)))
> >
> >    y <- apply(x, 1, function(y) return (c(A=min(y), B=max(y))))
> >    unique(t(y))
> > ----
> >
> > cheers, Greg
