[R] How to form groups for this specific problem?

Adams, Jean jvadams at usgs.gov
Mon Mar 28 18:09:35 CEST 2016


Satish,

If you rearrange your data into a network of nodes and edges, you can use
the igraph package to identify its connected components, which are the
disconnected (mutually exclusive) groups you are after.

# example data
df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3")
)

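# characterize data as a network of nodes and edges
# (as.character() so this works whether or not the columns are factors)
nodes <- unique(unlist(lapply(df, as.character)))
edges <- sapply(df, match, nodes)

# use the igraph package to identify disconnected groups
# (each row of the two-column edges matrix is one Component-TLA link)
library(igraph)
g <- graph_from_edgelist(edges, directed = FALSE)
ngroup <- components(g)$membership
df$Group <- ngroup[match(df$Component, nodes)]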
df

  Component  TLA Group
1        C1 TLA1     1
2        C2 TLA1     1
3        C1 TLA2     1
4        C3 TLA2     1
5        C4 TLA3     2
6        C5 TLA3     2
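
If you would rather not depend on igraph, the same grouping can be sketched
in base R by giving every row its own label and then merging labels until
rows that share a Component or a TLA all agree. This is only a rough sketch:
group_rows() is a made-up helper name, not a function from any package, and
it assumes there are no missing values in either column.

```r
# example data (same as above)
df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3")
)

# hypothetical helper: label rows so that rows linked through a shared
# value of a or b (directly or through a chain) get the same label
group_rows <- function(a, b) {
  a <- as.character(a)
  b <- as.character(b)
  lab <- seq_along(a)              # start with one label per row
  repeat {
    # take the smallest label among rows sharing a, then among rows sharing b
    new <- ave(lab, a, FUN = min)
    new <- ave(new, b, FUN = min)
    if (identical(new, lab)) break # nothing changed, so the labels are final
    lab <- new
  }
  match(lab, unique(lab))          # renumber the labels as 1, 2, ...
}

df$Group <- group_rows(df$Component, df$TLA)
df
```

Each pass can only lower a label, so the loop always stops. For large data
with long chains of shared components the igraph version is likely to be
faster.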

Jean

On Sun, Mar 27, 2016 at 7:56 PM, Satish Vadlamani <
satish.vadlamani at gmail.com> wrote:

> Hello All:
> I would like to get some help with the following problem and understand how
> this can be done in R efficiently. The header is given in the data frame.
>
> *Component, TLA*
> C1, TLA1
> C2, TLA1
> C1, TLA2
> C3, TLA2
> C4, TLA3
> C5, TLA3
>
> Notice that C1 is a component of TLA1 and TLA2.
>
> I would like to form groups of mutually exclusive subsets and create a new
> column called group for this subset. For the above data, the subsets and
> the new group column value will be like so:
>
> *Component, TLA, Group*
> C1, TLA1, 1
> C2, TLA1, 1
> C1, TLA2, 1
> C3, TLA2, 1
> C4, TLA3, 2
> C5, TLA3, 2
>
> Appreciate any help on this. I could have looped through the observations
> and tried some logic but I did not try that yet.
>
> --
>
> Satish Vadlamani
