[R] Using indexing to manipulate data

Thu Mar 18 09:41:19 CET 2010

Here are two solutions.  The first uses merge and the second uses
sqldf.  Both do a self join on Act and keep each pair once by
requiring that the first actor's name sort before the second's.  The
sqldf solution also sorts the result:

# input
DF <- structure(list(Actor = c("Jim", "Bob", "Bob", "Larry", "Alice",
"Tom", "Tom", "Tom", "Alice", "Nancy"), Act = c("A", "A", "C",
"D", "C", "F", "D", "A", "B", "B")), .Names = c("Actor", "Act"
), class = "data.frame", row.names = c(NA, -10L))

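# self join on column 2 (Act), drop duplicate rows, then keep each
# pair once by requiring Actor.x to sort before Actor.y
subset(unique(merge(DF, DF, by = 2)), Actor.x < Actor.y)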

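library(sqldf) # see http://sqldf.googlecode.com
# same self join in SQL; the third column is the second actor, not the act again
sqldf("select A.Actor as Actor1, A.Act, B.Actor as Actor2
	from DF A join DF B
	where A.Act = B.Act and A.Actor < B.Actor
	order by A.Act, A.Actor")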
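
For comparison, here is a rough base-R sketch of the same idea that uses
split and combn rather than a join.  It is only an illustration, not a
replacement for the two solutions above: the names by_act, pairs, Actor1
and Actor2 are made up for this example, and it assumes at least one act
has two or more actors.  Unlike the merge version, it drops the Act
column, so a pair that shares several acts is listed only once.

```r
# input, same data as above
DF <- data.frame(
  Actor = c("Jim", "Bob", "Bob", "Larry", "Alice",
            "Tom", "Tom", "Tom", "Alice", "Nancy"),
  Act = c("A", "A", "C", "D", "C", "F", "D", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# actors grouped by act, each group sorted and de-duplicated
by_act <- lapply(split(DF$Actor, DF$Act), function(a) sort(unique(a)))

# acts with fewer than two actors cannot produce a pair
by_act <- by_act[lengths(by_act) >= 2]

# all 2-combinations within each act, one pair per row
pairs <- do.call(rbind, lapply(by_act, function(a) t(combn(a, 2))))

# drop pairs that repeat across acts
pairs <- unique(data.frame(Actor1 = pairs[, 1], Actor2 = pairs[, 2],
                           stringsAsFactors = FALSE))
rownames(pairs) <- NULL
pairs
```

On the sample data this gives six pairs.  Bob and Tom are among them
because both took part in act A, even though that pair is not in the
wished-for list in the question.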

On Thu, Mar 18, 2010 at 1:05 AM, duncandonutz <dwadswor at unm.edu> wrote:
>
> I know one of R's advantages is its ability to index, eliminating the need
> for control loops to select relevant data, so I thought this problem would
> be easy.  I can't crack it.  I have looked through past postings, but
> nothing seems to match this problem.
>
> I have a data set with one column of actors and one column of acts.  I need
> a list that will give me a pair of actors in each row, provided they both
> participated in the act.
>
> Example:
>
> The Data looks like this:
> Jim      A
> Bob      A
> Bob      C
> Larry    D
> Alice    C
> Tom      F
> Tom      D
> Tom      A
> Alice    B
> Nancy    B
>
> I would like this:
> Jim      Bob
> Jim      Tom
> Bob      Alice
> Larry    Tom
> Alice    Nancy
>
> The order doesn't matter (Jim-Bob vs. Bob-Jim), but each pairing should be
> counted only once.
> Thanks!
>