[R] Merging fully overlapping groups

mdvaan mathijsdevaan at gmail.com
Wed Mar 14 04:56:33 CET 2012


Hi,

I have data on individuals (B) who participated in events (A). If ALL
participants in one event also took part in another event (that is, the
first event's participants are a subset of the second's), I would like to
remove the smaller event. If two events have exactly the same participants,
I would like to remove one of them (I don't care which one). The following
example does that, but it is extremely slow, and the true dataset is very
large. What would be a more efficient way to solve the problem? I would
really appreciate your help. Thanks!

DF <- data.frame(read.table(textConnection("  A  B
12095	 69832
12095	 51750
12095	 6734
18774	 51750
18774	 51733
18774	 6734
18774	 69833
19268	 51750
19268	 6734
19268	 51733
19268	 65251
5169	 54441
5169	 15480
5169	 3228
5966	 51733
5966	 65251
5966	 68197
5966	 6734
5966	 51750
5966	 69833
7189	 135523
7189	 65251
7189	 51733
7189	 69833
7189	 135522
7189	 68197
7189	 6734
7797	 51750
7797	 6734
7797	 69833
7866	 6734
7866	 69833
7866	 51733
8596	 51733
8596	 51750
8596	 65251
8677	 6734
8677	 51750
8677	 51733
8936	 68197
8936	 6734
8936	 65251
8936	 51733
9204	 51750
9204	 69833
9204	 6734
9204	 51733"),header=TRUE,stringsAsFactors=FALSE))

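data <- unique(DF$A)
# A while loop is used because data shrinks as events are removed;
# reassigning the index inside a for loop has no effect in R.
m <- 1
while (m <= length(data))
	{
	tdata <- data[-m]
	members <- DF[DF$A == data[m], 2]
	q <- 0
	for (n in seq_along(tdata))
		{
		# count the other events that contain every participant of event m
		if (all(members %in% DF[DF$A == tdata[n], 2]))
			{
			q <- q + 1
			}
		}
	if (q > 0)
		{
		# event m is covered by another event: drop it and stay at position m
		data <- data[-m]
		}
	else
		{
		m <- m + 1
		}
	}
DF <- DF[DF$A %in% data,]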
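
One direction that might avoid the nested loops is to count shared
participants for all pairs of events at once with a matrix product. The
block below is only a sketch of that idea, not a tested solution for the
full dataset: drop_nested_events() is a hypothetical helper name, the toy
data frame is made up for illustration, and when two events are identical
it keeps whichever comes first in sorted order (an arbitrary choice). It
builds a dense event-by-participant matrix, so for a very large dataset
memory may become the limit and a sparse representation would probably be
needed.

```r
# Hypothetical helper (not part of the code above): keep only events whose
# participant set is not contained in another event's participant set.
drop_nested_events <- function(DF) {
  # Count each (event, participant) pair once
  DF <- unique(DF[, c("A", "B")])
  # Event-by-participant incidence matrix (1 = participated, 0 = did not)
  inc <- unclass(table(DF$A, DF$B))
  # shared[i, j] = number of participants events i and j have in common
  shared <- tcrossprod(inc)
  size <- diag(shared)
  # contained[i, j]: every participant of event i also took part in event j
  # (size is recycled down the columns, so it lines up with row index i)
  contained <- shared == size
  # Event j "beats" event i if it is larger, or equal in size but earlier
  larger <- outer(size, size, "<")
  same_but_earlier <- outer(size, size, "==") & (row(shared) > col(shared))
  drop <- rowSums(contained & (larger | same_but_earlier)) > 0
  keep_ids <- rownames(inc)[!drop]
  DF[as.character(DF$A) %in% keep_ids, ]
}

# Made-up toy data: event 2 is a subset of event 1, event 3 duplicates event 1
toy <- data.frame(
  A = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4),
  B = c(10, 20, 30, 10, 20, 10, 20, 30, 40, 50)
)
kept <- drop_nested_events(toy)
unique(kept$A)   # events 1 and 4 remain
```

The tie-break on position is what keeps exactly one copy of a duplicated
event; without it, two identical events would each count as a subset of
the other and both would be dropped.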

--
View this message in context: http://r.789695.n4.nabble.com/Merging-fully-overlapping-groups-tp4470999p4470999.html
Sent from the R help mailing list archive at Nabble.com.


