[R] data manipulation in R

Thomas Lumley tlumley at u.washington.edu
Mon Apr 16 17:54:14 CEST 2001

On Sun, 15 Apr 2001, Patrick Ball wrote:

> Dear List:
> I have a data manipulation problem that I was unable
> to solve in R.  I did it in SQL, and it may be that
> the solution in R is to do it in SQL, but I wondered
> if people could imagine a vector-based solution.
> Imagine a list A[i] of observers who observe some set
> of events B[j].  Each observer i may observe one or
> more events, and each event j may have been observed
> by one or more observers.  Thus the data are a
> lower-triangular array AxB where each cell [i,j] has a
> zero or one indicating whether observer i saw event j.
> I am interested in how observers cluster in circuits
> whereby observer _a_ sees events _1,2,3_, observer _b_
> sees events _2,4,5_, observer _c_ sees event _4_, and
> observer _d_ sees _4,6,7_.  Observers a, b, c, d
> comprise a circuit linked by the events they jointly
> observed.

I don't see why this would be a lower-triangular matrix. If every observer
saw every event, wouldn't it be a rectangular matrix of 1s?

You can solve this in R, but I shouldn't think it would be very efficient.
I think it's going to involve iteration over either the events or the
observers.

One solution:

Suppose A is the matrix of zeros and ones; in your case:
> A
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    1    1    0    0    0    0
[2,]    0    1    0    1    1    0    0
[3,]    0    0    0    1    0    0    0
[4,]    0    0    0    1    0    1    1

We can make an incidence matrix B for the graph on the observers, with
B[i,j] TRUE when observers i and j saw at least one event in common.
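
The code for B did not survive in this copy of the message. A minimal sketch of one way to build such a matrix, assuming A is the 4 x 7 matrix printed above and that "linked" means sharing at least one event (the matrix-product construction is an assumption, not necessarily the code originally posted):

```r
# Observer-by-event 0/1 matrix from the example
# (rows: observers a-d, columns: events 1-7).
A <- matrix(c(1, 1, 1, 0, 0, 0, 0,
              0, 1, 0, 1, 1, 0, 0,
              0, 0, 0, 1, 0, 0, 0,
              0, 0, 0, 1, 0, 1, 1),
            nrow = 4, byrow = TRUE)

# A %*% t(A) counts the events shared by each pair of observers,
# so B[i, j] is TRUE when observers i and j have a common event.
B <- (A %*% t(A)) > 0
B
```

For the example, observer 1 is linked directly only to observer 2 (through event 2), while observers 2, 3 and 4 are all linked to each other through event 4.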


And now find the connected components by powering up B

for(i in 1:ceiling(log(nobs,2))){
	if (all(Dnew==D)) break


I think we now have D[i,j]==TRUE if i,j are in the same circuit.
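
The loop above is incomplete in this copy of the message (the assignments to D, Dnew and nobs are missing). A self-contained sketch of the same idea, under the assumption that D starts as the shared-event matrix and is squared until it stops changing; the function name same_circuit is hypothetical and the assignments are a guess at the missing lines, not a recovery of them:

```r
# Hypothetical helper: returns a logical matrix D with D[i, j] TRUE when
# observers i and j are in the same circuit. Assumes A is a 0/1 matrix
# with one row per observer and at least one row.
same_circuit <- function(A) {
  nobs <- nrow(A)
  D <- (A %*% t(A)) > 0          # direct links: at least one shared event
  for (i in seq_len(ceiling(log(nobs, 2)))) {
    Dnew <- (D %*% D) > 0        # squaring doubles the path length covered
    if (all(Dnew == D)) break    # nothing new was reached, so D is final
    D <- Dnew
  }
  D
}

A <- matrix(c(1, 1, 1, 0, 0, 0, 0,
              0, 1, 0, 1, 1, 0, 0,
              0, 0, 0, 1, 0, 0, 0,
              0, 0, 0, 1, 0, 1, 1),
            nrow = 4, byrow = TRUE)
D <- same_circuit(A)

# One label per observer: observers with the same label share a circuit.
first <- apply(D, 1, function(r) which(r)[1])
circuit <- as.integer(factor(first))
```

With the four observers of the example every entry of D comes out TRUE, so all of them get the same circuit label. Since any two linked observers are at most nobs-1 steps apart and each squaring doubles the reach, ceiling(log(nobs,2)) passes are enough.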


Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

