[R] Transforming relational data

mathijsdevaan mathijsdevaan at gmail.com
Mon Feb 14 18:22:12 CET 2011


Hi,

I have a large dataset with info on individuals (B) that have been involved
in projects (A) during multiple years (C). The dataset contains three
columns: A, B, C. Example:
   
  A B    C
1 1 a 1999
2 1 b 1999
3 1 c 1999
4 1 d 1999
5 2 c 2001
6 2 d 2001
7 3 a 2004
8 3 c 2004
9 3 d 2004

I am interested in how well the individuals in a project know each other.
To calculate this team familiarity measure I want to sum the familiarity
over all pairs of individuals in a team. The familiarity of a pair is the
sum over the pair's prior co-appearances in a project, each divided by the
total number of team members in that prior project. So the team familiarity
in project 3 = (1/4+1/4) + (1/4+1/4+1/2) + (1/4+1/4+1/2) = 2.5: a has been
in project 1 (of size 4) with c and d > 1/4+1/4; c has been in project 1
(of size 4) with a and d > 1/4+1/4, and in project 2 (of size 2) with d >
1/2; and d, likewise, has been in project 1 with a and c > 1/4+1/4, and in
project 2 with c > 1/2.
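
To make the arithmetic for project 3 explicit, here is a minimal check in R.
The variable names are only for illustration; the fractions are the ones
from the example, with one term per team member (so each pair is counted
once from each side).

```r
# Contribution of each member of project 3 (a, c, d), based on
# earlier projects: project 1 had 4 members, project 2 had 2.
a_part <- 1/4 + 1/4          # a met c and d in project 1
c_part <- 1/4 + 1/4 + 1/2    # c met a and d in project 1, d in project 2
d_part <- 1/4 + 1/4 + 1/2    # d met a and c in project 1, c in project 2

total <- a_part + c_part + d_part
total  # 2.5
```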

I think the best way to do this is to transform the data into an edgelist
(one pair per row, in two columns) and then to create two additional columns:
one for the strength of the familiarity and one for the year of the project
in which the pair was active. The problem is that I am already stuck at the
first step. So the question is: how do I go from the current data structure
to a list of projects and the familiarity of their team members?
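
One possible way to sketch the steps described above in base R is a
self-join with merge(). This is only an illustration under some assumptions,
not a definitive solution: the names d, partner, el, past, prior and fam are
made up here, "prior" is taken to mean a strictly earlier year in column C,
and every pair is kept in both directions (a-c and c-a), which matches the
2.5 in the example.

```r
# Example data from the post: project (A), individual (B), year (C)
d <- data.frame(
  A = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
  B = c("a", "b", "c", "d", "c", "d", "a", "c", "d"),
  C = c(1999, 1999, 1999, 1999, 2001, 2001, 2004, 2004, 2004),
  stringsAsFactors = FALSE
)

# Team size of the project each row belongs to
d$size <- ave(d$A, d$A, FUN = length)

# Step 1: edge list. Join the data to itself on project, then drop
# the rows that pair an individual with themselves.
partner <- data.frame(A = d$A, partner = d$B, stringsAsFactors = FALSE)
el <- merge(d, partner, by = "A")
el <- el[el$B != el$partner, ]

# Step 2: strength of one co-appearance = 1 / team size of that project
el$weight <- 1 / el$size

# Step 3: match every edge with all edges between the same two people
# and keep only co-appearances from an earlier year (assumption).
past <- el[, c("B", "partner", "C", "weight")]
names(past)[names(past) == "C"] <- "C.prior"
prior <- merge(el[, c("A", "B", "partner", "C")], past,
               by = c("B", "partner"))
prior <- prior[prior$C.prior < prior$C, ]

# Step 4: sum the prior weights per project; projects without any
# prior co-appearance get a familiarity of 0.
fam <- aggregate(weight ~ A, data = prior, FUN = sum)
fam <- merge(unique(d[, c("A", "C")]), fam, by = "A", all.x = TRUE)
fam$weight[is.na(fam$weight)] <- 0
names(fam) <- c("project", "year", "familiarity")
fam
```

For the example data this gives a familiarity of 0 for project 1, 0.5 for
project 2 and 2.5 for project 3. Note that the self-join creates one row per
ordered pair per project, so on a large dataset with big teams the
intermediate tables can grow quickly.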

Your help is very much appreciated. Thanks!
-- 
View this message in context: http://r.789695.n4.nabble.com/Transforming-relational-data-tp3305398p3305398.html
Sent from the R help mailing list archive at Nabble.com.


