[R] vector indexing problem in multilevel data: assigning a specific value to all group members

Dimitris Rizopoulos d.rizopoulos at erasmusmc.nl
Tue Dec 22 12:05:35 CET 2009


One approach is:

Dat <- read.table(textConnection("
   personId groupId groupLeader someAttribute leaderAttribute
1        1      17           0   0.145833333              NA
2        2      17           1   0.218750000              NA
3        3      17           0   0.089743590              NA
4        4      22           0   0.003875969              NA
5        5      22           0   0.086486486              NA
6        6      22           0   0.218750000              NA
7        7      22           1   0.089743590              NA
8        8      37           1   0.016129032              NA
9        9      37           0   0.151898734              NA
"), header = TRUE)
closeAllConnections()

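# Assumes the rows of each group are contiguous and each group has one leader.
Dat$leaderAttribute <- with(Dat, {
     # keep the groups in their order of appearance (17, 22, 37)
     f <- factor(groupId, levels = unique(groupId))
     # column 2 (groupLeader == 1) holds each group's leader value;
     # repeat it once per member of the group
     rep(tapply(someAttribute, list(f, groupLeader), "[", 1)[, 2],
         tapply(someAttribute, f, length))
})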

Dat
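
The rep() step above fills in the values block by block, so it relies on
the rows of each group sitting next to each other. If that cannot be
guaranteed, a lookup with match() may be a simpler alternative. The
following is only a sketch under the assumption that every group has
exactly one row with groupLeader == 1; the helper object "leaders" is my
own name and not part of the original data:

```r
# Same toy data as above, rebuilt so that this block runs on its own.
Dat <- data.frame(
    personId      = 1:9,
    groupId       = c(17, 17, 17, 22, 22, 22, 22, 37, 37),
    groupLeader   = c(0, 1, 0, 0, 0, 0, 1, 1, 0),
    someAttribute = c(0.145833333, 0.218750000, 0.089743590, 0.003875969,
                      0.086486486, 0.218750000, 0.089743590, 0.016129032,
                      0.151898734)
)

# Assumption: exactly one leader row per group.
leaders <- Dat[Dat$groupLeader == 1, ]

# match() gives, for every row, the position of its groupId among the
# leaders' groupIds, so the result does not depend on the row order.
Dat$leaderAttribute <- leaders$someAttribute[match(Dat$groupId, leaders$groupId)]

Dat
```

With this lookup a group without a leader gets NA, and if a group had
several leaders only the first one would be used. The same function also
answers the recoding question directly: match(groupId, unique(groupId))
turns 17, 17, 17, 22, ... into 1, 1, 1, 2, ...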


I hope it helps.

Best,
Dimitris


Bertolt Meyer wrote:
> Dear List,
> 
> I work with multilevel data from psychological group experiments and 
> have frequently encountered a situation for which I haven't found an 
> elegant solution: I need to assign the value of a specific group member 
> to all members of the group. For example, I have a group leader 
> (identified by a binary vector) and some attribute for all group 
> members. I want to create a new vector that holds the attribute of the 
> group leader for each individual in the data frame (code at the bottom 
> of the post):
> 
>   personId groupId groupLeader someAttribute leaderAttribute
> 1        1      17           0   0.145833333              NA
> 2        2      17           1   0.218750000              NA
> 3        3      17           0   0.089743590              NA
> 4        4      22           0   0.003875969              NA
> 5        5      22           0   0.086486486              NA
> 6        6      22           0   0.218750000              NA
> 7        7      22           1   0.089743590              NA
> 8        8      37           1   0.016129032              NA
> 9        9      37           0   0.151898734              NA
> 
> I need this:
> 
>   personId groupId groupLeader someAttribute leaderAttribute
> 1        1      17           0   0.145833333     0.218750000
> 2        2      17           1   0.218750000     0.218750000
> 3        3      17           0   0.089743590     0.218750000
> 4        4      22           0   0.003875969     0.089743590
> 5        5      22           0   0.086486486     0.089743590
> 6        6      22           0   0.218750000     0.089743590
> 7        7      22           1   0.089743590     0.089743590
> 8        8      37           1   0.016129032     0.016129032
> 9        9      37           0   0.151898734     0.016129032
> 
> So far, my attempts along the lines of
> 
> df$leaderAttribute <- df$someAttribute[df$groupLeader == 1][df$groupId]
> 
> have failed if the groups were not numbered with 1, 2, 3... as in the 
> example above. I need something simple for transforming the groupId 
> vector from 17, 17, 17, 22... to 1,1,1,2... for doing the second 
> indexing. It seems like a simple problem, but I am unable to get it 
> right. I had to fall back to building a specific function employing 
> nested for() loops for achieving this. However, this is error prone and 
> very slow, as some of my data sets are very large. What am I missing?
> 
> Any help would be greatly appreciated.
> Regards,
> Bertolt
> 
> Code:
> 
> personId <- c(1,2,3,4,5,6,7,8,9)
> groupId <- c(17,17,17,22,22,22,22,37,37)
> groupLeader <- c(0,1,0,0,0,0,1,1,0)
> someAttribute <- c(0.145833333, 0.218750000, 0.089743590, 0.003875969, 
> 0.086486486, 0.218750000, 0.089743590, 0.016129032, 0.151898734)
> leaderAttribute <- c(NA,NA,NA,NA,NA,NA,NA,NA,NA)
> 
> df <- cbind(personId, groupId, groupLeader, someAttribute, leaderAttribute)
> df <- as.data.frame(df)
> df
> 
> rm(personId, groupId, groupLeader, someAttribute, leaderAttribute, df)
> 

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014



