[R] Select random observation from a group

arun smartpink111 at yahoo.com
Fri Mar 28 03:28:40 CET 2014



Hi,
Maybe this helps:
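set.seed(42)  # make the random draw reproducible
# Pick one row index per family. i[sample.int(length(i), 1)] is used instead of
# sample(i, 1) because sample() on a single number n draws from 1:n, which
# would return the wrong row for a family with only one member.
indx <- with(df, tapply(seq_along(IndividualID), FamilyID, FUN = function(i) i[sample.int(length(i), 1)]))
df[indx, ]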
# FamilyID IndividualID DadID MomID Sex
#4        1          104     0     0   2
#8        2          204   202   203   2

# or, with plyr: split by family and keep one random row from each piece
library(plyr)
ddply(df, .(FamilyID), function(x) x[sample.int(nrow(x), 1), ])
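
In case it is useful, here is a self-contained base R sketch of the same idea. It rebuilds the example data from the question so it can be pasted as is. The helper name pick_one_per_group is made up for illustration and is not part of the thread or of any package. It also shows why the attempts in the question did not work: df[unique(df$FamilyID), ] uses the family IDs (1, 2, ...) as row numbers, so it returns rows 1 and 2 rather than one member of each family.

```r
# Rebuild the example data from the question so the snippet runs on its own.
df <- data.frame(
  FamilyID     = c(1, 1, 1, 1, 2, 2, 2, 2),
  IndividualID = c(101, 102, 103, 104, 201, 202, 203, 204),
  DadID        = c(103, 103, 0, 0, 202, 202, 202, 202),
  MomID        = c(104, 104, 0, 0, 203, 203, 203, 203),
  Sex          = c(1, 2, 1, 2, 1, 2, 1, 2)
)

# Hypothetical helper (name chosen for this sketch): one random row per group.
pick_one_per_group <- function(data, group) {
  # Row numbers of 'data', split into one integer vector per group value.
  rows_by_group <- split(seq_len(nrow(data)), data[[group]])
  # Draw a position within each vector; this stays correct for one-row groups,
  # where sample(i, 1) would sample from 1:i instead.
  picked <- vapply(rows_by_group,
                   function(i) i[sample.int(length(i), 1)],
                   integer(1))
  data[unname(picked), , drop = FALSE]
}

set.seed(42)
one_per_family <- pick_one_per_group(df, "FamilyID")
one_per_family
```

The result has one row per FamilyID and keeps the original row names, so it is easy to check which subject was drawn. Which rows come out for a given seed depends on the R version, so I have not pasted output here.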

A.K.


On Thursday, March 27, 2014 9:46 PM, Whitney Melroy <wmelroy827 at gmail.com> wrote:
Hello, 

I have a dataset with family data. For an analysis, I need to select one subject per family at random. 

Here is an example of what my data look like: 

FamilyID  IndividualID  DadID  MomID  Sex
1         101           103    104    1
1         102           103    104    2
1         103           0      0      1
1         104           0      0      2
2         201           202    203    1
2         202           202    203    2
2         203           202    203    1
2         204           202    203    2

I want to randomly select ONE subject for each family (there are roughly 2300 families) and make a new dataframe. 

Here is what I tried so far, with no success: 


Uniq.fam.id<-df[unique(df$FamilyID),]

Uniq.fam.id <- df[sample(unique(df$FamilyID)),]


Uniq.fam<-unique(df$FamilyID)
Uniq.fam.id <- df[sample(Uniq.fam),]

I would be eternally grateful for any help. 

Thanks, 

Whitney 