[R] Randomly drawing observations from factors.

Economics Guy economics.guy at gmail.com
Thu Jul 31 22:20:47 CEST 2008


I have a large data set where one of the columns needs to be a unique
identifier (ID) for each row. However, a few of the rows share the same
ID. For each duplicated ID, I need to randomly draw one of the rows,
keep it in the data frame, and drop all the other rows with that ID.

For example:

v1 <- c(1,2,3,4,5,6,7)
v2 <- c(10,20,30,40,50,60,70)
ID <- c("A","A","B","B","C","D","E")
DF <- data.frame(v1,v2,ID)

Here I only need one of the A rows and one of the B rows in the data
frame, each picked at random. I tried making ID a factor and using
apply() to randomly draw one, but I could not get it to work.

Any ideas would be greatly appreciated.

Thanks,

EG
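
A minimal sketch of one way the random draw described above might be
done in base R, offered as an illustration rather than a definitive
answer. The idea is to shuffle the rows and then keep the first row
seen for each ID, so each row within a duplicated ID is equally likely
to survive. The names shuffled and DF_unique are hypothetical, and
set.seed() is only there to make the example reproducible.

```r
# Example data from the post
v1 <- c(1, 2, 3, 4, 5, 6, 7)
v2 <- c(10, 20, 30, 40, 50, 60, 70)
ID <- c("A", "A", "B", "B", "C", "D", "E")
DF <- data.frame(v1, v2, ID)

set.seed(1)  # assumption: fixed seed, only so the draw can be repeated

# Put the rows in a random order
shuffled <- DF[sample(nrow(DF)), ]

# Keep the first row encountered for each ID; since the order is random,
# this amounts to a random draw within each group of duplicates
DF_unique <- shuffled[!duplicated(shuffled$ID), ]

# Optional: sort by ID again for readability
DF_unique <- DF_unique[order(DF_unique$ID), ]
```

Shuffling first avoids calling sample() on the row indices of each ID
separately, which can surprise when a group has a single row, because
sample(x, 1) treats a single number x as the range 1:x.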


