[R] How to add unique occasions based on date within a subject in R?

arun smartpink111 at yahoo.com
Thu Nov 21 20:42:22 CET 2013


Hi,
Maybe you can try:
### Please use dput() to share example data

dat1 <- structure(list(trialno = c(11301L, 11301L, 11301L, 11301L, 11301L, 
11301L, 11301L, 11301L, 11301L, 11301L, 11302L, 11302L, 11302L, 
11302L, 11302L, 11302L, 11302L, 11302L, 11302L, 11302L), event = c("pm_intake", 
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pk2", "pm_intake", 
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pm_intake", 
"am_intake", "pk1", "pk2", "pm_intake", "am_intake", "pk1"), 
    date = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22", 
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03", 
    "2011-02-03", "2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22", 
    "2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03", 
    "2011-02-03"), time = c("19:00", "07:00", "10:30", "19:00", 
    "07:00", "09:54", "13:07", "19:00", "07:00", "11:30", "19:00", 
    "07:00", "10:30", "19:00", "07:00", "09:54", "13:07", "19:00", 
    "07:00", "11:30")), .Names = c("trialno", "event", "date", 
"time"), class = "data.frame", row.names = c("3", "4", "5", "6", 
"7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", 
"18", "19", "20", "21", "22"))


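splitData <- split(dat1, dat1$trialno)  # using your code

# Within each patient, flag the first row of every date (TRUE/FALSE)
# and take the running sum of those flags as the occasion number
res <- unsplit(lapply(splitData, function(x)
  within(x, OCC <- cumsum(ave(seq_along(date), date, FUN = seq_along) == 1))),
  dat1$trialno)

res$OCC
# [1] 1 2 2 3 4 4 4 5 6 6 1 2 2 3 4 4 4 5 6 6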
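
A possibly simpler variant, offered only as a sketch: the expression ave(seq_along(date), date, FUN = seq_along) == 1 flags the first row of each date, which is also what !duplicated(date) returns, so the split/unsplit step can probably be replaced by a single ave() call grouped by trialno. The data frame `toy` below is made up for illustration and is not part of the original data.

```r
# Hypothetical toy data: two patients, several events per date
toy <- data.frame(
  trialno = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  date    = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
              "2010-11-24", "2010-11-24", "2010-11-25"),
  stringsAsFactors = FALSE
)

# Running count of first-seen dates, restarted for each patient.
# Row indices are passed as x so that ave() returns integers
# rather than coercing the counts to character.
toy$OCC <- with(toy, ave(seq_along(date), trialno,
                         FUN = function(i) cumsum(!duplicated(date[i]))))

toy$OCC
# [1] 1 2 2 3 1 1 2
```

Like the split/unsplit version, this keeps the rows in their original order and assumes the rows of each patient are already sorted by date.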


A.K.





On Thursday, November 21, 2013 2:04 PM, Andrzej Bienczak <andrzej.bienczak at googlemail.com> wrote:
Hi All, 



I'm trying to figure out how to add a column to my data set that contains a
count of unique occasions based on date. Here is a part of my data set:



    trialno  event      date        time
3   11301    pm_intake  2010-11-24  19:00
4   11301    am_intake  2010-11-25  07:00
5   11301    pk1        2010-11-25  10:30
6   11301    pm_intake  2010-12-22  19:00
7   11301    am_intake  2010-12-23  07:00
8   11301    pk1        2010-12-23  09:54
9   11301    pk2        2010-12-23  13:07
10  11301    pm_intake  2011-02-02  19:00
11  11301    am_intake  2011-02-03  07:00
12  11301    pk1        2011-02-03  11:30







Basically, each date within each patient indicates a new occasion. If a
patient has just a drug administration on a given date, that is one occasion;
if a patient has a drug administration and two measurements on the same day,
they all count as the same occasion. The data set does not have a regular
pattern (each patient has a different number of events on each date and a
different number of events in total).

What I'm trying to achieve is:



    trialno  event      date        time   OCC
3   11301    pm_intake  2010-11-24  19:00  1
4   11301    am_intake  2010-11-25  07:00  2
5   11301    pk1        2010-11-25  10:30  2
6   11301    pm_intake  2010-12-22  19:00  3
7   11301    am_intake  2010-12-23  07:00  4
8   11301    pk1        2010-12-23  09:54  4
9   11301    pk2        2010-12-23  13:07  4
10  11301    pm_intake  2011-02-02  19:00  5
11  11301    am_intake  2011-02-03  07:00  6
12  11301    pk1        2011-02-03  11:30  6



I think I need some kind of loop that identifies the unique dates within
each patient and counts them...

I thought about splitting the whole data set by patient using the split
function:

splitData<- split(data, data$trialno)



And applying lapply and transform to add a new column OCC (occasion) but I
don't know how to count those as integers...

I was thinking:



splitData <- lapply(splitData, function(df) {
  transform(df, OCC = ???????????????)
})

do.call("rbind", splitData)
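
One way the placeholder in this split / lapply / transform skeleton might be filled in, as a sketch only: !duplicated(date) is TRUE on the first row of each date, and cumsum() turns those flags into a running integer. The data frame `toy` is a made-up stand-in for the real data set, and the rows of each patient are assumed to be in date order.

```r
# Hypothetical toy data standing in for the real data set
toy <- data.frame(
  trialno = c(11301L, 11301L, 11301L, 11302L, 11302L),
  event   = c("pm_intake", "am_intake", "pk1", "pm_intake", "am_intake"),
  date    = c("2010-11-24", "2010-11-25", "2010-11-25",
              "2010-11-24", "2010-11-25"),
  stringsAsFactors = FALSE
)

splitData <- split(toy, toy$trialno)

# !duplicated(date) marks the first row of each date within a patient;
# cumsum() converts those marks into a running occasion number
splitData <- lapply(splitData, function(df) {
  transform(df, OCC = cumsum(!duplicated(date)))
})

res <- do.call("rbind", splitData)
rownames(res) <- NULL  # rbind() on a named list prefixes row names with the list names

res$OCC
# [1] 1 2 2 1 2
```

Note that do.call("rbind", ...) returns the rows grouped by patient, so the result is in the original row order only if the input was already sorted by trialno.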



I know how to do it in Excel:

=IF(D5=D4, E4,E4+1)

(if the date in the neighbouring cell is the same as the date in the row
above, my cell takes the same value as the cell above; otherwise it is one
greater). This way the first cell in column E has to be 1, and the following
cells count up by one with each new date.
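
The Excel rule can be translated fairly literally into vectorised R, shown here as a sketch for a single patient. The `date` vector is made up for illustration, and the rows are assumed to be in chronological order, as in the spreadsheet.

```r
# Hypothetical dates for one patient, already in chronological order
date <- c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
          "2010-12-23", "2010-12-23", "2010-12-23")

# TRUE where a row's date differs from the row above (the "else" branch
# of the Excel IF)
changed <- date[-1] != date[-length(date)]

# The first row is 1; every change adds one, otherwise the value carries down
OCC <- c(1L, 1L + cumsum(changed))
OCC
# [1] 1 2 2 3 4 4 4
```

For several patients the same expression would have to be applied within each patient (for example via split() and lapply()), so that the count restarts at 1 for every trialno.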

Help much appreciated!

Andrzej





______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



