[R] How to add unique occasions based on date within a subject in R?
arun
smartpink111 at yahoo.com
Thu Nov 21 20:42:22 CET 2013
Hi,
Maybe you can try:
### Please use dput() to share example data
dat1 <- structure(list(trialno = c(11301L, 11301L, 11301L, 11301L, 11301L,
11301L, 11301L, 11301L, 11301L, 11301L, 11302L, 11302L, 11302L,
11302L, 11302L, 11302L, 11302L, 11302L, 11302L, 11302L), event = c("pm_intake",
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pk2", "pm_intake",
"am_intake", "pk1", "pm_intake", "am_intake", "pk1", "pm_intake",
"am_intake", "pk1", "pk2", "pm_intake", "am_intake", "pk1"),
date = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
"2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03",
"2011-02-03", "2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
"2010-12-23", "2010-12-23", "2010-12-23", "2011-02-02", "2011-02-03",
"2011-02-03"), time = c("19:00", "07:00", "10:30", "19:00",
"07:00", "09:54", "13:07", "19:00", "07:00", "11:30", "19:00",
"07:00", "10:30", "19:00", "07:00", "09:54", "13:07", "19:00",
"07:00", "11:30")), .Names = c("trialno", "event", "date",
"time"), class = "data.frame", row.names = c("3", "4", "5", "6",
"7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17",
"18", "19", "20", "21", "22"))
splitData <- split(dat1, dat1$trialno)  # using your code
# Per patient: flag the first row of each date, then cumsum() the flags.
res <- unsplit(lapply(splitData, function(x)
  within(x, OCC <- cumsum(ave(seq_along(date), date, FUN = seq_along) == 1))),
  dat1$trialno)
res$OCC
# [1] 1 2 2 3 4 4 4 5 6 6 1 2 2 3 4 4 4 5 6 6
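An alternative sketch that skips the explicit split()/unsplit() step: ave() can number the distinct dates within each trialno directly. The data frame toy and the helper occ_by_date are made up for illustration, and this assumes each patient's rows are already sorted by date, as in the example data.

```r
# Toy data (hypothetical), same layout as dat1: two patients, sorted by date
toy <- data.frame(
  trialno = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  date    = c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
              "2010-11-24", "2010-11-24", "2010-11-25"),
  stringsAsFactors = FALSE
)

# For each patient, give every row the position of its date among that
# patient's distinct dates (in order of first appearance).
occ_by_date <- function(date, id) {
  ave(seq_along(date), id,
      FUN = function(i) match(date[i], unique(date[i])))
}

toy$OCC <- occ_by_date(toy$date, toy$trialno)
toy$OCC
# [1] 1 2 2 3 1 1 2
```

Because it matches against unique dates, this variant gives a repeated date the same number even if its rows are not adjacent, which may or may not be what is wanted for unsorted data.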
A.K.
On Thursday, November 21, 2013 2:04 PM, Andrzej Bienczak <andrzej.bienczak at googlemail.com> wrote:
Hi All,
I'm trying to figure out how to add a column to my data set that counts
unique occasions based on date. Here is part of my data set:
   trialno     event       date  time
3    11301 pm_intake 2010-11-24 19:00
4    11301 am_intake 2010-11-25 07:00
5    11301       pk1 2010-11-25 10:30
6    11301 pm_intake 2010-12-22 19:00
7    11301 am_intake 2010-12-23 07:00
8    11301       pk1 2010-12-23 09:54
9    11301       pk2 2010-12-23 13:07
10   11301 pm_intake 2011-02-02 19:00
11   11301 am_intake 2011-02-03 07:00
12   11301       pk1 2011-02-03 11:30
Basically, each new date within a patient indicates a new occasion. If a
patient has just a drug administration on a given day, that is one occasion;
if a patient has a drug administration and two measurements on the same day,
they all count as the same occasion. The data set does not have a regular
pattern (each patient has a different number of events on each date and a
different number of events in total).
What I'm trying to achieve is:
   trialno     event       date  time OCC
3    11301 pm_intake 2010-11-24 19:00   1
4    11301 am_intake 2010-11-25 07:00   2
5    11301       pk1 2010-11-25 10:30   2
6    11301 pm_intake 2010-12-22 19:00   3
7    11301 am_intake 2010-12-23 07:00   4
8    11301       pk1 2010-12-23 09:54   4
9    11301       pk2 2010-12-23 13:07   4
10   11301 pm_intake 2011-02-02 19:00   5
11   11301 am_intake 2011-02-03 07:00   6
12   11301       pk1 2011-02-03 11:30   6
I think I should apply some kind of a loop to identify within each patient
unique dates and count them...
I thought about splitting the whole data set by patient using the split
function:
splitData<- split(data, data$trialno)
And applying lapply and transform to add a new column OCC (occasion) but I
don't know how to count those as integers...
I was thinking:
splitData <- lapply(splitData, function(df) {
  transform(df, OCC = ???????????????)
})
do.call("rbind", splitData)
I know how to do it in Excel:
=IF(D5=D4, E4, E4+1)
(if the date in the neighbouring cell is the same as in the row above, my
cell takes the same value as the cell above; otherwise it is one greater).
This way the first cell in column E has to be 1, and the following cells
count up by one at each new date.
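For reference, the fill-down logic of that Excel formula can be sketched in R by comparing each date with the one in the row above and taking a running total of the changes. The helper name occ_like_excel and the small vectors below are made up for illustration; the sketch assumes at least one row per patient and rows in the intended order.

```r
# Hypothetical helper: start at 1 and add 1 whenever the date differs
# from the row above, like =IF(D5=D4, E4, E4+1) filled down a column.
occ_like_excel <- function(date) {
  changed <- c(TRUE, date[-1] != date[-length(date)])
  cumsum(changed)
}

d <- c("2010-11-24", "2010-11-25", "2010-11-25", "2010-12-22",
       "2010-12-23", "2010-12-23", "2010-12-23")
occ_like_excel(d)
# [1] 1 2 2 3 4 4 4

# Restart the count for each patient by applying the helper per trialno.
trialno <- c(1L, 1L, 1L, 2L, 2L)
date    <- c("2010-11-24", "2010-11-25", "2010-11-25",
             "2010-11-25", "2010-11-26")
OCC <- ave(seq_along(date), trialno,
           FUN = function(i) occ_like_excel(date[i]))
OCC
# [1] 1 2 2 1 2
```

Like the spreadsheet formula, this counts runs of equal dates, so a date that reappears after a different one would start a new occasion.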
Help much appreciated!
Andrzej