[R] Assign date according to defined time interval
arun
smartpink111 at yahoo.com
Wed Oct 16 20:58:44 CEST 2013
Hi Weijia,
This will give you the rownames of the split variables.
lst1 <- split(a,list(a$COUNTRY,a$SITEID))
res <- t(sapply(lst1,function(x) {
x$SCRNDT <- as.Date(x$SCRNDT, "%d-%b-%y")
unlist(lapply(split(b,b$TMPT),function(y){
sum(x$SCRNDT >= y$DT_ETP & x$SCRNDT <= y$DT_END)
}))
}))
colnames(res) <- paste0("WEEK",colnames(res))
res[5:8,1:4]
# WEEK1 WEEK2 WEEK3 WEEK4
#USA.5 0 0 0 4
#USA.6 0 0 0 1
#USA.8 0 0 0 0
#USA.9 0 0 0 0
A.K.
On , arun <smartpink111 at yahoo.com> wrote:
Yes, Sorry a typo. I was in a hurry when I sent it.
On Wednesday, October 16, 2013 2:32 PM, Weijia Wang <zeleehom at gmail.com> wrote:
Arun
I think you mean
sum(x$SCRNDT>= y$DT_ETP & x$SCRNDT <= y$DT_END)
instead of DT_ETP at the end of this line, right?
W
2013/10/16 arun <smartpink111 at yahoo.com>
HI Weijia,
>
>Please check whether this is what you wanted.
>
>Weijia <- load("/home/arunksa111/Downloads/arun_help.RData" )
>a[sapply(a,is.factor)] <-lapply(a[sapply(a,is.factor)],as.character)
>str(a)
> b[sapply(b,is.factor)] <- lapply(b[sapply(b,is.factor)],as.character)
>str(b)
> b$DT_ETP <- as.Date(b$DT_ETP,"%d-%b-%y")
> b$DT_END <- c(b$DT_ETP[-1]-1, b$DT_ETP[length(b$DT_ETP)]+6)
>res <- do.call(rbind,lapply(split(a,list(a$COUNTRY,a$SITEID),drop=TRUE),function(x){
>x$SCRNDT <- as.Date(x$SCRNDT, "%d-%b-%y")
>do.call(cbind,lapply(split(b,b$TMPT),function(y) {
> sum(x$SCRNDT>= y$DT_ETP & x$SCRNDT <= y$DT_ETP)
>}))
> }))
>
> colnames(res) <- paste0("WEEK",colnames(res))
>
>
>
>A.K.
>
>
>
>
>
>On Wednesday, October 16, 2013 10:48 AM, Weijia Wang <zeleehom at gmail.com> wrote:
>
>Hi, Arun
>
>Here I attached a R object with two dataframes.
>
>The first one is the one I need to count the number of dates that fall into a certain week interval.
>
>For example,
>
> STUDYID COUNTRY SITEID SUBJID SCRNDT
>1 GRTMD101 USA 13 130101 4-Dec-12
>2 GRTMD101 USA 13 130102 4-Dec-12
>3 GRTMD101 USA 13 130103 4-Dec-12
>4 GRTMD101 USA 6 60101 5-Dec-12
>5 GRTMD101 USA 5 50101 5-Dec-12
>6 GRTMD101 USA 13 130104 6-Dec-12
>
>So I will need to count number so dates under 'SCRNDT' for each unique 'SITEID' that falls into the
>following 'DT_ETP'
>
> STUDYID DT_ETP TMPT
>1 GRTMD101 9-Nov-12 1
>2 GRTMD101 16-Nov-12 2
>3 GRTMD101 23-Nov-12 3
>4 GRTMD101 30-Nov-12 4
>5 GRTMD101 7-Dec-12 5
>6 GRTMD101 14-Dec-12 6
>
>For example, for site 13, the 1st and 2nd SCRNDT both are 4-Dec-12 that falls into the week from '30-Nov-12' to '7-Dec-12', then in the result dataset, it should say, 2 under the variable 'week 4', and the result data frame should look like this:
>
>SITEID WEEK1 WEEK2 WEEK3 WEEK4.......WEEK64 (The values are just an example)
> 1 0 0 1 2 3
> 2 1 2 3 5 6
> . . . . . .
>
>Here is a code I modified based on the one you sent me, but quite intuitively, and R returns an empty vector to me, lol,
>
>
> library(plyr)
> res <- ddply(a,.(COUNTRY, SITEID), function(x) {
> x$SCRNDT <-as.Date(x$SCRNDT,"%d-%b-%y")
> unsplit(lapply(split(b,b$TMPT),function(y) {
> numweek<-as.numeric(length(unique(b$TMPT)))
> for (i in 1:numweek) {
> y$DT_ETP <- as.Date(y$DT_ETP, "%d-%b-%y")
> sum(x$SCRNDT > y$DT_ETP[i] & x$SCRNDT <= y$DT_ETP[i+1])}}),
> b$TMPT)})
>
>
>
>
>I really hope you can teach me on this one again! Thank you so much!!
>
>W
>
>
>
>
>
>2013/10/15 arun <smartpink111 at yahoo.com>
>
>HI Weijia,
>>No problem.
>>Let me know if it works. One slight modification as I noticed that your weeks are not overlapping
>>#change
>>
>>
>>
>>sum(x$Date > y$from & x$Date <= y$to)
>>
>>#to
>>sum(x$Date >= y$from & x$Date <= y$to
>>
>>Regards,
>>Arun
>>
>>
>>
>>
>>
>>
>>On Tuesday, October 15, 2013 8:14 PM, Weijia wang <zeleehom at gmail.com> wrote:
>>Thank you Arun! I will try it and get back to you! You are amazing!
>>
>>Weijia Wang
>>
>>
>>> On Oct 15, 2013, at 8:00 PM, arun <smartpink111 at yahoo.com> wrote:
>>>
>>> Hi,
>>>
>>> Please use ?dput() to show the dataset. Also, it is not clear about how you store the time interval.
>>> dat <- read.table(text="
>>> GroupID Date
>>> 1 1 10-Dec-12
>>> 2 1 11-Dec-12
>>> 3 2 13-Dec-12
>>> 4 2 15-Dec-12
>>> 5 3 06-Dec-12
>>> 6 3 19-Dec-12",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>
>>>
>>>
>>> dat2 <- data.frame(Week=1:2, from= c("9-Dec-12", "16-Dec-12"), to=c("15-Dec-12","22-Dec-12"),stringsAsFactors=FALSE)
>>>
>>> #Check ?findInterval()
>>>
>>> res <- t(sapply(split(dat,dat$GroupID), function(x) {
>>> x$Date <-as.Date(x$Date,"%d-%b-%y")
>>> unsplit(lapply(split(dat2,dat2$Week),function(y) {
>>> y$from <- as.Date(y$from, "%d-%b-%y")
>>> y$to <- as.Date(y$to, "%d-%b-%y")
>>> sum(x$Date > y$from & x$Date <= y$to)}),
>>> dat2$Week)
>>> }))
>>>
>>> colnames(res) <- paste0("Week",1:2)
>>> res
>>> # Week1 Week2
>>> #1 2 0
>>> #2 2 0
>>> #3 0 1
>>>
>>>
>>> A.K.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tuesday, October 15, 2013 6:24 PM, Weijia Wang <zeleehom at gmail.com> wrote:
>>> Hi, I have something very interesting:
>>>
>>> Say I have this:
>>>
>>> GroupID Date
>>> 1 1 10-Dec-12
>>> 2 1 11-Dec-12
>>> 3 2 13-Dec-12
>>> 4 2 15-Dec-12
>>> 5 3 06-Dec-12
>>> 6 3 19-Dec-12
>>>
>>> Now, I have time interval,
>>>
>>> week 1: from 9-Dec-12 to 15-Dec-12,
>>>
>>> week 2: from 16-Dec-12 to 22-Dec-12, and so on.
>>>
>>> Obviously, the 1st, 2nd, 3rd, 4th row falls to week 1, 5th rows should not
>>> be counted, 6th row falls into week2.
>>>
>>> Therefore, by GroupID, I will have
>>>
>>> GroupID=1, Week1=2, Week2=0
>>> GroupID=2, Week1=2, Week2=0
>>> GroupID=3, Week1=0, Week2=1.
>>>
>>> I just want to count the valid date that falls into a 7-day week interval,
>>> and I shall have new variables for EACH WEEK, and the counts for dates that
>>> fall into this week interval.
>>>
>>> Can anyone please help me on programming this?
>>>
>>> W
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
>
>--
>
>--Sent via Gmail
>
>Weijia Wang
>Division of Epidemiology
>
>Department of Population Health, School of Medicine
>New York University, NY, 10016
>
--
--Sent via Gmail
Weijia Wang
Division of Epidemiology
Department of Population Health, School of Medicine
New York University, NY, 10016
More information about the R-help
mailing list