[R] calcul of the mean in a period of time

arun smartpink111 at yahoo.com
Wed May 22 15:57:48 CEST 2013


Hi,
I guess you meant this:


dat2<- read.table(text="
patient_id      t         scores
1                      0                1.6
1                      1                2.6
1                      2                 2.2
1                      3                 1.8
2                      0                  2.3
2                       2                 2.5
2                      4                  2.6
2                       5                 1.5
3                       0                 1.2
4                       0                 1.3
4                       1                 1.8
",sep="",header=TRUE)

library(plyr)
 dat2New<-ddply(dat2,.(patient_id),summarize,t=seq(min(t),max(t)))
 res<-join(dat2New,dat2,type="full")

lst1 <- lapply(split(res, res$patient_id), function(x) {
  x0 <- x[x$t == 0, ]  # baseline row (t = 0)
  x1 <- x[x$t != 0, ]  # follow-up rows
  # patient with only the t = 0 row: keep that score as the period-1 mean
  if (nrow(x1) == 0) return(x0)
  # periods of 3 time units: t = 1..3 -> 1, t = 4..6 -> 2, ...
  do.call(rbind, lapply(split(x1, ((x1$t - 1) %/% 3) + 1), function(y) {
    # the first period also includes the t = 0 row
    y1 <- if (any(y$t == 1)) rbind(x0, y) else y
    data.frame(patient_id = unique(y1$patient_id), t = head(y1$t, 1),
               scores = mean(y1$scores, na.rm = TRUE))
  }))
})
res1 <- do.call(rbind, lst1)
row.names(res1) <- 1:nrow(res1)
res2 <- res1[, -2]  # drop the t column
res2$period <- with(res2, ave(patient_id, patient_id, FUN = seq_along))
res2
# patient_id scores period
#1          1   2.05      1
#2          2   2.40      1
#3          2   2.05      2
#4          3   1.20      1
#5          4   1.55      1
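
For comparison, the same period means can be sketched in base R without plyr. This is an alternative, not the method above: the `period` column and the name `res_base` are introduced here only for illustration, and the sketch assumes the same rule as the code above (periods of 3 time units, with t = 0 counted in period 1). Unlike the plyr version, it numbers periods by their position on the time axis rather than by order of appearance, and it drops a period in which a patient has no scores instead of returning NaN.

```r
# Same example data as above, built directly as a data frame
dat2 <- data.frame(
  patient_id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 4, 4),
  t          = c(0, 1, 2, 3, 0, 2, 4, 5, 0, 0, 1),
  scores     = c(1.6, 2.6, 2.2, 1.8, 2.3, 2.5, 2.6, 1.5, 1.2, 1.3, 1.8)
)

# Period index: t = 1..3 -> 1, t = 4..6 -> 2, ...
# (0 - 1) %/% 3 + 1 is 0, so pmax() moves t = 0 into period 1
dat2$period <- pmax((dat2$t - 1) %/% 3 + 1, 1)

# Mean score per patient and period
res_base <- aggregate(scores ~ patient_id + period, data = dat2, FUN = mean)
res_base <- res_base[order(res_base$patient_id, res_base$period), ]
row.names(res_base) <- NULL
res_base
```

On this data the five means should match the table above (2.05, 2.40, 2.05, 1.20, 1.55), and patient 3 keeps the t = 0 score as the period-1 average without any special handling.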
A.K.

________________________________
From: GUANGUAN LUO <guanguanluo at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Wednesday, May 22, 2013 5:42 AM
Subject: calcul of the mean in a period of time



Hello AK, this is the code which you wrote:

dat2<- read.table(text="

patient_id      t         scores
1                      0                1.6
1                      1                2.6
1                      2                 2.2
1                      3                 1.8
2                      0                  2.3
2                       2                 2.5
2                      4                  2.6
2                       5                 1.5
",sep="",header=TRUE)

library(plyr)
 dat2New<-ddply(dat2,.(patient_id),summarize,t=seq(min(t),max(t)))
 res<-join(dat2New,dat2,type="full")
res1<-do.call(rbind,lapply(split(res,res$patient_id),function(x) {x1<-x[x$t!=0,];do.call(rbind,lapply(split(x1,((x1$t-1)%/%3)+1),function(y) {y1<-if(any(y$t==1)) rbind(x[x$t==0,],y) else y; data.frame(patient_id=unique(y1$patient_id),scores=mean(y1$scores,na.rm=TRUE))}) ) }))
 row.names(res1)<-1:nrow(res1)
res1$period<-with(res1,ave(patient_id,patient_id,FUN=seq))
 res1
#  patient_id scores period
#1          1   2.05      1
#2          2   2.40      1
#3          2   2.05      2


For the same problem: in the code you wrote, you selected only the rows with t != 0 (x[x$t!=0,]). If some patients have only one observation, at t=0, can I change the code a little so that the information at t=0 is retained?
That is, when a patient has only one score, I would regard the score at t=0 as the average of period 1 for that patient.
Thank you so much for your help. I have never done any programming before, so I really don't understand much of it.
You are really helpful. Thank you so much.

GG


