[R] calculating "treatment effects" (differences) in a data frame?
derek eder
derek.eder at lungall.gu.se
Mon May 24 22:28:56 CEST 2010
I am trying to calculate the treatment effect for individual subjects
("ID"), i.e. the difference in a measurement ("score") between 2
time-points ("visit") (see example below).
The data is in an unbalanced data.frame in "long" format with some
missing data.
I suspect that I am overlooking a very simple function, something along
the lines of
tapply().
Thank you for your attention!
Derek Eder
## Examples:
myData = data.frame(
  ID    = c("a","a","b","c","c","d","d"),
  visit = c(1,2,1,1,2,1,2),
  score = c(10,2,12,16,0,NA,5)
)
> myData
  ID visit score
1  a     1    10
2  a     2     2
3  b     1    12
4  c     1    16
5  c     2     0
6  d     1    NA
7  d     2     5
# The desired result is a vector of score differences between visits, by ID
#  a  b  c  d
#  8 NA 16 NA
## solutions ?
# This works, but the returned data frame is awkward for me
# because the "empty cells" (b and d) contain integer(0)
# and not the more familiar NA.
> aggregate(data=myData, score~ID,FUN=diff)
  ID score
1  a    -8
2  b
3  c   -16
4  d
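
One possible way to get NA in those cells, sketched here under assumptions rather than offered as the definitive answer: keep the NA rows with na.action = na.pass and guard diff() with a small helper. The helper name diff_or_na is hypothetical, and the sketch assumes rows are already ordered by visit within each ID. Like diff(), it returns visit 2 minus visit 1.

```r
myData <- data.frame(
  ID    = c("a","a","b","c","c","d","d"),
  visit = c(1,2,1,1,2,1,2),
  score = c(10,2,12,16,0,NA,5)
)

# Hypothetical helper: diff() when a subject has exactly two rows, NA otherwise
diff_or_na <- function(x) if (length(x) == 2) diff(x) else NA_real_

# na.pass keeps the row with the missing score, so subject "d" still has
# two rows and diff() propagates the NA instead of seeing a single value
res <- aggregate(score ~ ID, data = myData, FUN = diff_or_na,
                 na.action = na.pass)
print(res)
```

Because every group now returns a single number, the score column should come back as an ordinary numeric vector rather than a list of mixed lengths.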
# This works as desired ... but somehow seems unnecessarily complicated
> reshape(data=myData,timevar="visit",idvar="ID", direction="wide")
  ID score.1 score.2
1  a      10       2
3  b      12      NA
4  c      16       0
6  d      NA       5
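
Once the data is in wide form, the two score columns can also be subtracted directly, since arithmetic on vectors propagates NA. A minimal sketch, assuming the column names score.1 and score.2 shown above; the names wide and effect are illustrative only.

```r
myData <- data.frame(
  ID    = c("a","a","b","c","c","d","d"),
  visit = c(1,2,1,1,2,1,2),
  score = c(10,2,12,16,0,NA,5)
)

wide <- reshape(myData, timevar = "visit", idvar = "ID", direction = "wide")

# Visit 1 minus visit 2, named by subject; a missing score on either
# side of the subtraction gives NA for that subject
effect <- setNames(wide$score.1 - wide$score.2, wide$ID)
print(effect)
```

This gives visit 1 minus visit 2, the sign used in the desired result; swapping the two columns reproduces the sign that diff() returns.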
> apply(X = reshape(data=myData,timevar="visit",idvar="ID",
direction="wide")[,-1],
MARGIN = 1, FUN = diff)
  1   3   4   6
 -8  NA -16  NA
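
A shorter route in the spirit of tapply() might be to split the data frame by ID and look up each visit explicitly, so that a missing visit yields NA and row order does not matter. This is only a sketch: the helper name visit_diff is hypothetical, and it assumes the visits are coded 1 and 2 as in the example.

```r
myData <- data.frame(
  ID    = c("a","a","b","c","c","d","d"),
  visit = c(1,2,1,1,2,1,2),
  score = c(10,2,12,16,0,NA,5)
)

# Hypothetical helper: score at visit 1 minus score at visit 2 for one subject.
# match() returns NA when a visit is absent, and indexing by NA gives NA.
visit_diff <- function(d) {
  d$score[match(1, d$visit)] - d$score[match(2, d$visit)]
}

effect <- sapply(split(myData, myData$ID), visit_diff)
print(effect)
```

The result is a named numeric vector with one element per ID. Note that it is visit 1 minus visit 2, which matches the sign of the desired result above; diff() gives the opposite sign.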