# [R] calculating "treatment effects" (differences) in a data frame?

derek eder derek.eder at lungall.gu.se
Mon May 24 22:28:56 CEST 2010

```I am trying to calculate the treatment effect for individual subjects
("ID"): the change in a "score" between 2 time-points ("visit")
(see example below).

The data is in an unbalanced data.frame in "long" format with some
missing data.

I suspect that I am overlooking a very simple function, something along
the lines of tapply().

Thank you for your attention!

Derek Eder

##  Examples:

myData = data.frame(
ID = c("a","a","b","c","c","d","d"),
visit=c(1,2,1,1,2,1,2),
score=c(10,2,12,16,0,NA,5)
)

> myData
ID visit score
1  a     1    10
2  a     2     2
3  b     1    12
4  c     1    16
5  c     2     0
6  d     1    NA
7  d     2     5

# The desired result is a vector of score differences (visit 1 minus visit 2) by ID
#  a  b  c  d
#  8  NA 16 NA

##  solutions ?

# This works, but the returned data frame is awkward for me
# because the "empty cells" (b and d) contain integer(0)
# and not the more familiar NA.

> aggregate(data=myData, score~ID,FUN=diff)
ID score
1  a    -8
2  b
3  c   -16
4  d

# This works as desired ... but somehow seems unnecessarily complicated

> reshape(data=myData,timevar="visit",idvar="ID", direction="wide")
ID score.1 score.2
1  a      10       2
3  b      12      NA
4  c      16       0
6  d      NA       5

> apply(X = reshape(data=myData,timevar="visit",idvar="ID",
direction="wide")[,-1],
MARGIN = 1, FUN = diff)

1   3   4   6
-8  NA -16  NA

```
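One possible base-R approach, offered as a sketch rather than a definitive answer: split the data frame by ID and compute the visit 1 minus visit 2 difference per subject, returning NA whenever a visit is absent or its score is missing. The helper name `visit_diff` is hypothetical, and the sketch assumes each subject has at most one row per visit.

```r
myData <- data.frame(
  ID    = c("a", "a", "b", "c", "c", "d", "d"),
  visit = c(1, 2, 1, 1, 2, 1, 2),
  score = c(10, 2, 12, 16, 0, NA, 5)
)

# Hypothetical helper (not part of base R): score at visit 1 minus
# score at visit 2 for one subject's rows. Returns NA when either
# visit is absent; a missing score propagates to NA by arithmetic.
visit_diff <- function(d) {
  s1 <- d$score[d$visit == 1]
  s2 <- d$score[d$visit == 2]
  if (length(s1) != 1 || length(s2) != 1) return(NA_real_)
  s1 - s2
}

# One data frame per subject, then one number per subject.
effects <- sapply(split(myData, myData$ID), visit_diff)
effects
# a = 8, b = NA, c = 16, d = NA
```

Because the helper looks rows up by the value of `visit` and not by position, it does not depend on the rows being sorted, and the result is a named vector keyed by ID instead of by row number.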