[R] help with handling replicates before reshaping data
hadley wickham
h.wickham at gmail.com
Fri Jul 13 20:25:54 CEST 2007
Hi Tom,
> I have a dataset that consists of duplicated sequences within day for each patient (see data below) and I want to reshape the data with patient as the time variable. However, the reshape function only takes the first sequence of the replicates and ignores the second. How can I 1) average the duplicates and 2) give the duplicated sequences unique names before reshaping the data?
>
> > data
>   patient day  seq           y
> 1      10   1 acdf -0.52416066
> 2      10   1 cdsv  0.62551539
> 3      10   1 dlfg -1.54668047
> 4      10   1 acdf  0.82404978
> 5      10   1 cdsv -1.17459914
> 6      10   2 acdf  0.47238216
You might find that the functions in the reshape package give you a bit
more flexibility.
library(reshape)

# The reshape package expects the value variable
# to be named "value"
d2 <- rename(data, c("y" = "value"))
# I think this is the format you want, which will average over the reps
cast(d2, day + seq ~ patient, mean)
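
If you would rather stay in base R, something along these lines might also work. This is only a minimal sketch, not the reshape-package approach above: it rebuilds the six rows from your example as a toy data frame, and the column names rep and seq_id are made up for illustration.

```r
# Toy data mirroring the six rows shown in the question
data <- data.frame(
  patient = 10,
  day     = c(1, 1, 1, 1, 1, 2),
  seq     = c("acdf", "cdsv", "dlfg", "acdf", "cdsv", "acdf"),
  y       = c(-0.52416066, 0.62551539, -1.54668047,
              0.82404978, -1.17459914, 0.47238216)
)

# 1) Average the replicates within each patient/day/seq combination
avg <- aggregate(y ~ patient + day + seq, data = data, FUN = mean)

# Reshape to wide: one row per day/seq, one y column per patient
wide <- reshape(avg, idvar = c("day", "seq"), timevar = "patient",
                direction = "wide")

# 2) Alternatively, number the replicates so each sequence name is unique
#    (rep and seq_id are hypothetical column names)
data$rep <- ave(seq_along(data$y), data$patient, data$day, data$seq,
                FUN = seq_along)
data$seq_id <- paste(data$seq, data$rep, sep = ".")
```

With the replicates either averaged (avg) or given unique names (seq_id), each patient/day/sequence combination occurs only once, so reshape() no longer has to drop anything.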
Hadley