[R] plotting and coloring longitudinal data with three time points (ggplot2)

Eric Fail eric.fail at gmx.us
Thu Dec 8 05:07:54 CET 2011


Thank you for solving my problem, it worked out beautifully.

This was exactly what I was looking for, the ggplot2 package keeps
impressing me.

Thanks,
Eric

On Wed, Dec 7, 2011 at 6:01 AM, Hadley Wickham <hadley at rice.edu> wrote:
> On Wed, Dec 7, 2011 at 4:02 AM, Eric Fail <eric.fail at gmx.us> wrote:
>>  Dear list,
>>
>> I have been struggling with this for some time now, and for the last hour I have been trying to make a working example for the list. I hope someone out there has some experience with plotting longitudinal data that they will share.
>>
>> My data is patient data with three different time stamps. First, the patients are identified at different times (time stamp 1). Second, they go through an assessment phase and begin their treatment (time stamp 2). Finally, they are discharged from the hospital at some point (time stamp 3).
>>
>> I would like to make a spaghetti plot with the assessment phase in one color and the treatment phase in another color.
>>
>> I used ggplot2, and with this example data and only two time points it works fine (I call it my working example):
>>
>> library(ggplot2)
>> df <- data.frame(
>>   date = seq(Sys.Date(), len=104, by="1 day")[sample(104, 52)],
>>    patient = factor(rep(1:26, 2), labels = LETTERS)
>>  )
>> df <- df[order(df$date), ]
>> dt <- qplot(date, patient, data=df, geom="line")
>> dt + scale_x_date()
>> df[ which(df$patient=='E'), c("patient", "date")]
>>
>> But if I have three time points, R, for some reason I do not yet understand, adds the two second time points in some funny way.
>>
>> Finally, once that is solved, how do I colour the different parts of the line so the assessment phase gets one colour and the treatment phase another?
>>
>> I want to be able to show how long we have been in contact with our patients, how much of the contact time was assessment, and how much was actual treatment.
>>
>> Below is an example (I call it the not-working example):
>>
>> df2 <- data.frame(
>>   date2 = seq(Sys.Date(), len= 156, by="2 day")[sample(156, 78)],
>>   patient2 = factor(rep(1:26, 3), labels = LETTERS)
>>  )
>>
>> df2 <- df2[order(df2$date2), ]
>> dt2 <- qplot(date2, patient2, data=df2, geom="line")
>> dt2 + scale_x_date(major="months", minor="weeks")
>> df2[ which(df2$patient2=='B'), c("patient2", "date2")]
>
> Did you mean something like this?
>
> library(ggplot2)
> library(plyr)
>
> df2 <- data.frame(
>  date2 = seq(Sys.Date(), len= 156, by="2 day")[sample(156, 78)],
>  patient2 = factor(rep(1:26, 3), labels = LETTERS)
> )
>
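> # rank() gives each row its visit number (1, 2, 3) within the patient;
> # order() would return sort indices instead, which are not visit numbers
> df2 <- ddply(df2, "patient2", mutate, visit = rank(date2))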
>
> qplot(date2, patient2, data = df2, geom = "line") +
>  geom_point(aes(colour = factor(visit)))
>
> # or this?
>
> library(ggplot2)
> library(plyr)
>
> df2 <- data.frame(
>  date2 = seq(Sys.Date(), len= 156, by="2 day")[sample(156, 78)],
>  patient2 = factor(rep(1:26, 3), labels = LETTERS)
> )
>
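> # rank() gives each row its visit number (1, 2, 3) within the patient;
> # order() would return sort indices instead, which are not visit numbers
> df2 <- ddply(df2, "patient2", mutate, visit = rank(date2))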
>
> qplot(date2, patient2, data = df2, geom = "line", colour =
> factor(visit), group = patient2)
>
> # Obviously the lines are drawn between the observations, so you only
> # see the first two visits.
>
> Hadley
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
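
A note for readers on a current ggplot2 release: the two-colour spaghetti plot asked about above can also be sketched with one line segment per phase instead of one line per patient. This is only a rough sketch under stated assumptions, not the method used in the thread. It assumes that visit 1 to visit 2 is the assessment phase and visit 2 to visit 3 is the treatment phase, and the names phases, starts, ends and total_days are made up for illustration. It uses base R for the reshaping, so plyr is not needed, and the date_breaks / date_minor_breaks arguments of scale_x_date() as they exist in recent ggplot2 versions.

```r
library(ggplot2)

set.seed(1)  # reproducible example data

# Same shape as df2 in the thread: three random dates per patient
df2 <- data.frame(
  date2    = seq(Sys.Date(), length.out = 156, by = "2 days")[sample(156, 78)],
  patient2 = factor(rep(1:26, 3), labels = LETTERS)
)

# Sort by patient, then date, and number each patient's visits 1, 2, 3
df2 <- df2[order(df2$patient2, df2$date2), ]
df2$visit <- ave(as.numeric(df2$date2), df2$patient2, FUN = seq_along)

# One row per phase (assumption: visit 1 -> 2 is assessment,
# visit 2 -> 3 is treatment). Because df2 is sorted, the rows of
# 'starts' and 'ends' line up pairwise within each patient.
starts <- df2[df2$visit < 3, ]
ends   <- df2[df2$visit > 1, ]
phases <- data.frame(
  patient = starts$patient2,
  phase   = factor(starts$visit, levels = 1:2,
                   labels = c("assessment", "treatment")),
  start   = starts$date2,
  end     = ends$date2
)

# Length of each phase in days, and total contact time per patient
phases$days <- as.numeric(phases$end - phases$start)
total_days  <- tapply(phases$days, phases$patient, sum)

# One coloured horizontal segment per phase and patient
p <- ggplot(phases, aes(x = start, xend = end,
                        y = patient, yend = patient,
                        colour = phase)) +
  geom_segment() +
  scale_x_date(date_breaks = "1 month", date_minor_breaks = "1 week") +
  labs(x = "date", y = "patient")
print(p)
```

Since each phase is its own row, the phases data frame also answers the question of how much contact time was assessment and how much was treatment, for example with aggregate(days ~ phase, data = phases, FUN = sum).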


