[R] data recoding problem
Williams Scott
Scott.Williams at petermac.org
Mon Apr 23 10:14:20 CEST 2007
Hi R experts,
I have a data recoding problem I can't get my head around - I am not that
great at the subsetting syntax. I have a dataset of longitudinal
toxicity data (for multistate modelling) for which I also want
to do a simple Kaplan-Meier curve of the time to first toxic event.
The data for 2 cases presently look like this (one with an event, the
other without), with id representing each person on study, t the
follow-up time, and event the status:
> tox
id      t       event
PMC011   0.000  0
PMC011   3.154  0
PMC011   5.914  0
PMC011  12.353  0
PMC011  18.103  1
PMC011  24.312  0
PMC011  30.029  0
PMC011  47.967  0
PMC011  96.953  0
PMC016   0.000  0
PMC016   3.943  0
PMC016   5.782  0
PMC016  11.762  0
PMC016  17.741  0
PMC016  23.951  0
PMC016  28.353  0
PMC016  44.747  0
PMC016  89.692  0
So what I need is an output in the same column format, containing one
row for each of the unique values of id:
PMC011 18.103 1
PMC016 89.692 0
In my head, I would do this by looking at each unique value of id (each
unique case) and looking down the event data for that case. If there
is no event (every event==0), I would go to the time column (t), find
the max value, and paste this time along with a 0 for event. If there
were an event, I would instead need to find the minimum time associated
with an event and paste that across with the event marker. I am sure someone
out there can point me in the right direction to do this without tedious
and slow loops. Any help greatly appreciated.
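To make the rule above concrete, here is a minimal sketch in base R of one way it might be written, not necessarily the best or fastest. It assumes the data sit in a data frame called tox with the columns id, t and event shown above. The helper name first_event and the result name first_tox are made up for illustration.

```r
# Example data copied from the two cases shown above
tox <- data.frame(
  id = rep(c("PMC011", "PMC016"), each = 9),
  t = c(0.000, 3.154, 5.914, 12.353, 18.103, 24.312, 30.029, 47.967, 96.953,
        0.000, 3.943, 5.782, 11.762, 17.741, 23.951, 28.353, 44.747, 89.692),
  event = c(0, 0, 0, 0, 1, 0, 0, 0, 0, rep(0, 9)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reduce one person's rows to a single row
first_event <- function(d) {
  if (any(d$event == 1)) {
    # At least one event: keep the earliest time at which event == 1
    data.frame(id = d$id[1], t = min(d$t[d$event == 1]), event = 1,
               stringsAsFactors = FALSE)
  } else {
    # No event: keep the last follow-up time, with status 0
    data.frame(id = d$id[1], t = max(d$t), event = 0,
               stringsAsFactors = FALSE)
  }
}

# Split by id, apply the helper to each piece, and stack the results
first_tox <- do.call(rbind, lapply(split(tox, tox$id), first_event))
rownames(first_tox) <- NULL
first_tox
```

On the two cases above this gives one row per id: PMC011 with t = 18.103 and event = 1, and PMC016 with t = 89.692 and event = 0. A data frame in this shape could then, presumably, go straight into survfit(Surv(t, event) ~ 1, data = first_tox) from the survival package for the Kaplan-Meier curve.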
Cheers
Scott
_____________________________
Dr. Scott Williams
MBBS BScMed FRANZCR
Radiation Oncologist
Peter MacCallum Cancer Centre
Melbourne, Australia
ph +61 3 9656 1111
fax +61 3 9656 1424
scott.williams at petermac.org