[R] data recoding problem

Dimitris Rizopoulos dimitris.rizopoulos at med.kuleuven.be
Mon Apr 23 11:11:31 CEST 2007


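One option is the following. It assumes that the rows of 'tox' are
sorted by increasing t within each id, so that the first row with
event == 1 is the earliest event and the last row is the longest
follow-up:

# split by subject, keep one row per subject, then re-stack the rows
do.call(rbind, lapply(split(tox, tox$id), function (x) {
    if (any(ind <- x$event == 1))
        x[which(ind)[1], ]   # first row with an event
    else
        x[nrow(x), ]         # no event: last row for this subject
}))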
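
If the rows might not be ordered by t within each id, a variant that
selects on t directly could look like the sketch below. The data frame
is rebuilt from the two cases shown in the question; the names
first_event_row and first_tox are made up for this illustration, and
the code assumes there are no missing values in t or event.

```r
# Example data copied from the two cases in the original question.
tox <- data.frame(
    id = rep(c("PMC011", "PMC016"), each = 9),
    t = c(0.000, 3.154, 5.914, 12.353, 18.103, 24.312, 30.029, 47.967, 96.953,
          0.000, 3.943, 5.782, 11.762, 17.741, 23.951, 28.353, 44.747, 89.692),
    event = c(0, 0, 0, 0, 1, 0, 0, 0, 0, rep(0, 9))
)

# Hypothetical helper (not from the thread): returns one row per subject
# without relying on the order of the rows.
# Assumes t and event contain no NA values.
first_event_row <- function(x) {
    has_event <- x$event == 1
    if (any(has_event)) {
        ev <- x[has_event, , drop = FALSE]
        ev[which.min(ev$t), , drop = FALSE]   # earliest time with an event
    } else {
        x[which.max(x$t), , drop = FALSE]     # no event: latest follow-up time
    }
}

# Split by subject, reduce each piece to one row, and stack the results.
first_tox <- do.call(rbind, lapply(split(tox, tox$id), first_event_row))
first_tox
```

On the example data this should give the two rows asked for in the
question (t = 18.103 with event 1 for PMC011, t = 89.692 with event 0
for PMC016). For the Kaplan-Meier curve, the result could then be
passed to something like survfit(Surv(t, event) ~ 1, data = first_tox)
from the survival package.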


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Williams Scott" <Scott.Williams at petermac.org>
To: <r-help at stat.math.ethz.ch>
Sent: Monday, April 23, 2007 10:14 AM
Subject: [R] data recoding problem


> Hi R experts,
>
> I have a data recoding problem I can't get my head around - I am not
> that great at the subsetting syntax. I have a dataset of longitudinal
> toxicity data (for multistate modelling) for which I also want to do
> a simple Kaplan-Meier curve of the time to first toxic event.
>
> The data for 2 cases presently looks like this (one with an event,
> the other without), with id representing each person on study, and
> follow-up time and status:
>
>
>> tox
>
> id      t       event
>
> PMC011  0.000     0
> PMC011  3.154     0
> PMC011  5.914     0
> PMC011 12.353     0
> PMC011 18.103     1
> PMC011 24.312     0
> PMC011 30.029     0
> PMC011 47.967     0
> PMC011 96.953     0
> PMC016  0.000     0
> PMC016  3.943     0
> PMC016  5.782     0
> PMC016 11.762     0
> PMC016 17.741     0
> PMC016 23.951     0
> PMC016 28.353     0
> PMC016 44.747     0
> PMC016 89.692     0
>
> So what I need is an output in the same column format, containing
> each of the unique values of id:
>
> PMC011 18.103     1
> PMC016 89.692     0
>
> In my head, I would do this by looking at each unique value of id
> (each unique case) and looking down the event data of each of these
> cases. If there is no event (event==0), I would go to the time
> column (t), find the max value, and paste this time along with a 0
> for event. If there were an event, I would then need to find the
> minimum time associated with an event to paste across with the event
> marker. I am sure someone out there can point me in the right
> direction to do this without tedious and slow loops. Any help
> greatly appreciated.
>
> Cheers
>
> Scott
> _____________________________
>
> Dr. Scott Williams
> MBBS BScMed FRANZCR
> Radiation Oncologist
> Peter MacCallum Cancer Centre
> Melbourne, Australia
> ph +61 3 9656 1111
> fax +61 3 9656 1424
> scott.williams at petermac.org




