[R] AFTREG with ID argument

Philipp Rappold philipp.rappold at gmail.com
Thu Feb 18 18:09:06 CET 2010


Göran, no worries - your help & advice is already invaluable!

Göran Broström wrote:
> 2010/2/18 Philipp Rappold <philipp.rappold at gmail.com>:
>> Göran, David,
>>
>> in order to adapt aftreg to my needs I wrote a little function that I would
>> like to share with you and the community.
> 
> I once promised to fix this 'asap'. Now I promise to do it tonight. OK?
> 
> Göran
> 
>>
>> WHAT DOES IT FIX?
>>
>> (1) Using the id-argument in combination with missing values on covariates
>> wasn't easily possible before because the id-dataframe and the
>> data-dataframe had different sizes and aftreg quit with an error. My fix
>> makes sure that NAs are excluded from both dataframes and aftreg will run
>> without error here.
>>
>> (2) The id-argument was required to be specified by its "absolute path" (e.g.
>> id=testdata$groupvar, see below in this thread). My adapted function takes
>> the name of the id-variable as a string, e.g. id="groupvar".
>>
>>
>> HOW DOES IT WORK?
>>
>> Use function aftreg2 just like you would use aftreg. Mandatory arguments
>> are: formula, data and id, where id is a character string giving the name
>> of the id column in data. Example:
>>
>>> testdata
>>  start stop censor groupvar      var1
>> 1     0    1      0        1 0.1284928
>> 2     1    2      0        1 0.4896125
>> 3     2    3      0        1 0.7012899
>> 4     3    4      0        1        NA
>> 5     0    1      0        2 0.7964361
>> 6     1    2      0        2 0.8466039
>> 7     2    3      1        2 0.2234271
>>
>> model1 <- aftreg(Surv(start, stop, censor)~var1, data=testdata, id=groupvar)
>>> ERROR.
>> model2 <- aftreg2(Surv(start, stop, censor)~var1, data=testdata,
>> id="groupvar")
>>> WORKS FINE.
>>
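To make point (1) concrete, here is a minimal base-R sketch of the length mismatch. It assumes that the fitting function drops incomplete rows via na.action = na.omit while the id vector is still taken from the full data. model.frame() with a plain formula is only a stand-in for what aftreg does internally; that is an assumption, not something taken from the eha source.

```r
# Toy data copied from the example in this thread
testdata <- data.frame(
  start    = c(0, 1, 2, 3, 0, 1, 2),
  stop     = c(1, 2, 3, 4, 1, 2, 3),
  censor   = c(0, 0, 0, 0, 0, 0, 1),
  groupvar = c(1, 1, 1, 1, 2, 2, 2),
  var1     = c(0.1284928, 0.4896125, 0.7012899, NA,
               0.7964361, 0.8466039, 0.2234271)
)

# Stand-in for the internal model frame (assumption): with na.omit,
# the row with the missing var1 is dropped.
mf <- model.frame(censor ~ var1, data = testdata, na.action = na.omit)

n_model <- nrow(mf)                   # rows left after NA removal
n_id    <- length(testdata$groupvar)  # id vector taken from the full data

# Filtering the data first keeps the data and the id vector in step
keep     <- complete.cases(testdata[c("var1", "groupvar")])
filtered <- testdata[keep, ]
```

n_model and n_id differ by one here, which is the kind of mismatch that order(id, Y[, 1]) complains about; after filtering, the id column and the data have the same number of rows.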
>> PREREQUISITES:
>>
>> (1) Make sure that missing values are only present at the end of a lifetime.
>> The regression will yield false results if you have missing covariate data
>> in the middle of a lifetime. For instance: known covariates from lifetime
>> 0-10, 13-20, but not from 11-12. (Göran: Please correct me if I'm wrong
>> here!).
>>
>> (2) If you have missing covariate data at the beginning of a lifetime (e.g.
>> missing from 0-5, but present from 6-censoring), this fix will yield false
>> results if one _cannot_ assume that the missing covariates were the same
>> from 0-5 as they were at 6. (Göran: Please correct me again if I'm wrong
>> here with my interpretation, but that's basically what you said before)
>>
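The two prerequisites can be checked before fitting. The following is a rough sketch only: na_only_at_end is a hypothetical helper (not part of eha), and it assumes the rows of each subject are already sorted by start time.

```r
# Hypothetical helper: TRUE if, within each id, missing covariate values
# occur only in the final rows (rows assumed sorted by start time).
na_only_at_end <- function(data, id, covariate) {
  trailing_only <- function(x) {
    miss <- as.integer(is.na(x))
    # once a value is missing, every later value must be missing too
    !any(diff(miss) < 0)
  }
  all(tapply(data[[covariate]], data[[id]], trailing_only))
}

# Missing value at the end of subject 1: acceptable
ok_data <- data.frame(groupvar = c(1, 1, 1, 2, 2),
                      var1     = c(0.1, 0.5, NA, 0.8, 0.2))

# Missing value in the middle of subject 1: violates prerequisite (1)
gap_data <- data.frame(groupvar = c(1, 1, 1, 2, 2),
                       var1     = c(0.1, NA, 0.7, 0.8, 0.2))

# Missing value at the start of subject 1: the case in prerequisite (2)
lead_data <- data.frame(groupvar = c(1, 1, 1, 2, 2),
                        var1     = c(NA, 0.5, 0.7, 0.8, 0.2))

res_ok   <- na_only_at_end(ok_data,   "groupvar", "var1")
res_gap  <- na_only_at_end(gap_data,  "groupvar", "var1")
res_lead <- na_only_at_end(lead_data, "groupvar", "var1")
```

The helper flags both gaps in the middle and missing values at the beginning, so a FALSE result only says that the data need a closer look, not which prerequisite is violated.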
>>
>> LISTING:
>>
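>> aftreg2 <- function(formula, data, id, ...){
>>
>>        # Keep the original call so it can be shown to the user
>>        call <- match.call()
>>
>>        # Labels of the right-hand-side terms (assumed to be plain column names)
>>        non_na_cols <- attr(terms(formula), "term.labels")
>>
>>        # Drop rows with NA in any covariate or in the id column
>>        data <- data[complete.cases(data[non_na_cols]),]
>>        data <- data[complete.cases(data[id]),]
>>
>>        cat("Original Call: ")
>>        print(call)
>>
>>        # Take id from the filtered data so that the lengths match
>>        return(aftreg(formula=formula, data=data, id=data[,id], ...))
>> }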
>>
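One caveat about the listing: it derives the covariate names from the formula's term labels, which only works when every term is a plain column name. A possible alternative, sketched here with made-up variables y, x1 and x2, is all.vars(), which returns the underlying variable names. Note that it also returns the response variables, so rows with NA in those are dropped as well; whether that is wanted is a judgment call.

```r
# Made-up formula with a transformed term
f <- y ~ log(x1) + x2

term_labels <- attr(terms(f), "term.labels")  # labels, not column names
var_names   <- all.vars(f)                    # underlying variable names

d <- data.frame(y = c(1, 2, 3), x1 = c(1, NA, 3), x2 = c(4, 5, 6))

# Only names that really are columns can be used for indexing
usable     <- intersect(var_names, names(d))
d_complete <- d[complete.cases(d[usable]), ]
```

Here term_labels contains "log(x1)", which is not a column of d, whereas var_names contains "x1" and can be used directly with complete.cases().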
>>
>> Hope someone finds this interesting.
>>
>> All the best
>> Philipp
>>
>>
>> David Winsemius wrote:
>>> On Feb 11, 2010, at 5:58 AM, Philipp Rappold wrote:
>>>
>>>> Göran, thanks!
>>>>
>>>> One more thing that I found: As soon as you have at least one NA in the
>>>> independent vars, the trick that you mentioned does not work anymore.
>>>> Example:
>>>>
>>>>> testdata
>>>>  start stop censor groupvar      var1
>>>> 1     0    1      0        1 0.1284928
>>>> 2     1    2      0        1 0.4896125
>>>> 3     2    3      0        1 0.7012899
>>>> 4     3    4      0        1        NA
>>>> 5     0    1      0        2 0.7964361
>>>> 6     1    2      0        2 0.8466039
>>>> 7     2    3      1        2 0.2234271
>>>>
>>>>> aftreg(Surv(start, stop, censor)~var1, data=testdata,
>>>>> id=testdata$groupvar)
>>>> Error in order(id, Y[, 1]) : Different length of arguments (* I
>>>> translated this from the German Output *)
>>>>
>>>> Do you think there is a simple hack which excludes all subjects that have
>>>> at least one NA in their independent vars? If it was only one independent
>>>> var it would probably be easy by just using subset, but I have lots of
>>>> different combinations of vars that I'd like to test ;)
>>>>
>>> I don't know if it's a "hack", but there is a set of functions that
>>> perform such subsetting:
>>>
>>> ?na.omit
>>>
>>> There is a parameter that would accomplish that goal inside aftreg. You
>>> may want to check what your defaults are for na.action.
>>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>


