[R] Outlier Problem in Survreg Function
Vipul Agarwal
iitkvipul at gmail.com
Sun Jul 25 08:24:40 CEST 2010
Hi Everyone,
I have recently started using R and am working on survival analysis using the
function survreg.
I am facing a strange problem. One of the covariates in my analysis has
outliers, because of which survreg is giving incorrect results. However, when
I remove the outliers or scale down the values of the covariate by a
factor of 2, it gives correct results. Below is the distribution of the
variable and the results:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0   30000   54500   95450  123000 1650000
Survreg Results
survreg(formula = Surv(TIME_TO_FAILURE, CENSOR_DEFAULT) ~ ADVANCE,
data = data)
Coefficients:
(Intercept)     ADVANCE
   0.000000   -6.385336
Scale= 0.9785933
Loglik(model)= -40227366 Loglik(intercept only)= -914141
Chisq= -78626451 on 1 degrees of freedom, p= 1
n=198099 (885 observations deleted due to missingness)
Survreg Results after scaling down the variable by 10
survreg(formula = Surv(TIME_TO_FAILURE, CENSOR_DEFAULT) ~ ADVANCE_SCALED,
data = data)
Coefficients:
   (Intercept) ADVANCE_SCALED
  4.132962e+00  -2.181577e-05
Scale= 0.9428758
Loglik(model)= -909139.4 Loglik(intercept only)= -914141
Chisq= 10003.19 on 1 degrees of freedom, p= 0
n=198099 (885 observations deleted due to missingness)
Survreg Results after removing the outliers (5% of the obs)
> data <- subset(data, ADVANCE <= 200000)
> survreg(Surv(TIME_TO_FAILURE,CENSOR_DEFAULT) ~ ADVANCE , data = data )
Call:
survreg(formula = Surv(TIME_TO_FAILURE, CENSOR_DEFAULT) ~ ADVANCE,
data = data)
Coefficients:
  (Intercept)       ADVANCE
 4.224298e+00 -3.727171e-06
Scale= 0.9601186
Loglik(model)= -822521.9 Loglik(intercept only)= -825137.1
Chisq= 5230.49 on 1 degrees of freedom, p= 0
n=177332 (444 observations deleted due to missingness)
Please let me know if anyone else has faced the same problem and how best
to deal with it. Should I scale down the variable or remove
the outliers?
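
To make the scaling option concrete, here is a minimal sketch on simulated
data. Everything in it is a hypothetical stand-in: the data frame sim, the
simulated covariate and failure times, and the divisor 1e5 are assumptions
chosen for illustration, not my real data or the factor used above. The idea
is that dividing a covariate by a constant only changes the units of its
coefficient, so the per-unit effect can be recovered by dividing the fitted
coefficient by the same constant.

```r
library(survival)

set.seed(1)
n <- 2000

# Hypothetical right-skewed covariate standing in for ADVANCE
advance <- rlnorm(n, meanlog = 11, sdlog = 1)

# Simulated Weibull failure times whose scale decreases with advance
true_time <- rweibull(n, shape = 1, scale = exp(4 - 3e-6 * advance))

# Independent random censoring; status is 1 for an observed failure
cens_time <- rexp(n, rate = 1 / 100)
sim <- data.frame(
  time   = pmin(true_time, cens_time),
  status = as.integer(true_time <= cens_time),
  # Rescale so the covariate is of order 1 (divisor is an assumption)
  advance_scaled = advance / 1e5
)

# Fit the default (Weibull) model on the rescaled covariate
fit_scaled <- survreg(Surv(time, status) ~ advance_scaled, data = sim)
print(fit_scaled)

# Convert the coefficient back to the original units of advance
coef_per_unit <- coef(fit_scaled)[["advance_scaled"]] / 1e5
print(coef_per_unit)
```

Unlike dropping the large values, rescaling keeps every observation in the
fit, so the two approaches answer slightly different questions.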