[R] Cox Proportional Hazard with missing covariate data

Arthur Allignol arthur.allignol at fdm.uni-freiburg.de
Tue May 5 14:51:21 CEST 2009

Hi,

In fact, you have left-truncated observations.

Which timescale do you use: is time 0 the study entry, or the
time at which the wear part was first put into use?

If it is the latter, you can specify the "age" of the wear part
at study entry as the start time in Surv(). For example, if a wear
part has been used for 5 years before study entry, and "dies" 2 years
after, the data will look like this:
start  stop  status
5      7     1
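
As a rough sketch of how this could look in R: the data below are simulated, and the column names, the number of yearly follow-ups and the effect size are all made up for illustration. Only Surv() and coxph() from the survival package are real. Each part gets one row per observed year, with start and stop measured on the age timescale, so the years before study entry (where the covariates are unknown) simply have no rows.

```r
library(survival)

# The single part from the example above: 5 years in use at study entry,
# failure 2 years later. Surv(start, stop, event) is the counting-process form.
one_part <- data.frame(start = 5, stop = 7, status = 1)
with(one_part, Surv(start, stop, status))

# Hypothetical simulated data: one row per part and observed year.
set.seed(1)
n_parts <- 200
n_years <- 6                                   # assumed number of yearly follow-ups
rows <- lapply(seq_len(n_parts), function(i) {
  entry_age <- sample(0:10, 1)                 # known age at study entry
  temp <- rnorm(n_years, mean = 20, sd = 5)    # yearly temperature, observed in the study only
  p_fail <- plogis(-2 + 0.1 * (temp - 20))     # assumed yearly failure probability
  failed <- rbinom(n_years, 1, p_fail)
  # Keep rows up to and including the first failure, or all years if none.
  last <- if (any(failed == 1)) which(failed == 1)[1] else n_years
  k <- seq_len(last)
  data.frame(id = i,
             start = entry_age + k - 1,        # age at the start of the year
             stop = entry_age + k,             # age at the end of the year
             status = failed[k],               # 1 only in the year of failure
             temp = temp[k])
})
parts <- do.call(rbind, rows)

# Left truncation and the time-varying covariate are both handled by the
# (start, stop] intervals.
fit <- coxph(Surv(start, stop, status) ~ temp, data = parts)
summary(fit)
```

A part only contributes to the risk sets from its age at study entry onwards, so the partial likelihood never asks for covariate values from before entry.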

Hope this helps,
Arthur Allignol

Philipp Rappold wrote:
> Dear friends,
>
> I have used R for some time now and have a tricky question about the coxph function. To sum it up, I am not sure whether I can use coxph with missing covariate data in a model with time-variant covariates. The point is: I know how "old" every piece that I observe is, but I do not have the full history of the corresponding covariates. Maybe you have some advice for me, although this problem might be only 70% R-related and 30% statistics-related. Here's a detailed explanation:
>
> SITUATION & OBJECTIVE:
> I want to analyze the effect of environmental conditions (i.e.
> temperature and humidity) on the lifetime of some wear-parts. The
> study should be conducted on a yearly basis, meaning that I have
> collected empirical data on every wear-part at the end of every year.
>
> DATA:
> I have collected the following data:
> - Status of the wear-part: Equals "0" if part is still alive, equals
> "1" if part has "died" (my event variable)
> - Environmental data: Temperature and humidity have been measured at
> each of the wear-parts on a yearly basis (because each wear-part is at
> a different location, I have different data for each wear-part)
>
> PROBLEM:
> I collected data between 2001 and 2007. In 2001, a large number
> of wear-parts were already in use. I DO KNOW for every
> part how long it has been used (even if it was put into use before 2001),
> but I DO NOT have any information about environmental conditions like
> temperature or humidity before 2001 (I call this semi-left-censored).
> Of course, one could argue that I should simply exclude these parts
> from my analysis, but I don't want to lose valuable information, also
> because the number of "new parts" that were put into use between 2001
> and 2007 is rather small.
>
> [...] lifetime distribution. Therefore I have to use a non-parametric
> model for estimation (most likely Cox).
>
> QUESTION:
> From an econometric perspective, is it possible to use the Cox
> proportional hazards model in this setting? As mentioned before, I have
> time-variant covariates for each wearpart, as well as what I call
> "semi-left-censored" data that I want to use. If not, what kind of
> analysis would you suggest?
>
> Thanks a lot for your great help, I really appreciate it.
>
> All the best
> Philipp
>
