# [R] survey package: weights used in svycoxph()

Thomas Lumley tlumley at u.washington.edu
Tue May 18 17:50:30 CEST 2010

On Mon, 17 May 2010, Vinh Nguyen wrote:

> Dear R-help,
>
> Let me know if I should email r-devel instead of this list.  This
> message is addressed to Professor Lumley or anyone familiar with the
> survey package.
>
> Does svycoxph() implement the method outlined in Binder 1992 as
> referenced in the help file?

Yes. That's why it's referenced.

> That is, are weights incorporated in the
> ratio term (numerator and denominator) of the estimating equation?

Yes.

> I
> don't believe so since svycoxph() calls coxph() of the survival
> package and weights are applied once in the estimating equation.  If
> the weights are implemented in the ratio, could you point me to where
> in the code this is done?  I would like to estimate as in Binder but
> with custom weights.  Thanks.

It happens inside the C code called by coxph(), e.g., in survival/src/coxfit2.c.

Binder's estimating equations are the usual way of applying weights to a Cox model, so nothing special is done apart from calling coxph(). To quote the author of the survival package, Terry Therneau, "Other formulae change in the obvious way, eg, the weighted mean $\bar Z$ is changed to include both the risk weights $r$ and the external weights $w$." [Mayo Clinic Biostatistics technical report #52, section 6.2.2]
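
As a rough check of the statement above, the following sketch evaluates the weighted score by hand at the coxph() estimate, once with the external weights $w$ inside the risk-set mean $\bar Z$ and once without. It is an illustration only, not the code in coxfit2.c: the weights are invented, the helper score_at() and its weight_ratio argument are hypothetical names, and Breslow ties are requested so that the hand-written score matches the fitted model. If the weights enter both numerator and denominator of the ratio, the first score should be numerically zero and the second should not.

```r
library(survival)

# Hypothetical external weights on the lung data shipped with survival;
# the weights are invented purely for illustration.
set.seed(1)
d <- data.frame(time  = lung$time,
                event = as.numeric(lung$status == 2),
                age   = lung$age)
d$w <- runif(nrow(d), 1, 3)

# Breslow ties keep the hand-written score below simple.
fit  <- coxph(Surv(time, event) ~ age, data = d, weights = w, ties = "breslow")
beta <- unname(coef(fit))

# Weighted score: sum over events of w_i * (z_i - zbar_i), where zbar_i is the
# risk-set mean of z. With weight_ratio = TRUE the mean uses w_j * r_j in both
# numerator and denominator; with FALSE it uses the risk weights r_j only.
score_at <- function(beta, d, weight_ratio = TRUE) {
  r <- exp(beta * d$age)
  k <- if (weight_ratio) d$w * r else r
  total <- 0
  for (i in which(d$event == 1)) {
    at_risk <- d$time >= d$time[i]
    zbar <- sum(k[at_risk] * d$age[at_risk]) / sum(k[at_risk])
    total <- total + d$w[i] * (d$age[i] - zbar)
  }
  total
}

score_weighted   <- score_at(beta, d, weight_ratio = TRUE)
score_unweighted <- score_at(beta, d, weight_ratio = FALSE)
print(c(weighted_ratio = score_weighted, unweighted_ratio = score_unweighted))
```

For custom weights, the same idea applies: supply them through the weights argument (or through the design object for svycoxph()) and they are carried into the ratio term.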

> This is mentioned in the help file but I don't quite understand:
> The main difference between svycoxph function and the robust=TRUE
> option to coxph in the
> survival package is that this function accounts for the reduction in
> variance from stratified sampling
> and the increase in variance from having only a small number of clusters.

The point estimates from coxph() are the same as those from svycoxph() (with the same weights).  The standard errors are almost the same, with two differences.  The first is the use of 1/(nclusters - 1) rather than 1/nclusters as a divisor.  The second is that svycoxph() computes variances using estimating functions centered at zero in each *sampling* stratum, whereas coxph() centers them at zero in each baseline hazard stratum, as supplied in the strata() argument to coxph().
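
A small sketch of that comparison, assuming invented weights, no sampling strata, and one observation per cluster (so only the divisor difference should show up). The data frame d, the weights w, and the id column are hypothetical and exist only for this illustration.

```r
library(survival)
library(survey)

# Invented weights on the lung data; each observation is its own cluster.
set.seed(1)
d <- data.frame(id    = seq_len(nrow(lung)),
                time  = lung$time,
                event = as.numeric(lung$status == 2),
                age   = lung$age)
d$w <- runif(nrow(d), 1, 3)

# Design-based fit and the coxph() fit with the robust variance.
des     <- svydesign(ids = ~id, weights = ~w, data = d)
fit_svy <- svycoxph(Surv(time, event) ~ age, design = des)
fit_cox <- coxph(Surv(time, event) ~ age, data = d, weights = w, robust = TRUE)

# Point estimates should agree; standard errors should be close but not equal.
coef_diff <- max(abs(coef(fit_svy) - coef(fit_cox)))
se_ratio  <- sqrt(diag(fit_svy$var)) / sqrt(diag(fit_cox$var))
print(coef_diff)
print(se_ratio)
```

With many clusters the ratio of standard errors should sit very close to 1; the gap would be expected to grow with few clusters or with a stratified design.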

-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle