[R] Clustering over strata using a Cox proportional hazard model

Thomas Lumley tlumley at u.washington.edu
Fri Mar 24 18:11:47 CET 2006

On Fri, 24 Mar 2006, May, Roel wrote:

> Hi all,
> I wish to build a discrete choice model to analyse habitat selection of
> wolverines.
> This can be done with a 'tricked' stratified Cox proportional hazard
> model.
> For each individual animal, each selected position and the possible
> alternative available positions that were not used are combined into a stratum.
> This means that one stratum contains a set of 1 used position and
> several positions which were available to the animal but were not
> selected.
> Ultimately this renders unique choice sets for all observations in the
> dataset.
> The stratified model works fine and does its work as it should.

You could also have used clogit(), which does exactly this.
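
As a minimal sketch of that equivalence, the following fits the same choice-set model both ways on simulated data. The data frame, its column names (set, x, used, time) and the sample sizes are all made up for illustration and are not from the original analysis; each stratum holds one used position and four unused ones.

```r
library(survival)

set.seed(42)
n_sets <- 200   # hypothetical number of choice sets (strata)
k <- 5          # hypothetical number of positions per set

# One row per position; 'time' is a dummy constant needed only for coxph()
d <- data.frame(set = rep(seq_len(n_sets), each = k),
                x = rnorm(n_sets * k),
                used = 0,
                time = 1)

# In each set, mark exactly one position as used, with probability
# proportional to exp(x) (an assumed true coefficient of 1)
for (s in seq_len(n_sets)) {
  rows <- which(d$set == s)
  chosen <- sample(rows, 1, prob = exp(d$x[rows]))
  d$used[chosen] <- 1
}

# The 'tricked' stratified Cox model
fit_cox <- coxph(Surv(time, used) ~ x + strata(set), data = d)

# The same model through clogit()
fit_clogit <- clogit(used ~ x + strata(set), data = d)

cbind(coxph = coef(fit_cox), clogit = coef(fit_clogit))
```

With a single used position per stratum there are no tied events, so the choice of tie-handling method should not matter and the two coefficient estimates should agree up to convergence tolerance.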

> The problem however is that, having checked the residuals, there is a
> high variation between individuals.
> Is it somehow possible to account for preferences that vary among
> individuals? I am thinking along the lines of clustering the data over
> the strata or using specific individual weights. I have looked into the
> cluster() function, but this does not result in any differences in the
> residuals. If using the cluster() function is the right way to go, how
> can I check if it worked to remove individual preferences?

cluster() affects only the standard error computations.  It uses the 
Huber/White "sandwich" standard error estimates, which consistently 
estimate the actual standard error of your estimate. You can think of them 
as a bargain-basement version of bootstrapping.
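
A small sketch of this point, again on simulated data: the animal identifier id, the other column names, and the animal-specific slopes are assumptions made for illustration. Adding cluster(id) leaves the coefficient unchanged and only replaces the reported variance with the sandwich estimate.

```r
library(survival)

set.seed(1)
n_id <- 20          # hypothetical number of animals
sets_per_id <- 10   # hypothetical choice sets per animal
k <- 5              # hypothetical positions per set
n_sets <- n_id * sets_per_id

d <- data.frame(id = rep(seq_len(n_id), each = sets_per_id * k),
                set = rep(seq_len(n_sets), each = k),
                x = rnorm(n_sets * k),
                used = 0,
                time = 1)

# Assumed animal-specific preferences: each animal has its own slope
slope <- 1 + rnorm(n_id)

for (s in seq_len(n_sets)) {
  rows <- which(d$set == s)
  b <- slope[d$id[rows[1]]]
  chosen <- sample(rows, 1, prob = exp(b * d$x[rows]))
  d$used[chosen] <- 1
}

# Same model without and with cluster(); one event per stratum, so no ties
fit_naive <- coxph(Surv(time, used) ~ x + strata(set), data = d)
fit_robust <- coxph(Surv(time, used) ~ x + strata(set) + cluster(id), data = d)

# The coefficients are the same; only the standard errors differ
rbind(naive = coef(fit_naive), robust = coef(fit_robust))
rbind(naive_se = sqrt(diag(fit_naive$var)),
      robust_se = sqrt(diag(fit_robust$var)))
```

This is also why the residuals do not change: they depend on the fitted coefficients, which cluster() does not alter.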

In some situations this is all you need, because the correlation does not 
prevent a sensible interpretation of your regression coefficients.  In 
other cases you want a different model that will give different regression 


Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
