[R] Survival::coxph (clogit), survConcordance vs. summary(fit) concordance

Andrews, Chris chrisaa at med.umich.edu
Wed Jan 20 14:11:21 CET 2016


I only get the digest, sorry if this has already been answered.

When I run your code (after creating some data) I get the warning "weights are ignored in clogit".  That comes from calling clogit incorrectly: strata(ID) and cluster(site) were passed as separate arguments instead of as terms in the formula.  The first two commas should be + signs.

library(survival)
set.seed(1)  # make the simulated data reproducible
nn <- 1000
dat <- data.frame(resp = rbinom(nn, 1, 0.5), x1 = rnorm(nn), x2 = rnorm(nn),
                  ID = rep(seq(nn/2), each = 2), site = rep(seq(nn/10), each = 10))
# strata() and cluster() passed as separate arguments: warning
fit <- clogit(resp ~ x1 + x2, strata(ID), cluster(site), method = "efron", data = dat)
# strata() and cluster() as terms in the formula: no warning
fit <- clogit(resp ~ x1 + x2 + strata(ID) + cluster(site), method = "efron", data = dat)
summary(fit)
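
To see where the extra arguments end up, here is a minimal sketch using base R's match.call(). The function show_matching is a hypothetical stand-in, not part of the survival package; its leading arguments are assumed to mirror those of clogit (formula, data, weights, subset, ...), which you can compare against args(clogit) in your installed version.

```r
# Hypothetical stand-in: argument order is ASSUMED to mirror clogit's
# leading arguments; check args(clogit) in your own installation.
show_matching <- function(formula, data, weights, subset, na.action, method, ...) {
  match.call()  # returns the call with matched argument names; nothing is evaluated
}

# Same shape as the call that produced the warning
m <- show_matching(resp ~ x1 + x2, strata(ID), cluster(site),
                   method = "efron", data = dat)
print(m)

# Drop the function name and look at which formal each argument was matched to
matched <- as.list(m)[-1]
names(matched)
deparse(matched$weights)  # the expression that was matched to 'weights'
deparse(matched$subset)   # the expression that was matched to 'subset'
```

Named arguments (method, data) are matched first, and the remaining unnamed ones fill the leftover formals in order. Under the assumed argument order, strata(ID) is taken as weights and cluster(site) as subset, which would explain the warning about weights.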

Chris

-----Original Message-----
From: Joe Ceradini [mailto:joeceradini at gmail.com] 
Sent: Tuesday, January 19, 2016 12:48 PM
To: r-help at r-project.org
Subject: [R] Survival::coxph (clogit), survConcordance vs. summary(fit) concordance

Hi,

I'm running conditional logistic regression with survival::clogit. I have
"1-1 case-control" data, i.e., there is 1 case and 1 control in each stratum.

Model:
fit <- clogit(resp ~ x1 + x2, strata(ID), cluster(site), method ="efron", data = dat)
Where resp is 1's and 0's, and x1 and x2 are both continuous.

Predictors are both significant. A snippet of summary(fit):
Concordance= 0.763  (se = 0.5 )
Rsquare= 0.304   (max possible= 0.5 )
Likelihood ratio test= 27.54  on 2 df,   p=1.047e-06
Wald test            = 17.19  on 2 df,   p=0.0001853
Score (logrank) test = 17.43  on 2 df,   p=0.0001644,   Robust = 6.66  p=0.03574

The concordance estimate seems good but the SE is HUGE.

I get a very different estimate from the survConcordance function, which I
know is documented as computing concordance for a "single continuous
covariate", but it runs on my model with 2 continuous covariates...

survConcordance(Surv(rep(1, 76L), resp) ~ predict(fit), dat)
n= 76
Concordance= 0.9106648 se= 0.09365047
concordant  discordant   tied.risk   tied.time    std(c-d)
 1315.0000   129.0000     0.0000   703.0000   270.4626
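
A side note on the counts above, offered as a sketch rather than a definitive diagnosis. With n = 76 in 1-1 matched pairs there are 38 cases and 38 controls. The concordant plus discordant total of 1444 equals 38 x 38, and tied.time of 703 equals choose(38, 2). Both are what one would expect if every case were compared with every control across strata, rather than only with the control in its own pair. The 0.763 reported by summary(fit) would be consistent with 29 of 38 within-pair comparisons being concordant; that count of 29 is an inference from the rounded value, not something shown in the output.

```r
# Counts copied from the survConcordance output above
n          <- 76
concordant <- 1315
discordant <- 129
tied_time  <- 703

n_pairs <- n / 2  # 1-1 matching: one case and one control per stratum

# Every case compared with every control, ignoring strata
n_pairs^2                              # case-control comparisons across all strata
concordant + discordant                # comparisons actually counted
choose(n_pairs, 2)                     # case-case pairs sharing the same time
concordant / (concordant + discordant) # unstratified concordance

# ASSUMPTION: 29 concordant pairs is inferred from the rounded 0.763
round(29 / n_pairs, 3)                 # within-pair concordance if 29 of 38 agree
```

If that reading is right, the two numbers answer different questions: one ranks cases against all controls, the other only against the matched control.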

Are both of these concordance estimates valid but providing different
information?
Is one more appropriate for measuring "performance" (in the AUC sense) of
conditional logistic models?
Is it possible that the HUGE SE estimate represents a convergence problem
(no warnings were thrown when fitting the model), or is this model just useless?

Thanks!
-- 
Cooperative Fish and Wildlife Research Unit
Zoology and Physiology Dept.
University of Wyoming
JoeCeradini at gmail.com / 914.707.8506
wyocoopunit.org
