[R] Help understanding why glm and lrm.fit runs with my data, but lrm does not
Bonnett, Laura
L.J.Bonnett at liverpool.ac.uk
Thu Sep 14 09:30:13 CEST 2017
Dear all,
I am using the publically available GustoW dataset. The exact version I am using is available here: https://drive.google.com/open?id=0B4oZ2TQA0PAoUm85UzBFNjZ0Ulk
I would like to produce a nomogram for 5 covariates - AGE, HYP, KILLIP, HRT and ANT. I have successfully fitted a logistic regression model using the "glm" function as shown below.
library(rms)
gusto <- spss.get("GustoW.sav")
fit <- glm(DAY30~AGE+HYP+factor(KILLIP)+HRT+ANT,family=binomial(link="logit"),data=gusto,x=TRUE,y=TRUE)
However, my review of the literature and other websites suggest I need to use "lrm" for the purposes of producing a nomogram. When I run the command using "lrm" (see below) I get an error message saying:
Error in lrm(DAY30 ~ AGE + HYP + KILLIP + HRT + ANT, gusto2) :
Unable to fit model using "lrm.fit"
My code is as follows:
gusto2 <- gusto[,c(1,3,5,8,9,10)]
gusto2$HYP <- factor(gusto2$HYP, labels=c("No","Yes"))
gusto2$KILLIP <- factor(gusto2$KILLIP, labels=c("1","2","3","4"))
gusto2$HRT <- factor(gusto2$HRT, labels=c("No","Yes"))
gusto2$ANT <- factor(gusto2$ANT, labels=c("No","Yes"))
var.labels=c(DAY30="30-day Mortality", AGE="Age in Years", KILLIP="Killip Class", HYP="Hypertension", HRT="Tachycardia", ANT="Anterior Infarct Location")
label(gusto2)=lapply(names(var.labels),function(x) label(gusto2[,x])=var.labels[x])
ddist = datadist(gusto2)
options(datadist='ddist')
fit1 <- lrm(DAY30~AGE+HYP+KILLIP+HRT+ANT,gusto2)
Error in lrm(DAY30 ~ AGE + HYP + KILLIP + HRT + ANT, gusto2) :
Unable to fit model using "lrm.fit"
Online solutions to this problem involve checking whether any variables are redundant. However, the results for my data suggest that none are.
redun(~AGE+HYP+KILLIP+HRT+ANT,gusto2)
Redundancy Analysis
redun(formula = ~AGE + HYP + KILLIP + HRT + ANT, data = gusto2)
n: 2188 p: 5 nk: 3
Number of NAs: 0
Transformation of target variables forced to be linear
R-squared cutoff: 0.9 Type: ordinary
R^2 with which each variable can be predicted from all other variables:
AGE HYP KILLIP HRT ANT
0.028 0.032 0.053 0.046 0.040
No redundant variables
I've also tried just considering "lrm.fit" and that code seems to run without error too:
lrm.fit(cbind(gusto2$AGE,gusto2$KILLIP,gusto2$HYP,gusto2$HRT,gusto2$ANT),gusto2$DAY30)
Logistic Regression Model
lrm.fit(x = cbind(gusto2$AGE, gusto2$KILLIP, gusto2$HYP, gusto2$HRT,
gusto2$ANT), y = gusto2$DAY30)
Model Likelihood Discrimination Rank Discrim.
Ratio Test Indexes Indexes
Obs 2188 LR chi2 233.59 R2 0.273 C 0.846
0 2053 d.f. 5 g 1.642 Dxy 0.691
1 135 Pr(> chi2) <0.0001 gr 5.165 gamma 0.696
max |deriv| 4e-09 gp 0.079 tau-a 0.080
Brier 0.048
Coef S.E. Wald Z Pr(>|Z|)
Intercept -13.8515 0.9694 -14.29 <0.0001
x[1] 0.0989 0.0103 9.58 <0.0001
x[2] 0.9030 0.1510 5.98 <0.0001
x[3] 1.3576 0.2570 5.28 <0.0001
x[4] 0.6884 0.2034 3.38 0.0007
x[5] 0.6327 0.2003 3.16 0.0016
I was therefore hoping someone would explain why the "lrm" code is producing an error message, while "lrm.fit" and "glm" do not. In particular I would welcome a solution to ensure I can produce a nomogram.
Kind regards,
Laura
Dr Laura Bonnett
NIHR Post-Doctoral Fellow
Department of Biostatistics,
Waterhouse Building, Block F,
1-5 Brownlow Street,
University of Liverpool,
Liverpool,
L69 3GL
0151 795 9686
L.J.Bonnett at liverpool.ac.uk
[[alternative HTML version deleted]]
More information about the R-help
mailing list