[R] Question on approximations of full logistic regression model

Mon May 16 06:49:54 CEST 2011

Thank you for your reply, Prof. Harrell.

I agree with you. Dropping only one variable does not actually help a lot.

I have one more question.
During the analysis of this model I found that the confidence
intervals (CIs) of some coefficients obtained by bootstrapping (the bootcov
function in the rms package) were narrower than the CIs based on the usual
variance-covariance matrix, while the CIs of other coefficients were wider.
My data have no cluster structure. I am wondering which CIs are better.
I would guess the bootstrap ones, but is that right?
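
For concreteness, here is a minimal sketch of the comparison described above.
It runs on simulated data, not on the data discussed in this thread, and fits
an unpenalized model; the names d, x1, x2 and y and the simulated coefficients
are assumptions made up for illustration only.

```r
library(rms)

# Simulated stand-in data (hypothetical): 104 subjects, roughly 25% events
set.seed(1)
n <- 104
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(-1.2 + 0.8 * d$x1 + 0.5 * d$x2))

# x=TRUE, y=TRUE store the design matrix and response so bootcov can refit
fit  <- lrm(y ~ x1 + x2, data = d, x = TRUE, y = TRUE)

# Nonparametric bootstrap without clusters; keep the replicated coefficients
boot <- bootcov(fit, B = 500, coef.reps = TRUE)

z        <- qnorm(0.975)
se.model <- sqrt(diag(vcov(fit)))   # usual information-matrix standard errors
se.boot  <- sqrt(diag(vcov(boot)))  # bootstrap standard errors

# Wald-type intervals built from each set of standard errors
ci <- cbind(
  wald.lower = coef(fit) - z * se.model,
  wald.upper = coef(fit) + z * se.model,
  boot.lower = coef(fit) - z * se.boot,
  boot.upper = coef(fit) + z * se.boot
)

# Percentile intervals taken directly from the bootstrap replicates
pct <- t(apply(boot$boot.Coef, 2, quantile, probs = c(0.025, 0.975)))
colnames(pct) <- c("pct.lower", "pct.upper")

print(round(cbind(ci, pct), 3))
```

Laying the three sets of intervals side by side shows which coefficients
are affected and by how much; the sketch by itself does not say which set
is preferable.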

Thank you in advance for your help.
--
KH

(11/05/16 12:25), Frank Harrell wrote:
> I think you are doing this correctly except for one thing.  The validation
> and other inferential calculations should be done on the full model.  Use
> the approximate model to get a simpler nomogram but not to get standard
> errors.  With only dropping one variable you might consider just running the
> nomogram on the entire model.
> Frank
>
>
> KH wrote:
>>
>> Hi,
>> I am trying to construct a logistic regression model from my data (104
>> patients and 25 events). I built a full model consisting of five
>> predictors, using penalization from the rms package (lrm, pentrace,
>> etc.) because of the events-per-variable issue. Then I tried to approximate
>> the full model by a step-down technique, predicting L from all of the
>> component variables using ordinary least squares (ols in the rms package),
>> as follows. I would like to know whether I am doing this correctly.
>>
>>> library(rms)
>>> plogit<- predict(full.model)
>>> full.ols<- ols(plogit ~ stenosis+x1+x2+ClinicalScore+procedure, sigma=1)
>>> fastbw(full.ols, aics=1e10)
>>
>>   Deleted       Chi-Sq d.f. P      Residual d.f. P      AIC    R2
>>   stenosis       1.41  1    0.2354   1.41   1    0.2354  -0.59 0.991
>>   x2            16.78  1    0.0000  18.19   2    0.0001  14.19 0.882
>>   procedure     26.12  1    0.0000  44.31   3    0.0000  38.31 0.711
>>   ClinicalScore 25.75  1    0.0000  70.06   4    0.0000  62.06 0.544
>>   x1            83.42  1    0.0000 153.49   5    0.0000 143.49 0.000
>>
>> Then I fitted an approximation to the full model using the most important
>> variables, stopping before the R^2 for predictions from the reduced model
>> against the full model's linear predictor (plogit) drops below 0.95; that
>> is, dropping only "stenosis".
>>
>>> full.ols.approx<- ols(plogit ~ x1+x2+ClinicalScore+procedure)
>>> full.ols.approx$stats
>>            n  Model L.R.        d.f.          R2           g       Sigma
>> 104.0000000 487.9006640   4.0000000   0.9908257   1.3341718   0.1192622
>>
>> This approximate model had an R^2 against the full model of 0.99.
>> Therefore, I updated the original full logistic model by dropping
>> "stenosis" as a predictor.
>>
>>> full.approx.lrm<- update(full.model, ~ . -stenosis)
>>
>>> validate(full.model, bw=FALSE, B=1000)
>>            index.orig training    test optimism index.corrected    n
>> Dxy           0.6425   0.7017  0.6131   0.0887          0.5539 1000
>> R2            0.3270   0.3716  0.3335   0.0382          0.2888 1000
>> Intercept     0.0000   0.0000  0.0821  -0.0821          0.0821 1000
>> Slope         1.0000   1.0000  1.0548  -0.0548          1.0548 1000
>> Emax          0.0000   0.0000  0.0263   0.0263          0.0263 1000
>>
>>> validate(full.approx.lrm, bw=FALSE, B=1000)
>>            index.orig training    test optimism index.corrected    n
>> Dxy           0.6446   0.6891  0.6265   0.0626          0.5820 1000
>> R2            0.3245   0.3592  0.3428   0.0164          0.3081 1000
>> Intercept     0.0000   0.0000  0.1281  -0.1281          0.1281 1000
>> Slope         1.0000   1.0000  1.1104  -0.1104          1.1104 1000
>> Emax          0.0000   0.0000  0.0444   0.0444          0.0444 1000
>>
>> Validation showed that this approximation was not bad.
>> Then I made a nomogram.
>>
>>> full.approx.lrm.nom<- nomogram(full.approx.lrm,
>> fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
>>> plot(full.approx.lrm.nom)
>>
>> I also made another nomogram using the ols model:
>>
>>> full.ols.approx.nom<- nomogram(full.ols.approx,
>> fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
>>> plot(full.ols.approx.nom)
>>
>> These two nomograms are very similar but not identical.
>>
>> My questions are:
>>
>> 1. Am I doing this correctly?
>>
>> 2. Which nomogram is correct?
>>
>> Thank you in advance for your help.
>>
>> --
>> KH
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context: http://r.789695.n4.nabble.com/Question-on-approximations-of-full-logistic-regression-model-tp3524294p3525372.html
> Sent from the R help mailing list archive at Nabble.com.

     E-mail address
         Office: khosoda at med.kobe-u.ac.jp
         Home  : khosoda at venus.dti.ne.jp