[R] Question on approximations of full logistic regression model
khosoda at med.kobe-u.ac.jp
Mon May 16 16:19:39 CEST 2011
Thank you for your comment, Prof. Harrell.
I would appreciate it very much if you could show me how to run the
simulation for that estimation. For reference, the following code shows
what I did (bootcov, summary, and validate).
MyFullModel.boot <- bootcov(MyFullModel, B=1000, coef.reps=TRUE)
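For reference, with coef.reps=TRUE the bootstrap replicates are kept on
the fit, so they can also be inspected directly (a sketch; boot.Coef is
the element bootcov stores):

b <- MyFullModel.boot$boot.Coef                    # B x p matrix of bootstrap coefficients
t(apply(b, 2, quantile, probs = c(0.025, 0.975)))  # nonparametric percentile 95% CIs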
> summary(MyFullModel, stenosis=c(70, 80), x1=c(1.5, 2.0), x2=c(1.5, 2.0))
Effects Response : outcome
Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
stenosis 70.0 80 10.0 -0.11 0.24 -0.59 0.37
Odds Ratio 70.0 80 10.0 0.90 NA 0.56 1.45
x1 1.5 2 0.5 1.21 0.37 0.49 1.94
Odds Ratio 1.5 2 0.5 3.36 NA 1.63 6.95
x2 1.5 2 0.5 -0.29 0.19 -0.65 0.08
Odds Ratio 1.5 2 0.5 0.75 NA 0.52 1.08
ClinicalScore 3.0 5 2.0 0.61 0.38 -0.14 1.36
Odds Ratio 3.0 5 2.0 1.84 NA 0.87 3.89
procedure - CAS:CEA 2.0 1 NA 0.83 0.46 -0.07 1.72
Odds Ratio 2.0 1 NA 2.28 NA 0.93 5.59
> summary(MyFullModel.boot, stenosis=c(70, 80), x1=c(1.5, 2.0),
x2=c(1.5, 2.0))
Effects Response : outcome
Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
stenosis 70.0 80 10.0 -0.11 0.28 -0.65 0.43
Odds Ratio 70.0 80 10.0 0.90 NA 0.52 1.54
x1 1.5 2 0.5 1.21 0.29 0.65 1.77
Odds Ratio 1.5 2 0.5 3.36 NA 1.92 5.89
x2 1.5 2 0.5 -0.29 0.16 -0.59 0.02
Odds Ratio 1.5 2 0.5 0.75 NA 0.55 1.02
ClinicalScore 3.0 5 2.0 0.61 0.45 -0.28 1.50
Odds Ratio 3.0 5 2.0 1.84 NA 0.76 4.47
procedure - CAS:CEA 2.0 1 NA 0.83 0.38 0.07 1.58
Odds Ratio 2.0 1 NA 2.28 NA 1.08 4.85
> validate(MyFullModel, bw=FALSE, B=1000)
index.orig training test optimism index.corrected n
Dxy 0.6425 0.7054 0.6122 0.0932 0.5493 1000
R2 0.3270 0.3745 0.3330 0.0415 0.2855 1000
Intercept 0.0000 0.0000 0.0683 -0.0683 0.0683 1000
Slope 1.0000 1.0000 1.0465 -0.0465 1.0465 1000
Emax 0.0000 0.0000 0.0221 0.0221 0.0221 1000
D 0.2715 0.2795 0.2424 0.0371 0.2345 1000
U -0.0192 -0.0192 -0.0035 -0.0157 -0.0035 1000
Q 0.2908 0.2987 0.2460 0.0528 0.2380 1000
B 0.1265 0.1164 0.1332 -0.0168 0.1433 1000
g 1.3366 1.5041 1.5495 -0.0455 1.3821 1000
gp 0.2082 0.2172 0.2258 -0.0087 0.2169 1000
> validate(MyFullModel.boot, bw=FALSE, B=1000)
index.orig training test optimism index.corrected n
Dxy 0.6425 0.7015 0.6139 0.0877 0.5549 1000
R2 0.3270 0.3738 0.3346 0.0392 0.2878 1000
Intercept 0.0000 0.0000 0.0613 -0.0613 0.0613 1000
Slope 1.0000 1.0000 1.0569 -0.0569 1.0569 1000
Emax 0.0000 0.0000 0.0226 0.0226 0.0226 1000
D 0.2715 0.2805 0.2438 0.0367 0.2348 1000
U -0.0192 -0.0192 -0.0039 -0.0153 -0.0039 1000
Q 0.2908 0.2997 0.2477 0.0521 0.2387 1000
B 0.1265 0.1177 0.1329 -0.0153 0.1417 1000
g 1.3366 1.5020 1.5523 -0.0503 1.3869 1000
gp 0.2082 0.2191 0.2263 -0.0072 0.2154 1000
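In case it helps, here is my rough attempt at the kind of simulation you
suggest. It is only a sketch under my own assumptions (data generated to
mimic this study: n = 104, five standard-normal predictors, true
coefficients of 0.5 each, an intercept chosen to give roughly 25 events
on average; penalization is ignored for simplicity, nsim and B are kept
small for speed, and occasional non-convergent fits are not handled). It
compares the two standard-error estimators against the empirical
standard deviation of the coefficient estimates:

library(rms)
set.seed(1)
n <- 104; p <- 5; nsim <- 200
beta <- c(-1.5, rep(0.5, p))               # assumed truth (intercept first)
est <- wald.se <- boot.se <- matrix(NA, nsim, p)
for (i in 1:nsim) {
  X <- matrix(rnorm(n * p), n, p)
  y <- rbinom(n, 1, plogis(drop(cbind(1, X) %*% beta)))
  d <- data.frame(y = y, X)                # columns y, X1, ..., X5
  f <- lrm(y ~ X1 + X2 + X3 + X4 + X5, data = d, x = TRUE, y = TRUE)
  g <- bootcov(f, B = 200)
  est[i, ]     <- coef(f)[-1]
  wald.se[i, ] <- sqrt(diag(vcov(f)))[-1]  # information-matrix (Wald) SEs
  boot.se[i, ] <- sqrt(diag(vcov(g)))[-1]  # bootstrap SEs
}
true.se <- apply(est, 2, sd)               # empirical SD approximates the true SE
colMeans(abs(sweep(wald.se, 2, true.se)))  # average absolute error, Wald
colMeans(abs(sweep(boot.se, 2, true.se)))  # average absolute error, bootstrap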
(11/05/16 22:01), Frank Harrell wrote:
> The choice is not clear, and requires some simulations to estimate the
> average absolute error of the covariance matrix estimators.
> Frank
>
>
> Kohkichi Hosoda wrote:
>>
>> Thank you for your reply, Prof. Harrell.
>>
>> I agree with you. Dropping only one variable does not actually help a lot.
>>
>> I have one more question.
>> During the analysis of this model I found that the confidence
>> intervals (CIs) of some coefficients provided by bootstrapping (the
>> bootcov function in the rms package) were narrower than the CIs
>> provided by the usual variance-covariance matrix, while the CIs of
>> other coefficients were wider. My data have no cluster structure.
>> I am wondering which CIs are better. I guess the bootstrap ones, but
>> is that right?
>>
>> Thank you in advance for your help.
>> --
>> KH
>>
>>
>>
>> (11/05/16 12:25), Frank Harrell wrote:
>>> I think you are doing this correctly except for one thing. The
>>> validation and other inferential calculations should be done on the
>>> full model. Use the approximate model to get a simpler nomogram but
>>> not to get standard errors. Since you are only dropping one variable,
>>> you might consider just running the nomogram on the entire model.
>>> Frank
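Following this suggestion, a sketch of the nomogram run on the entire
model (using the same probability breaks as for the approximate model
below):

full.model.nom <- nomogram(full.model,
    fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
plot(full.model.nom)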
>>>
>>>
>>> KH wrote:
>>>>
>>>> Hi,
>>>> I am trying to construct a logistic regression model from my data (104
>>>> patients and 25 events). I built a full model consisting of five
>>>> predictors, using penalization from the rms package (lrm, pentrace,
>>>> etc.) because of the events-per-variable issue. Then I tried to
>>>> approximate the full model by a step-down technique, predicting the
>>>> linear predictor L from all of the component variables using ordinary
>>>> least squares (ols in the rms package), as follows. I would like to
>>>> know whether I am doing this right or not.
>>>>
>>>>> library(rms)
>>>>> plogit <- predict(full.model)
>>>>> full.ols <- ols(plogit ~ stenosis+x1+x2+ClinicalScore+procedure,
>>>>> sigma=1)
>>>>> fastbw(full.ols, aics=1e10)
>>>>
>>>> Deleted Chi-Sq d.f. P Residual d.f. P AIC R2
>>>> stenosis 1.41 1 0.2354 1.41 1 0.2354 -0.59 0.991
>>>> x2 16.78 1 0.0000 18.19 2 0.0001 14.19 0.882
>>>> procedure 26.12 1 0.0000 44.31 3 0.0000 38.31 0.711
>>>> ClinicalScore 25.75 1 0.0000 70.06 4 0.0000 62.06 0.544
>>>> x1 83.42 1 0.0000 153.49 5 0.0000 143.49 0.000
>>>>
>>>> Then I fitted an approximation to the full model using the most
>>>> important variables (stopping before the R^2 for predictions from the
>>>> reduced model against the original Y drops below 0.95), that is,
>>>> dropping "stenosis".
>>>>
>>>>> full.ols.approx <- ols(plogit ~ x1+x2+ClinicalScore+procedure)
>>>>> full.ols.approx$stats
>>>> n Model L.R. d.f. R2 g Sigma
>>>> 104.0000000 487.9006640 4.0000000 0.9908257 1.3341718 0.1192622
>>>>
>>>> This approximate model had an R^2 of 0.99 against the full model.
>>>> Therefore, I updated the original full logistic model, dropping
>>>> "stenosis" as a predictor.
>>>>
>>>>> full.approx.lrm <- update(full.model, ~ . -stenosis)
>>>>
>>>>> validate(full.model, bw=FALSE, B=1000)
>>>> index.orig training test optimism index.corrected n
>>>> Dxy 0.6425 0.7017 0.6131 0.0887 0.5539 1000
>>>> R2 0.3270 0.3716 0.3335 0.0382 0.2888 1000
>>>> Intercept 0.0000 0.0000 0.0821 -0.0821 0.0821 1000
>>>> Slope 1.0000 1.0000 1.0548 -0.0548 1.0548 1000
>>>> Emax 0.0000 0.0000 0.0263 0.0263 0.0263 1000
>>>>
>>>>> validate(full.approx.lrm, bw=FALSE, B=1000)
>>>> index.orig training test optimism index.corrected n
>>>> Dxy 0.6446 0.6891 0.6265 0.0626 0.5820 1000
>>>> R2 0.3245 0.3592 0.3428 0.0164 0.3081 1000
>>>> Intercept 0.0000 0.0000 0.1281 -0.1281 0.1281 1000
>>>> Slope 1.0000 1.0000 1.1104 -0.1104 1.1104 1000
>>>> Emax 0.0000 0.0000 0.0444 0.0444 0.0444 1000
>>>>
>>>> Validation revealed this approximation was not bad.
>>>> Then I made a nomogram.
>>>>
>>>>> full.approx.lrm.nom <- nomogram(full.approx.lrm,
>>>> fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
>>>>> plot(full.approx.lrm.nom)
>>>>
>>>> Another nomogram, using the OLS model:
>>>>
>>>>> full.ols.approx.nom <- nomogram(full.ols.approx,
>>>> fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
>>>>> plot(full.ols.approx.nom)
>>>>
>>>> These two nomograms are very similar but differ slightly.
>>>>
>>>> My questions are:
>>>>
>>>> 1. Am I doing this right?
>>>>
>>>> 2. Which nomogram is correct?
>>>>
>>>> Thank you in advance for your help.
>>>>
>>>> --
>>>> KH
>>>>
>>>
>>>
>>> -----
>>> Frank Harrell
>>> Department of Biostatistics, Vanderbilt University
>>
>>
>> E-mail address
>> Office: khosoda at med.kobe-u.ac.jp
>> Home : khosoda at venus.dti.ne.jp
>>
>>
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
--
*************************************************
Department of Neurosurgery, Kobe University Graduate School of Medicine
Kohkichi Hosoda
7-5-1 Kusunoki-cho, Chuo-ku, Kobe 650-0017, Japan
Phone: 078-382-5966
Fax: 078-382-5979
E-mail address
Office: khosoda at med.kobe-u.ac.jp
Home: khosoda at venus.dti.ne.jp