[R] Coefficients of Logistic Regression from bootstrap - how to get them?

Gad Abraham gabraham at csse.unimelb.edu.au
Thu Jul 24 02:46:38 CEST 2008


Michal Figurski wrote:
> Thank you all for your words of wisdom.
> 
> I am starting to understand what you mean by bootstrap. Not surprisingly, 
> it seems to be something different from what I do. The bootstrap is a 
> tool, and I would rather compare it to a hammer than to a gun. People say 
> that a hammer is for driving nails. This situation is as if I planned to 
> use it to break rocks.
> 
> The key point is that I don't really care about the bias or variance of 
> the mean in the model. These things are useful for statisticians; 
> regular people (like me, also a chemist) do not understand them and have 
> no use for them (well, now I somewhat understand). My goal is very 
> practical: I need an equation that can predict a patient's outcome, based 
> on some data, with maximum reliability and accuracy.

My two cents:

Bootstrapping (especially the optimism bootstrap, see Harrell 2001, 
"Regression Modeling Strategies") can be used to estimate how well a 
given model generalises. In other words, it estimates how much your model 
is overfitted to your data (more overfitting => less generalisable model).
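
To make the idea concrete, here is a minimal sketch of the optimism 
bootstrap in base R. Everything in it is an assumption made for 
illustration: the simulated data, the variable names (dat, x1, x2, y), 
the choice of the Brier score as the performance measure and B = 200 
resamples. It is not the code from Harrell's book; his rms package 
provides validate() for models fitted with lrm(), which is probably what 
you would use in practice.

```r
set.seed(1)

# Simulated data (hypothetical): binary outcome y, two predictors x1 and x2
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 1.0 * dat$x1))

# Brier score: mean squared difference between predicted probability and
# observed outcome (lower is better)
brier <- function(fit, data) {
  p <- predict(fit, newdata = data, type = "response")
  mean((data$y - p)^2)
}

# Model fitted by maximum likelihood on the full sample
fit <- glm(y ~ x1 + x2, family = binomial, data = dat)
apparent <- brier(fit, dat)

# For each bootstrap sample: refit the model, then compare its score on the
# original data with its score on the bootstrap sample it was fitted to.
# The average difference estimates the optimism of the apparent score.
B <- 200
optimism <- replicate(B, {
  boot <- dat[sample(n, replace = TRUE), ]
  fit_b <- glm(y ~ x1 + x2, family = binomial, data = boot)
  brier(fit_b, dat) - brier(fit_b, boot)
})

# Optimism-corrected estimate of performance on new data
corrected <- apparent + mean(optimism)
c(apparent = apparent, optimism = mean(optimism), corrected = corrected)
```

Note that the coefficients you report are still those of fit, the model 
fitted to the full sample. The bootstrap only tells you how much to 
discount its apparent performance.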

This in itself does not give you the coefficients of a good 
model (for an ordinary logistic regression those come from maximum 
likelihood estimation), but it can be used to compare 
different models. As Frank Harrell mentioned, you can do penalised 
regression, and find the best penalty through bootstrapping. This will 
possibly yield a model that is less overfitted and hence more reliable 
when applied to an unseen sample (from the same population).
Again, see Frank's book for more information about penalisation.
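
As a rough sketch of choosing the penalty by bootstrapping, the base R 
code below fits a ridge-penalised logistic regression for a few 
candidate penalties and keeps the one with the best optimism-corrected 
Brier score. The helper names (ridge_logit, brier, corrected_brier), the 
simulated data, the grid of penalties and B = 50 are all assumptions of 
mine, and the ridge fit via optim() is a simple stand-in, not the 
procedure from Frank's book.

```r
# Simulated data (hypothetical): intercept column plus 5 predictors, only
# the first two of which affect the outcome. Predictors are on a common
# scale, which a ridge penalty assumes.
set.seed(2)
n <- 100; p <- 5
X <- cbind(1, matrix(rnorm(n * p), n, p))
y <- rbinom(n, 1, plogis(drop(X %*% c(-0.5, 1, -1, rep(0, p - 2)))))

# Ridge-penalised logistic regression: minimise the negative log-likelihood
# plus lambda * sum(beta^2), leaving the intercept unpenalised
ridge_logit <- function(X, y, lambda) {
  nll <- function(b) {
    eta <- drop(X %*% b)
    -sum(y * eta - log1p(exp(eta))) + lambda * sum(b[-1]^2)
  }
  grad <- function(b) {
    drop(crossprod(X, plogis(drop(X %*% b)) - y)) + 2 * lambda * c(0, b[-1])
  }
  optim(rep(0, ncol(X)), nll, grad, method = "BFGS")$par
}

# Brier score of coefficient vector b on data (X, y); lower is better
brier <- function(b, X, y) mean((y - plogis(drop(X %*% b)))^2)

# Optimism-corrected Brier score for one value of the penalty
corrected_brier <- function(lambda, B = 50) {
  set.seed(42)  # same resamples for every lambda, for a fairer comparison
  apparent <- brier(ridge_logit(X, y, lambda), X, y)
  optimism <- replicate(B, {
    i <- sample(n, replace = TRUE)
    b <- ridge_logit(X[i, ], y[i], lambda)
    brier(b, X, y) - brier(b, X[i, ], y[i])
  })
  apparent + mean(optimism)
}

lambdas <- c(0.01, 0.1, 1, 10)
scores <- sapply(lambdas, corrected_brier)
best_lambda <- lambdas[which.min(scores)]

# Final coefficients: refit on the full sample with the chosen penalty
final_coef <- ridge_logit(X, y, best_lambda)
```

The bootstrap is used here only to pick the penalty. The final 
coefficients again come from a single fit to all of the data.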

-- 
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham


