[R] How do I compare 47 GLM models with 1 to 5 interactions and unique combinations?
Frank Harrell
f.harrell at vanderbilt.edu
Fri Jan 27 19:41:24 CET 2012
Ruben, you are mistaken on every single point. But I see it's not worth
continuing this discussion.
Frank
Rubén Roa wrote
>
> -----Original Message-----
> From: r-help-bounces@ [mailto:r-help-bounces@] On behalf of Frank Harrell
> Sent: Friday, 27 January 2012 14:28
> To: r-help@
> Subject: Re: [R] How do I compare 47 GLM models with 1 to 5 interactions
> and unique combinations?
>
> Ruben, I'm not sure you are understanding the ramifications of what Bert
> said. In addition, you are implicitly making several assumptions:
>
> --
> Ruben: Frank, I guess we are going nowhere now.
> But thanks anyway. See below if you want.
>
> 1. model selection is needed (vs. fitting the full model and using
> shrinkage)
> Ruben: Nonlinear mechanistic models that are too complex often simply
> fail to converge; they crash. There is no shrinkage to apply to a model
> that never converged.
>
> 2. model selection works in the absence of shrinkage
> Ruben: I think you are assuming that shrinkage is necessary.
>
> 3. model selection can find the "right" model and the features selected
> would be the same features selected if the data were slightly perturbed or
> a new sample taken
> Ruben: I don't make this assumption. New data, new model.
>
> 4. AIC tells you something that P-values don't (unless using structured
> multiple-degree-of-freedom tests)
> Ruben: It does.
>
> 5. parsimony is good
> Ruben: It is.
>
> None of these assumptions is true. Model selection without shrinkage
> (penalization) seems to offer benefits, but this is largely a mirage.
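For concreteness on point 1, fitting the full model with shrinkage rather than selecting terms can be sketched in base R with a ridge-penalized logistic likelihood. Everything below (the simulated data, the number of predictors, the penalty weight) is hypothetical, and in practice the penalty would be tuned by cross-validation:

```r
# Ridge-penalized logistic regression via optim(): keep every term in
# the full model and shrink all slopes toward zero instead of selecting
# a subset. Simulated data; all names and values are hypothetical.
set.seed(1)
n <- 200
x <- cbind(1, matrix(rnorm(n * 5), n, 5))     # intercept + 5 candidate predictors
beta_true <- c(0, 1, -1, 0, 0, 0)
y <- rbinom(n, 1, plogis(x %*% beta_true))
lambda <- 1   # penalty weight; chosen by cross-validation in practice
pen_nll <- function(b) {
  eta <- drop(x %*% b)
  # negative log-likelihood plus ridge penalty (intercept unpenalized)
  -sum(y * eta - log1p(exp(eta))) + lambda * sum(b[-1]^2)
}
fit <- optim(rep(0, 6), pen_nll, method = "BFGS")
round(fit$par, 3)   # shrunken coefficients of the full model
```

The slope estimates come out smaller in magnitude than the unpenalized maximum-likelihood fit, which is the shrinkage being argued for.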
>
> Ruben: Have a good weekend!
>
> Ruben
>
> Rubén Roa wrote
>>
>> -----Original Message-----
>> From: Bert Gunter [mailto:gunter.berton@] Sent: Thursday, 26 January
>> 2012 21:20
>> To: Rubén Roa
>> CC: Ben Bolker; Frank Harrell
>> Subject: Re: [R] How do I compare 47 GLM models with 1 to 5
>> interactions and unique combinations?
>>
>> On Wed, Jan 25, 2012 at 11:39 PM, Rubén Roa <rroa@> wrote:
>>> I think we have gone through this before.
>>> First, the destruction of all aspects of statistical inference is not
>>> at stake, Frank Harrell's post notwithstanding.
>>> Second, checking all pairs is a way to see, for _all pairs_, which
>>> model outcompetes which in terms of predictive ability by 2 AIC units
>>> or more. Just sorting the models by AIC does not give you that when
>>> adjacent models differ by less than 2 AIC units.
>>> Third, I was not implying that AIC differences play the role of
>>> significance tests. I agree with you that model selection is better
>>> not understood as a proxy for, or a relative of, significance-testing
>>> procedures.
>>> Incidentally, when comparing many models the AIC is often inconclusive.
>>> If one is bent on selecting just _the model_, then I check numerical
>>> optimization diagnostics such as the size of gradients, KKT criteria,
>>> and other issues such as standard errors of parameter estimates and
>>> the correlation matrix of parameter estimates.
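The sort-plus-pairwise-ΔAIC comparison described above can be sketched in base R. The data and the four candidate formulas are hypothetical stand-ins for the 47 models in the original question:

```r
# Rank candidate GLMs by AIC and inspect all pairwise AIC differences.
# Simulated data; formulas are hypothetical stand-ins.
set.seed(2)
d <- data.frame(y = rpois(100, 3), a = rnorm(100), b = rnorm(100))
cands <- list(m1 = y ~ a, m2 = y ~ b, m3 = y ~ a + b, m4 = y ~ a * b)
fits  <- lapply(cands, glm, family = poisson, data = d)
aic   <- sort(sapply(fits, AIC))    # models ranked by AIC
delta <- aic - min(aic)             # delta-AIC against the best model
pairwise <- outer(aic, aic, "-")    # every pairwise AIC difference
round(delta, 2)
```

The `pairwise` matrix is what sorting alone does not show: which specific pairs are separated by 2 AIC units or more.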
>>
>> -- And the mathematical basis for this claim is .... ??
>>
>> --
>> Ruben: In my area of work (building/testing/applying mechanistic
>> nonlinear models of natural and economic phenomena) the issue of
>> numerical optimization is a very serious one. It is often the case
>> that a really good-looking model does not converge properly (that's
>> why ADMB is so popular among us). So if the AIC is inconclusive, but
>> one AIC-tied model yields reasonable-looking standard errors and low
>> pairwise parameter-estimate correlations, while the other wasn't even
>> able to produce a positive definite Hessian matrix (though it was able
>> to maximize the log-likelihood), I think I have good reasons to select
>> the properly converged model. I assume here that the lack of
>> convergence is a symptom of model inadequacy for the data, one that
>> the AIC was not able to detect.
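The post-fit checks being described (Hessian positive-definiteness, standard errors, parameter-estimate correlations) can be sketched in base R. The tiny normal model below is a hypothetical stand-in for a real mechanistic fit:

```r
# After a likelihood fit, inspect the curvature and the implied
# uncertainty of the estimates. Simulated data; the model is a
# hypothetical stand-in.
set.seed(3)
y <- rnorm(50, mean = 2, sd = 1.5)
# parameterize sd on the log scale so the search is unconstrained
nll <- function(p) -sum(dnorm(y, mean = p[1], sd = exp(p[2]), log = TRUE))
fit <- optim(c(0, 0), nll, method = "BFGS", hessian = TRUE)
ev  <- eigen(fit$hessian, symmetric = TRUE)$values
pd  <- all(ev > 0)           # positive definite Hessian at the optimum?
vcv <- solve(fit$hessian)    # asymptotic covariance of the estimates
se  <- sqrt(diag(vcv))       # standard errors
rho <- cov2cor(vcv)          # parameter-estimate correlation matrix
```

A non-positive-definite Hessian, wildly inflated standard errors, or near-unit off-diagonal entries in `rho` are the convergence symptoms the AIC alone would not reveal.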
>> ---
>> Ruben: For some reason I don't find model averaging appealing. I
>> guess deep in my heart I expect more from my model than just the best
>> predictive ability.
>>
>> -- This is a religious, not a scientific statement, and has no place
>> in any scientific discussion.
>>
>> --
>> Ruben: Seriously, there is a wide and open place in scientific
>> discussion for mechanistic model-building. I expect the models I built
>> to be more than able predictors, I want them to capture some aspect of
>> nature, to teach me something about nature, so I refrain from model
>> averaging, which is an open admission that you don't care too much
>> about what's really going on.
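For reference, the AIC-based model averaging being debated can be sketched in base R using Akaike weights. The data and the three candidate models are hypothetical:

```r
# AIC-weight model averaging of a prediction across candidate GLMs.
# Simulated data; models and the prediction point are hypothetical.
set.seed(4)
d <- data.frame(y = rbinom(80, 1, 0.4), x1 = rnorm(80), x2 = rnorm(80))
fits <- list(glm(y ~ x1, binomial, d),
             glm(y ~ x2, binomial, d),
             glm(y ~ x1 + x2, binomial, d))
aic <- sapply(fits, AIC)
w <- exp(-(aic - min(aic)) / 2)
w <- w / sum(w)                      # Akaike weights, summing to 1
new <- data.frame(x1 = 0.5, x2 = -0.5)
preds <- sapply(fits, predict, newdata = new, type = "response")
p_avg <- sum(w * preds)              # model-averaged predicted probability
```

The averaged prediction `p_avg` blends the candidates by their relative AIC support, which is exactly the step Ruben objects to: no single model is kept as a description of the mechanism.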
>>
>> -- The belief that certain data analysis practices -- standard or not
>> -- can somehow be trusted to yield reliable scientific results in the
>> face of clear theoretical (mathematical) and practical results to the
>> contrary, while widespread, impedes and often thwarts the progress of
>> science. Evidence-based medicine and clinical trials came about for a
>> reason. I would encourage you to reexamine the basis of your
>> scientific practice and the role that "magical thinking" plays in it.
>>
>> Best,
>> Bert
>>
>> --
>> Ruben: All right, Bert. I often re-examine the basis of my scientific
>> praxis, though less often than I did before, I have to confess. I like
>> to think that is because I am converging on the right praxis, so there
>> are fewer issues to re-examine. But this problem of model selection is
>> a tough one.
>> Being a likelihoodist in inference naturally leads you to AIC-based
>> model selection, Jim Lindsey being an influence too. Wanting your
>> models to say something about nature is another strong conditioning
>> factor. Put these two together and it becomes hard to do
>> model averaging. And it has nothing to do with religion (yuck!).
>>
>> ______________________________________________
>> R-help@ mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context:
> http://r.789695.n4.nabble.com/How-do-I-compare-47-GLM-models-with-1-to-5-interactions-and-unique-combinations-tp4326407p4333464.html
> Sent from the R help mailing list archive at Nabble.com.
>
>
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University