[R-sig-ME] lmer and p-values (variable selection)

Ben Bolker bbolker at gmail.com
Tue Mar 29 23:00:25 CEST 2011


On 03/29/2011 04:44 PM, Manuel Spínola wrote:
> Dear Dominick,
> 
> Thank you for your message.
> 
> In my opinion, the relationship of theories (and scientific hypothesis)
> is not so straightforward to hypothesis testing (statistical hypothesis)
> as many people think, but certainly the p-value is not going to help
> much on that relationship.
> 
> If somebody entertains several possible models, why not compute:
> 
> Pr(Model | data) instead of Pr(data | H0)?
> 
> Best,
> 
> Manuel

  A couple of points:

  * p-values certainly have their problems, but despite their problems
they answer a need.  Fisher, Neyman, and Pearson were pretty smart guys, and
the question that p-values answer ("how likely is it that I would see a
pattern this strong, or stronger, if there were really nothing
happening?") is one that we often want to ask.  It's also nice to have a
concise, general statement of the strength of an effect, even if it has
flaws (arguably we could all be quoting log-likelihood differences, or
standardized regression coefficients, instead).
  * Notice how often the quotes that you posted below say "overuse", or
"undue", or "too much emphasis" (rather than "never" or "forbidden").
Yes, if I had to choose between a p-value and a confidence interval I
would take the confidence interval every time -- but then I have to
decide what kind of confidence interval I want, and if I decide to use
frequentist confidence intervals I am back in the soup again, both with
interpretation and with the difficulties (in the mixed model context) of
computing them appropriately.
  * I wouldn't object if everyone decided to go Bayesian, but that does
have its own can of worms (deciding on priors, computation [including
assessing convergence if using MCMC], etc.).  Again, if I had to choose
between frequentist *only* or Bayesian *only* I would probably choose
Bayesian. The hybrid-Bayesian approaches (e.g. mcmcsamp, post-estimation
MCMC in AD Model Builder) choose flat priors on the (perhaps arbitrarily
chosen) current scale of the parameters, glossing over details that are
sometimes important.  (The same goes for the pseudo-Bayesian
interpretation of AIC.)
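  (A toy illustration of how a p-value and a confidence interval are two
summaries computed from the same estimate and standard error -- a quick
sketch in Python rather than R, with made-up numbers and a plain
Wald/normal approximation, nothing specific to mixed models:)

```python
# Wald p-value and 95% CI from the same two numbers: an estimate and
# its standard error.  All values here are invented for illustration.
import math

beta_hat, se = 0.8, 0.3   # hypothetical fixed-effect estimate and SE

z = beta_hat / se
# two-sided normal tail probability: 2*(1 - Phi(|z|)) = erfc(|z|/sqrt(2))
p_value = math.erfc(abs(z) / math.sqrt(2))

# the corresponding 95% Wald interval uses the same estimate and SE
ci = (beta_hat - 1.96 * se, beta_hat + 1.96 * se)

print(round(p_value, 4), tuple(round(c, 3) for c in ci))
```

The p-value asks "how surprising is this estimate if the true effect is
zero?", while the interval reports which effect sizes are consistent with
the data -- same arithmetic, different emphasis.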

  I agree that the relationship between scientific theory and statistical
practice is a difficult one. From Crome 1997:

18.  Use statistical procedures from a range of schools and strictly
adhere to their respective methods and interpretation. For example, do a
Fisherian significance test properly and interpret it properly. Then set
up a formal Neyman-Pearson test and interpret it formally (this means
setting up both Type I and II error rates beforehand, among other
things). Then do an estimation procedure. Then switch hats and do a
Bayesian analysis. Take the results of all four, noting their different
behavior, and come to your conclusion. Good analysis and interpretation
are as important as the fieldwork, so allot adequate time and resources
to both.  ....

Crome, Francis H. J. 1997. Researching tropical forest fragmentation:
Shall we keep on doing what we’re doing? In Tropical forest remnants:
ecology, management, and conservation of fragmented communities, ed. W.
F. Laurance and R. O. Bierregaard, 485-501. Chicago, IL: University of
Chicago Press.

  (There is more here that's worth reading.)
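  (A numerical footnote to the boundary-effect point quoted further down
in this thread: when testing whether a single variance component is zero,
the parameter sits on the boundary of its space, and in the simplest case
the corrected p-value is half the naive chi-square(1) p-value.  A sketch
in Python, with an invented LRT statistic:)

```python
# Naive vs. boundary-corrected LRT p-value for a single variance
# component.  The LRT statistic below is invented for illustration.
import math

lrt = 2.71   # hypothetical 2 * (logLik(full) - logLik(reduced))

# chi-square(1) upper tail, via P(chi2_1 > x) = erfc(sqrt(x/2))
p_naive = math.erfc(math.sqrt(lrt / 2))

# with the null on the boundary, the reference distribution is a 50:50
# mixture of a point mass at zero and chi-square(1), so the tail halves
p_corrected = 0.5 * p_naive

print(round(p_naive, 3), round(p_corrected, 3))
```

So a naive test that reports p = 0.10 corresponds to a corrected p of
about 0.05: the naive version is conservative.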




> 
> 
> On 29/03/2011 08:51 a.m., Dominick Samperi wrote:
>> On Tue, Mar 29, 2011 at 8:45 AM, Manuel Spínola <mspinola10 at gmail.com> wrote:
>>> Thank you very much Ben.   Yes, that answers my question.
>>>
>>> I didn't have any bad intention, but I think it is confusing to many
>>> non-statisticians why there is still so much emphasis on p-values.
>>> I know that this will be controversial, and I don't have the background
>>> to debate with a statistician, but I am confused by the use of p-values
>>> in many instances by statisticians.  Many well-known statisticians have
>>> been very critical of the p-value as it is usually used in
>>> statistics.  Here is a link to a list of quotes from many well-known
>>> statisticians against null hypothesis significance testing
>>> (http://warnercnr.colostate.edu/~anderson/nester.html).
>> This topic (and this web page) has been discussed at length on
>> this list recently. Check out the archives.
>>
>> I like to think of p-values and hypothesis testing as a more scientific
>> variant of trial by jury, where the theory to be proved ("as charged")
>> is found guilty by establishing that inconsistent theories (null hypotheses)
>> are unlikely to be true given the observed data. If the null hypothesis
>> is true ("beyond a reasonable doubt"), then the theory to be tested "could
>> not have been at the scene of the crime." Note that just as in a jury
>> trial, this does not prove that the theory in question is true with
>> absolute certainty.
>>
>> In practice one usually entertains several possible models or theories
>> and selects the one that seems to explain the data best by eliminating
>> most of the variance in the observations. More precisely, a good model
>> is one where the residual is negligible and looks like "noise."
>>
>> Dominick
>>
>>> Some of the quotes:
>>>
>>> Yates - "the emphasis given to formal tests of significance ... has
>>> resulted in ... an undue concentration of effort by mathematical
>>> statisticians on investigations of tests of significance applicable to
>>> problems which are of little or no practical importance ... and ... it
>>> has caused scientific research workers to pay undue attention to the
>>> results of the tests of significance ... and too little to the estimates
>>> of the magnitude of the effects they are investigating"
>>>
>>> Cochran and Cox - "In many experiments it seems obvious that the
>>> different treatments must have produced some difference, however small,
>>> in effect. Thus the hypothesis that there is no difference is
>>> unrealistic: the real problem is to obtain estimates of the sizes of the
>>> differences."
>>>
>>> Savage - "Null hypotheses of no difference are usually known to be false
>>> before the data are collected ... when they are, their rejection or
>>> acceptance simply reflects the size of the sample and the power of the
>>> test, and is not a contribution to science".
>>>
>>> Kish - "Significance should stand for meaning and refer to substantive
>>> matter. ... I would recommend that statisticians discard the phrase
>>> 'test of significance'".
>>>
>>> Kish - "the tests of null hypotheses of zero differences, of no
>>> relationships, are frequently weak, perhaps trivial statements of the
>>> researcher's aims ... in many cases, instead of the tests of
>>> significance it would be more to the point to measure the magnitudes of
>>> the relationships, attaching proper statements of their sampling
>>> variation. The magnitudes of relationships cannot be measured in terms
>>> of levels of significance".
>>>
>>> Nunnally - "the null-hypothesis models ... share a crippling flaw: in
>>> the real world the null hypothesis is almost never true, and it is
>>> usually nonsensical to perform an experiment with the sole aim of
>>> rejecting the null hypothesis".
>>>
>>> Nunnally - "If rejection of the null hypothesis were the real intention
>>> in psychological experiments, there usually would be no need to gather
>>> data".
>>>
>>> Yates - "The most commonly occurring weakness ... is ... undue emphasis
>>> on tests of significance, and failure to recognise that in many types of
>>> experimental work estimates of treatment effects, together with
>>> estimates of the errors to which they are subject, are the quantities of
>>> primary interest".
>>>
>>> Yates - "In many experiments ... it is known that the null hypothesis
>>> ... is certainly untrue".
>>>
>>> Cox - "Overemphasis on tests of significance at the expense especially
>>> of interval estimation has long been condemned".
>>>
>>> Kruskal - "it is easy to ... throw out an interesting baby with the
>>> nonsignificant bath water. Lack of statistical significance at a
>>> conventional level does not mean that no real effect is present; it
>>> means only that no real effect is clearly seen from the data. That is
>>> why it is of the highest importance to look at power and to compute
>>> confidence intervals"
>>>
>>> Kruskal - "Because of the relative simplicity of its structure,
>>> significance testing has been overemphasized in some presentations of
>>> statistics, and as a result some students come mistakenly to feel that
>>> statistics is little else than significance testing"
>>>
>>> Best,
>>>
>>> Manuel
>>>
>>> On 29/03/2011 06:14 a.m., Ben Bolker wrote:
> On 11-03-29 07:35 AM, Manuel Spínola wrote:
>>>>>> I am not a statistician, but what the p-value is telling me?
>>>>>>
>>>>>> Is not more important the effect size?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Manuel
>>>>>>
>    Hmm.  What's the motivation for your question?
> 
>    The p-value gives you the probability of the observed pattern, or a
> more extreme one, having occurred if the null hypothesis were true.
>    The effect size (defined in various ways) tells you something about
> the strength of the observed pattern.
>     Statistical and subject-area (in your case, biological) significance
> are complementary. A highly statistically significant but biologically
> trivial effect is a curiosity; a biologically important but
> statistically insignificant effect means you need more/better data.
> 
>    I don't know if that answers your question.
> 
>>>>>> On 28/03/2011 04:40 p.m., Ben Bolker wrote:
>>>>>>> On 03/28/2011 06:15 PM, John Maindonald wrote:
>>>>>>>
>>>>>>>> Elimination of a term with a p-value greater than, say, 0.15 or 0.2 is
>>>>>>>> however likely to make little difference to estimates of other terms
>>>>>>>> in the model.  Thus, it may be a reasonable way to proceed.  For
>>>>>>>> this purpose, an anti-conservative (smaller than it should be)
>>>>>>>> p-value will usually serve.
>>>>>>>    Note that naive likelihood ratio tests of random effects are likely to
>>>>>>> be conservative (in the simplest case, nominal p-values are twice the
>>>>>>> true value) because of boundary issues, while those of fixed effects are
>>>>>>> probably anticonservative because of finite-size effects (see PB 2000
>>>>>>> for examples of both cases).
>>>>>>>
>>>>>>>> John Maindonald             email: john.maindonald at anu.edu.au
>>>>>>>> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
>>>>>>>> Centre for Mathematics&  Its Applications, Room 1194,
>>>>>>>> John Dedman Mathematical Sciences Building (Building 27)
>>>>>>>> Australian National University, Canberra ACT 0200.
>>>>>>>> http://www.maths.anu.edu.au/~johnm
>>>>>>>>
>>>>>>>    Ben
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> *Manuel Spínola, Ph.D.*
>>>>>> Instituto Internacional en Conservación y Manejo de Vida Silvestre
>>>>>> Universidad Nacional
>>>>>> Apartado 1350-3000
>>>>>> Heredia
>>>>>> COSTA RICA
>>>>>> mspinola at una.ac.cr
>>>>>> mspinola10 at gmail.com
>>>>>> Teléfono: (506) 2277-3598
>>>>>> Fax: (506) 2237-7036
>>>>>> Personal website: Lobito de río
>>>>>> <https://sites.google.com/site/lobitoderio/>
>>>>>> Institutional website: ICOMVIS<http://www.icomvis.una.ac.cr/>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>




