[Rd] default for 'signif.stars'

Thu Mar 28 10:18:10 CET 2019

>>>>> Lenth, Russell V 
>>>>>     on Wed, 27 Mar 2019 00:06:08 +0000 writes:

    > Dear R-Devel, As I am sure many of you know, a special
    > issue of The American Statistician just came out, and its
    > theme is the [mis]use of P values and the many common ways
    > in which they are abused. The lead editorial in that issue
    > mentions the 2014 ASA guidelines on P values, and goes one
    > step further, by now recommending that the words
    > "statistically significant" and related simplistic
    > interpretations no longer be used. There is much
    > discussion of the problems with drawing "bright lines"
    > concerning P values.

    > This is the position of a US society, but my sense is that
    > the statistical community worldwide is pretty much on the
    > same page.

    > Meanwhile, functions such as 'print.summary.lm' and
    > 'print.anova' have an argument 'signif.stars' that really
    > does involve drawing bright lines when it is set to
    > TRUE. And the default setting for the "show.signif.stars"
    > option is TRUE. Isn't it time to at least make
    > "show.signif.stars" default to FALSE? And, indeed, to
    > consider deprecating those 'signif.stars' options
    > altogether?
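
For readers unfamiliar with the settings under discussion, here
is a minimal sketch of the two ways to suppress the stars today.
The model on the built-in 'cars' data is only an illustration
and is not taken from the thread.

```r
## Illustrative model on a built-in data set (an assumption for the example)
fit <- lm(dist ~ speed, data = cars)

## 1) Per call: print.summary.lm() has a 'signif.stars' argument
print(summary(fit), signif.stars = FALSE)

## 2) Per session: change the global option, then restore it
old <- options(show.signif.stars = FALSE)
print(summary(fit))
options(old)
```

Putting the options() call in a startup file such as .Rprofile
would make the setting apply to every session for that user.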

Dear Russ,
Abs has already given good reasons why this article may well be
considered problematic.

However, I think you, and many (but not all) of the others who
have raised this issue before you, slightly miss the following
point.

If p-values are misleading, they should not be shown (and hence
neither should the signif.stars).
That is the approach adopted, e.g., in the lme4 package, *AND* it
is an approach originally used in S and, I think, in parts of R
as well, in more places than now, notably for
print( summary(<glm>) ).

The fact is that users will write wrappers and their own packages
just to get at p values, even in very doubtful cases...
But anyway, that (p values or not) is a different discussion,
one which has some value.

You, however, focus on the "significance stars".  I've argued for
years that they are useful: they are just a simple
visualization of p values, and they save a lot of human time when
many (fixed) effects are looked at simultaneously.
Why should users have to visually scan 20 or 50 numbers?  In
modern data analysis they should never have to, but should rather
look at a visualization of those numbers ... and that's what
significance stars are, no more, no less.
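
To illustrate the "visualization" point, here is a small sketch
of how p values map to the star symbols, using symnum() with the
cutpoints shown in the usual "Signif. codes" legend. The p values
are made up for the example.

```r
## Made-up p values, purely for illustration
p <- c(0.0004, 0.004, 0.03, 0.07, 0.4)

## Bin each p value into a symbol, using the cutpoints from the
## "Signif. codes" legend printed under coefficient tables
stars <- symnum(p, corr = FALSE, na = FALSE,
                cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                symbols   = c("***", "**", "*", ".", " "))

## Show the numbers next to their symbols
print(data.frame(p = p, stars = as.character(stars)))
```

With 20 or 50 rows, the stars column can be scanned at a glance,
whereas the p column has to be read number by number.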

Martin