# [R-sig-ME] z-scores and glht

John Maindonald john.maindonald at anu.edu.au
Wed Apr 25 23:43:08 CEST 2018

Working out the appropriate degrees of freedom is worse than difficult, surely.
As the variance estimate is, under the model's normality assumptions, a linear
combination of chi-squared statistics, the distribution is not a t-distribution.
Approximations are available that provide degrees of freedom for a t-distribution
approximation that, for the percentage points that are commonly of
interest, will usually do the job acceptably well. The Kenward-Roger
approximation, implemented in the `afex` package, has long seemed to be
the best of the bunch; or has more recent work come up with something better?

An updated Chapter 10 for the fourth edition of the Maindonald & Braun text
'Data Analysis and Graphics Using R - An Example-Based Approach' has been
posted at:
http://maths-people.anu.edu.au/~johnm/daagur4/ch10-4ednDraft.pdf
(this 4th edition 'draft' may or may not make it into print; progress on a 4th
edition has for some months now been stalled at the publisher's end of the chain.)

There is an example on page 9 of the pdf (labeled p. 340) that demonstrates
the use of afex::mixed() called with `method = "KR"`, which invokes the
Kenward-Roger approximation.
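
For readers without the pdf to hand, a generic sketch of such a call (not the book's example; it uses lme4's built-in `sleepstudy` data purely for illustration) might look like:

```r
## Sketch: Kenward-Roger denominator df via afex::mixed()
## (illustrative model and data, not the example from the book)
library(afex)        # loads lme4 as well
data("sleepstudy", package = "lme4")

m <- mixed(Reaction ~ Days + (Days | Subject), data = sleepstudy,
           method = "KR")   # Kenward-Roger approximation for the F-tests
m  # ANOVA-style table with KR-adjusted denominator degrees of freedom
```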

Note also the possibility of using the function lme4::bootMer() to obtain simulated
estimates or (with `use.u = TRUE` and `type = "semiparametric"`) a
simulated/bootstrapped mix.
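
For instance (a minimal sketch, again on lme4's built-in `sleepstudy` data; the choice of `FUN = fixef` and `nsim` is illustrative):

```r
## Sketch: semiparametric bootstrap of the fixed effects with lme4::bootMer()
## (illustrative model and data)
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
bb <- bootMer(fm, FUN = fixef, nsim = 100,
              use.u = TRUE, type = "semiparametric")
bb  # bootstrap distribution of the fixed-effect estimates
```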

John Maindonald

On 26/04/2018, at 06:53, Ben Bolker <bbolker at gmail.com> wrote:

A little more detail:

if we take the ratio  R=(estimated coefficient)/(standard error), that
is not yet either a "Z score" or a "t score".  If we assume the standard
error is itself estimated without error (i.e. we have an arbitrarily
large amount of data), then we expect R to be normally distributed and
we call it a "Z-score".  If we take into account the expected
uncertainty in the standard error, which in simple cases we can quantify
by knowing the number of residual degrees of freedom, we expect R to be
t-distributed with df=(residual degrees of freedom); then we call R a
"t-score".

If we are not in a simple case, figuring out the appropriate df can be
difficult.
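
The practical consequence shows up in the p-value. A small worked illustration (the numbers are chosen arbitrarily): for the same ratio R, the normal tail is noticeably smaller than the t tail at modest residual df:

```r
## Illustration: the same ratio R = estimate/SE gives different p-values
## depending on whether it is treated as a Z-score or a t-score
## (est, se, and df are arbitrary illustrative values)
est <- 2.0   # estimated coefficient
se  <- 1.0   # its standard error
df  <- 10    # residual degrees of freedom
R   <- est / se

p_z <- 2 * pnorm(-abs(R))     # "Z-score": SE treated as known exactly
p_t <- 2 * pt(-abs(R), df)    # "t-score": uncertainty in SE accounted for

c(z = p_z, t = p_t)  # the t-based p-value is the larger (more conservative) one
```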

cheers
Ben Bolker

On 2018-04-25 02:49 PM, Cristiano Alessandro wrote:
Hi Dan,

thanks for your answer, and sorry about my naive question from a
non-statistician. I still have trouble understanding: you say that z-scores
are the estimates divided by the SE. Isn't that the definition of a
t-statistic under the null hypothesis that the mean is equal to zero?

Also, you say that glht() is side-stepping all of that and just using
a normal approximation. What does that mean/imply exactly for
computing the z-scores (the ones I see in the output of the summary)?

Best
Cristiano

On Wed, Apr 25, 2018 at 1:25 PM, Dan Mirman <dan at danmirman.org> wrote:

The z-scores are computed by dividing the Estimate by the SE. As for why
these are not t-statistics, the short answer is that the degrees of freedom
are not trivial to compute. I believe Doug Bates' response is often cited
by way of explanation:
http://stat.ethz.ch/pipermail/r-help/2006-May/094765.html
and it is covered in the FAQ:
http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#why-doesnt-lme4-display-denominator-degrees-of-freedomp-values-what-other-options-do-i-have
(for more discussion of alternatives see Luke, 2017).

glht() is side-stepping all of that and just using a normal approximation.
For what it's worth, my own experience is that this approximation is only
slightly anti-conservative, so I usually feel comfortable using it.
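
Concretely, the z value that glht() reports is just Estimate/SE referred to the standard normal. A sketch (model and data are illustrative, using lme4's `sleepstudy`; multcomp on a mixed-model fit):

```r
## Sketch: glht() on a mixed model reports z tests (normal approximation)
## (illustrative model and data)
library(lme4)
library(multcomp)

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
gh <- summary(glht(fm, linfct = c("Days = 0")))
gh  # reports a "z value" and Pr(>|z|)

## The z value is just the estimate divided by its standard error ...
z_by_hand <- fixef(fm)["Days"] / sqrt(vcov(fm)["Days", "Days"])
## ... and the p-value comes from the normal, not a t, distribution:
p_by_hand <- 2 * pnorm(-abs(z_by_hand))
```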

Hope that helps,
Dan

On Wed, Apr 25, 2018 at 12:26 PM, Cristiano Alessandro <cri.alessandro at gmail.com> wrote:

Hi all,

something is wrong with my email, so I am sorry for possible multiple
postings.

After fitting a model with lme, I run post-hoc tests with glht. The results
are reported in the following:

```
lev.ph <- glht(lev.lm, linfct = ph_conditional)

         Simultaneous Tests for General Linear Hypotheses

Fit: lme.formula(fixed = data ~ des_days, data = data_red_trf,
     random = ~des_days | ratID, method = "ML", na.action = na.omit,
     control = lCtr)

Linear Hypotheses:
                Estimate Std. Error z value Pr(>|z|)
des_days1 == 0    3232.2      443.2   7.294 9.05e-13 ***
des_days14 == 0   3356.1      912.2   3.679 0.000702 ***
des_days48 == 0   2688.4     1078.5   2.493 0.038025 *
```

I am trying to understand the output values. How are the z-scores computed?
If the function uses standard errors, shouldn't these be t-statistics (and
not z-scores)?

Thanks for your help, and sorry for the naive question.

Best
Cristiano


_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
-----------------------------------------------------
Dan Mirman
Associate Professor
Department of Psychology
University of Alabama at Birmingham
http://www.danmirman.org
-----------------------------------------------------




