# [R-sig-ME] z-scores and glht

John Maindonald john.maindonald at anu.edu.au
Wed Apr 25 23:43:08 CEST 2018

Working out the appropriate degrees of freedom is worse than difficult, surely.
As the variance estimate is, under the model's normality assumptions, a linear
combination of chi-squared statistics, the distribution is not a t-distribution.
Approximations are available that provide degrees of freedom for a t-distribution
approximation that, for the percentage points that are commonly of
interest, will usually do the job acceptably well. The Kenward-Roger
approximation, implemented in the `afex` package, has long seemed to be
the best of the bunch; or has more recent work come up with something better?

An updated Chapter 10 for the fourth edition of the Maindonald & Braun text
'Data Analysis and Graphics Using R - An Example-Based Approach' has been
posted at:
http://maths-people.anu.edu.au/~johnm/daagur4/ch10-4ednDraft.pdf
(this 4th edition 'draft' may or may not make it into print; progress on a 4th
edition has for some months now been stalled at the publisher's end of the chain.)

There is an example on page 9 of the pdf (labeled p. 340) that demonstrates
the use of afex::mixed() called with `method = "KR"`, which invokes the
Kenward-Roger approximation.
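
For readers without the pdf to hand, a generic sketch of such a call (not the book's example; it uses lme4's built-in `sleepstudy` data purely for illustration) might look like:

```r
## Sketch: Kenward-Roger denominator df via afex::mixed()
## (illustrative model and data, not the example from the book)
library(afex)        # loads lme4 as well
data("sleepstudy", package = "lme4")

m <- mixed(Reaction ~ Days + (Days | Subject), data = sleepstudy,
           method = "KR")   # Kenward-Roger approximation for the F-tests
m  # ANOVA-style table with KR-adjusted denominator degrees of freedom
```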

Note also the possibility of using the function lme4::bootMer() to obtain simulated
estimates or (with `use.u = TRUE` and `type = "semiparametric"`) a
simulated/bootstrapped mix.
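
For instance (a minimal sketch, again on lme4's built-in `sleepstudy` data; the choice of `FUN = fixef` and `nsim` is illustrative):

```r
## Sketch: semiparametric bootstrap of the fixed effects with lme4::bootMer()
## (illustrative model and data)
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
bb <- bootMer(fm, FUN = fixef, nsim = 100,
              use.u = TRUE, type = "semiparametric")
bb  # bootstrap distribution of the fixed-effect estimates
```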

John Maindonald

On 26/04/2018, at 06:53, Ben Bolker <bbolker at gmail.com> wrote:

A little more detail:

if we take the ratio  R=(estimated coefficient)/(standard error), that
is not yet either a "Z score" or a "t score".  If we assume the standard
error is itself estimated without error (i.e. we have an arbitrarily
large amount of data), then we expect R to be normally distributed and
we call it a "Z-score".  If we take into account the expected
uncertainty in the standard error, which in simple cases we can quantify
by knowing the number of residual degrees of freedom, we expect R to be
t-distributed with df=(residual degrees of freedom); then we call R a
"t-score".

If we are not in a simple case, figuring out the appropriate df can be
difficult.
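
The practical consequence shows up in the p-value. A small worked illustration (the numbers are chosen arbitrarily): for the same ratio R, the normal tail is noticeably smaller than the t tail at modest residual df:

```r
## Illustration: the same ratio R = estimate/SE gives different p-values
## depending on whether it is treated as a Z-score or a t-score
## (est, se, and df are arbitrary illustrative values)
est <- 2.0   # estimated coefficient
se  <- 1.0   # its standard error
df  <- 10    # residual degrees of freedom
R   <- est / se

p_z <- 2 * pnorm(-abs(R))     # "Z-score": SE treated as known exactly
p_t <- 2 * pt(-abs(R), df)    # "t-score": uncertainty in SE accounted for

c(z = p_z, t = p_t)  # the t-based p-value is the larger (more conservative) one
```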

cheers
Ben Bolker

On 2018-04-25 02:49 PM, Cristiano Alessandro wrote:
Hi Dan,

thanks for your answer, and sorry about my naive question from a
non-statistician. I still have trouble understanding: you say that z-scores
are the estimates divided by the SE. Isn't that the definition of a
t-statistic under the null hypothesis that the mean is equal to zero?

Also, you say that glht() is side-stepping all of that and just using
a normal approximation. What does that mean/imply exactly for
computing the z-scores (the ones I see in the output of the summary)?

Best
Cristiano

On Wed, Apr 25, 2018 at 1:25 PM, Dan Mirman <dan at danmirman.org> wrote:

The z-scores are computed by dividing the Estimate by the SE. As for why
these are not t-statistics, the short answer is that the degrees of freedom
are not trivial to compute. I believe Doug Bates' response is often cited
by way of explanation:
http://stat.ethz.ch/pipermail/r-help/2006-May/094765.html
and it is covered in the FAQ:
http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#why-doesnt-lme4-display-denominator-degrees-of-freedomp-values-what-other-options-do-i-have
(for more discussion of alternatives see Luke, 2017).

glht() is side-stepping all of that and just using a normal approximation.
For what it's worth, my own experience is that this approximation is only
slightly anti-conservative, so I usually feel comfortable using it.
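
Concretely, the z value that glht() reports is just Estimate/SE referred to the standard normal. A sketch (model and data are illustrative, using lme4's `sleepstudy`; multcomp on a mixed-model fit):

```r
## Sketch: glht() on a mixed model reports z tests (normal approximation)
## (illustrative model and data)
library(lme4)
library(multcomp)

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
gh <- summary(glht(fm, linfct = c("Days = 0")))
gh  # reports a "z value" and Pr(>|z|)

## The z value is just the estimate divided by its standard error ...
z_by_hand <- fixef(fm)["Days"] / sqrt(vcov(fm)["Days", "Days"])
## ... and the p-value comes from the normal, not a t, distribution:
p_by_hand <- 2 * pnorm(-abs(z_by_hand))
```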

Hope that helps,
Dan

On Wed, Apr 25, 2018 at 12:26 PM, Cristiano Alessandro <cri.alessandro at gmail.com> wrote:

Hi all,

something is wrong with my email, so I am sorry for possible multiple
postings.

After fitting a model with lme, I run post-hoc tests with glht. The results
are reported in the following:

```
lev.ph <- glht(lev.lm, linfct = ph_conditional)

         Simultaneous Tests for General Linear Hypotheses

Fit: lme.formula(fixed = data ~ des_days, data = data_red_trf,
     random = ~des_days | ratID, method = "ML", na.action = na.omit,
     control = lCtr)

Linear Hypotheses:
                Estimate Std. Error z value Pr(>|z|)
des_days1 == 0    3232.2      443.2   7.294 9.05e-13 ***
des_days14 == 0   3356.1      912.2   3.679 0.000702 ***
des_days48 == 0   2688.4     1078.5   2.493 0.038025 *
```

I am trying to understand the output values. How are the z-scores computed?
If the function uses standard errors, shouldn't these be t-statistics (and
not z-scores)?

Thanks for your help, and sorry for the naive question.

Best
Cristiano


_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
-----------------------------------------------------
Dan Mirman
Associate Professor
Department of Psychology
University of Alabama at Birmingham
http://www.danmirman.org
-----------------------------------------------------




