[R-sig-ME] Re: Non-normal random effect in glmm
thierry.onkelinx at inbo.be
Thu Oct 13 10:36:14 CEST 2016
Have a look at the subjects with high random intercepts. They are likely
subjects with all positive outcomes. The high random intercepts are the
result of complete separation.
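
A minimal sketch of that check, assuming lme4 is installed. The toy data, `fit` and `per_id` are hypothetical stand-ins; with the original model the same steps would start from `ranef(mod)$respondentID` and `dat`.

```r
library(lme4)

set.seed(1)
# Hypothetical toy data: 500 respondents with 2 to 5 binary records each
n_id  <- 500
n_obs <- sample(2:5, n_id, replace = TRUE)
toy   <- data.frame(respondentID = factor(rep(seq_len(n_id), times = n_obs)))
u     <- rnorm(n_id, sd = 2)  # simulated respondent-level intercepts
toy$response <- rbinom(nrow(toy), 1, plogis(-2 + u[as.integer(toy$respondentID)]))

fit <- glmer(response ~ 1 + (1 | respondentID), family = binomial, data = toy)

# Conditional modes of the random intercepts, one row per respondent
re   <- ranef(fit)$respondentID
blup <- data.frame(respondentID = rownames(re), intercept = re[["(Intercept)"]])

# Share of positive outcomes and number of records per respondent
prop   <- tapply(toy$response, toy$respondentID, mean)
per_id <- data.frame(respondentID  = names(prop),
                     prop_positive = as.vector(prop),
                     n_records     = as.vector(table(toy$respondentID)))
per_id <- merge(per_id, blup, by = "respondentID")
per_id$all_positive <- per_id$prop_positive == 1

# Compare the predicted intercepts of all-positive respondents with the rest
aggregate(intercept ~ all_positive, data = per_id, FUN = mean)
```

If the largest intercepts sit almost entirely in the `all_positive` group, that would be consistent with the separation explanation.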
I don't bother with calculating the proportion of variance explained in
the case of generalised linear models. Something like R² is a nice and
simple property of _linear_ models, thanks to the Gaussian distribution, in
which mean and variance are independent. It is very hard to define for
distributions in which mean and variance are linked.
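
A small base R illustration of that link, as a sketch only. The probabilities are illustrative; 0.04 simply mirrors the roughly 4% event rate mentioned in the quoted post.

```r
# Bernoulli outcome: the variance is a function of the mean, Var(Y) = p * (1 - p)
p <- c(0.04, 0.25, 0.50)
bernoulli <- data.frame(mean = p, variance = p * (1 - p))
print(bernoulli)

# Gaussian outcome: shifting the mean leaves the variance untouched
set.seed(42)
y_low  <- rnorm(1e5, mean = 0,  sd = 1)
y_high <- rnorm(1e5, mean = 10, sd = 1)
gaussian_var <- c(low = var(y_low), high = var(y_high))
print(gaussian_var)
```

Because the Bernoulli variance changes with the mean, there is no single residual variance to divide by, which is why R²-type measures for GLMMs have to fall back on conventions such as a latent scale.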
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
2016-10-12 16:30 GMT+02:00 Chen Chun <talischen at hotmail.com>:
> Dear all,
> I am fitting a mixed model with a binomial distribution to a very large
> data set (around 400000 samples) with a binary outcome (very few events,
> around 4%). Some respondents, but not all, were measured repeatedly over
> the years, which is why a mixed model is applied. The model can be written as:
> mod <- glmer(response ~ AGE + SEX...+ YEAR + (1 | respondentID),
> family=binomial, data=dat)
> The distribution of the random effects (ID) from the model output is
> obviously non-normal: a large proportion of values close to zero
> and very few large values around 10. I am wondering whether the GLMM
> is still valid in this case. If it is not, what kind of alternative model
> can I try? Can someone give some suggestions?
> A related problem arises when I calculate the explained variance from the
> model:
> VarF <- var(as.vector(fixef(mod) %*% t(mod@pp$X)))
> VarF/(VarF + VarCorr(mod)$respondentID + (pi^2)/3)
> The variance of the fixed effects (VarF) from the model is only 1.6, while
> the variance of the random effect (VarCorr(mod)$respondentID) is 149.
> Due to the non-normal distribution, the variance of the random effect is
> very large compared to that of the fixed effects. Does this imply that the
> model performs badly? Or should I compute the conditional R squared?
> To summarize, my questions are:
> 1) How are the estimates of the fixed effects and their explained
> variance (R squared) affected when the random effect does not follow a
> normal distribution? If the effect is large, any suggestions to solve it?
> 2) In a more general sense, how should one interpret a model in which a
> large amount of variation comes from the random effects?
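
For reference, a sketch of the arithmetic behind the quoted formula, using only the two variances reported in the post (1.6 and 149). It assumes the usual latent-scale convention for a logit link, where the residual variance is taken as pi^2/3. The conditional version shown here is the common Nakagawa and Schielzeth style definition, not something computed in the post, and all variable names are illustrative.

```r
var_fixed  <- 1.6       # VarF as reported in the post
var_random <- 149       # random-intercept variance as reported in the post
var_resid  <- pi^2 / 3  # latent-scale residual variance assumed for the logit link

total <- var_fixed + var_random + var_resid

# Marginal R squared: fixed effects only (the quantity computed in the post)
r2_marginal <- var_fixed / total

# Conditional R squared: fixed effects plus random intercept
r2_conditional <- (var_fixed + var_random) / total

round(c(marginal = r2_marginal, conditional = r2_conditional), 4)
# marginal is about 0.0104, conditional is about 0.9786
```

Both numbers are driven almost entirely by the random-intercept variance of 149, so they say more about that estimate than about how well the fixed effects perform.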