[R-sig-ME] Re: Non-normal random effect in glmm

Thu Oct 13 10:36:14 CEST 2016

Dear Chun,

Have a look at the subjects with high random intercepts. They are likely
subjects with all positive outcomes. The high random intercepts are the
result of complete separation.
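
A minimal sketch of that check, assuming lme4 is installed. The toy data, the object names (`toy`, `fit`, `by_id`) and the simulation settings are hypothetical stand-ins for the poster's `dat` and `mod`; only the idea of lining up each subject's random intercept against its observed outcomes comes from the advice above.

```r
library(lme4)

set.seed(1)
# Hypothetical toy data: rare binary outcome, 1 to 4 observations per subject
n_id <- 300
id   <- rep(seq_len(n_id), times = sample(1:4, n_id, replace = TRUE))
u    <- rnorm(n_id, sd = 2)                 # true subject-level intercepts
age  <- rnorm(length(id))
eta  <- -3.5 + 0.3 * age + u[id]
toy  <- data.frame(respondentID = factor(id),
                   AGE          = age,
                   response     = rbinom(length(id), 1, plogis(eta)))

fit <- glmer(response ~ AGE + (1 | respondentID),
             family = binomial, data = toy)

# Conditional modes of the random intercepts, one row per subject
re <- ranef(fit)$respondentID

# Line each intercept up with that subject's number of observations and events
by_id <- data.frame(
  respondentID = rownames(re),
  intercept    = re[["(Intercept)"]],
  n_obs        = as.vector(table(toy$respondentID)[rownames(re)]),
  n_events     = as.vector(tapply(toy$response, toy$respondentID, sum)[rownames(re)])
)
by_id$all_positive <- by_id$n_events == by_id$n_obs

# Subjects with the largest intercepts, and the mean intercept per group
print(head(by_id[order(-by_id$intercept), ], 10))
print(aggregate(intercept ~ all_positive, data = by_id, FUN = mean))
```

With the real model, replacing `fit` by `mod` and `toy` by `dat` should show whether the subjects with intercepts around 10 are the ones whose outcomes are all positive.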

I don't bother with calculating the proportion of variance explained in
the case of generalised linear models. It is something like R²: a nice and
simple property of _linear_ models, thanks to the Gaussian distribution, where
mean and variance are independent. It is much harder to define for
distributions where mean and variance are linked.
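
As a small illustration of that link, here is a base R sketch comparing a Bernoulli outcome, whose variance is p(1 - p), with a Gaussian outcome whose variance is a free parameter. The probabilities and the value of `sigma` are arbitrary choices for illustration, not values from the model discussed here.

```r
# Bernoulli outcome: the variance is a function of the mean, Var(y) = p * (1 - p)
p <- c(0.01, 0.04, 0.25, 0.50)
bern_var <- p * (1 - p)

# Gaussian outcome: the variance is a separate parameter (hypothetical sigma),
# so it stays the same whatever the mean is
sigma <- 2
gauss_var <- rep(sigma^2, length(p))

print(data.frame(mean = p, bernoulli_var = bern_var, gaussian_var = gauss_var))
```

Because the observation-level variance moves with the fitted probability, there is no single residual variance to put in the denominator of an R²-type ratio on the response scale.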

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-10-12 16:30 GMT+02:00 Chen Chun <talischen op hotmail.com>:

> Dear all,
>
>
> I am fitting a mixed model with a binomial distribution to a very large
> data set (around 400000 samples) with a binary outcome (very few events,
> around 4%). Some respondents, but not all, are measured repeatedly over the
> years, which is why a mixed model is applied. The model can be written as:
>
> mod <- glmer(response ~ AGE + SEX...+ YEAR + (1 | respondentID),
> family=binomial, data=dat)
>
> The distribution of the random effect (ID) from the model output is
> obviously non-normal: a large proportion of values close to zero and
> very few large values around 10. I am wondering whether the glmm is still
> valid in this case. If it is not, what kind of alternative model
> can I try? Can someone give some suggestions?
>
> A related problem arises when I calculate the explained variance from the
> model:
> VarF <- var(as.vector(fixef(mod) %*% t(mod@pp$X)))
> VarF/(VarF + VarCorr(mod)$respondentID[1] + (pi^2)/3)
>
> The variance of the fixed effects (VarF) from the model is only 1.6, while
> the variance of the random effect (VarCorr(mod)$respondentID[1]) is 149.
> Due to the non-normal distribution, the variance of the random effect is
> very large compared to that of the fixed effects. Does this imply that the
> model performs badly? Or should I compute the conditional R squared?
>
> To summarize, my questions are:
>
> 1) What is the influence on the estimation of the fixed effects and their
> explained variance (R squared) when the random effect does not follow a
> normal distribution? If the influence is large, any suggestions to solve it?
>
> 2) In a more general sense, how should one interpret a model where a large
> amount of variation comes from the random effects?
>
> Thanks
>
> Regards,
> Chun
