[R-sig-ME] Can interaction term cause Estimates and Std. Errors to be too large?

ken at kjbeath.com.au
Mon Mar 30 00:51:59 CEST 2009


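This is the result of what is known as complete (or quasi-complete)
separation, where the model perfectly predicts the outcome for some
combination of the predictors. The maximum likelihood estimate for that
term then drifts towards plus or minus infinity and its standard error
becomes enormous. A related problem is the probably incorrect lack of
significance of the Wald tests, known as the Hauck-Donner effect. Usually
separation is a result of overfitting.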
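
A quick way to see this is to cross-tabulate the outcome against the two
factors and look for empty cells. The sketch below uses made-up counts (not
your data) and a plain glm() without the nest random effect, purely to
illustrate the symptom; the object names cells, failures, n, fit_main and
fit_int are my own assumptions.

```r
# Hypothetical toy counts (NOT the poster's data): hatching failures out of
# 20 chicks in each HatchOrder x Year cell. Second/2007 has zero failures.
cells <- data.frame(
  HatchOrder = factor(rep(c("First", "Second", "Third"), times = 2)),
  Year       = factor(rep(c("2007", "2006"), each = 3),
                      levels = c("2007", "2006")),  # 2007 as baseline
  failures   = c(2, 0, 6, 4, 5, 9),
  n          = 20
)

# Step 1: cross-tabulate failures. A zero cell means the interaction model
# can predict that cell perfectly (quasi-complete separation).
tab <- xtabs(failures ~ HatchOrder + Year, data = cells)
print(tab)

# Step 2: fit the main-effects and interaction models (fixed effects only).
fit_main <- glm(cbind(failures, n - failures) ~ HatchOrder + Year,
                family = binomial, data = cells)
# glm() may warn about fitted probabilities of 0 or 1 here; that warning
# is itself a symptom of separation, so it is only silenced for the demo.
fit_int <- suppressWarnings(
  glm(cbind(failures, n - failures) ~ HatchOrder * Year,
      family = binomial, data = cells)
)

# Wald standard errors: the terms involving the empty cell blow up.
se <- summary(fit_int)$coefficients[, "Std. Error"]
print(se)

# Step 3: compare the models with a likelihood ratio test, which does not
# rely on the unreliable Wald standard errors.
lrt <- anova(fit_main, fit_int, test = "Chisq")
print(lrt)
```

On the real data the analogous check would be something like
with(dat, table(HatchOrder, Year, Hatching)), where dat stands for whatever
your data frame is called.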

Ken

>
> Dear R-experts,
>
> I am running version 2.7.1 on Windows Vista. I have a small dataset, which
> consists of:
>
> # NestID: nest indicator for each chick. Siblings sharing the same nest
> have the same nest indicator.
>
> # Chick: chick indicator consisting of a unique ID for each single chick.
>
> # Year: 2006, 2007.
>
> # ClutchSize: 1, 2, or 3 eggs.
>
> # HO: hatching order within each clutch (1, 2, 3 [first, second and
> third-hatched chick]).
>
> In order to account for lack of independence at the nest level (many
> chicks are nested in nest...), I'd like to run a GLMM with random slopes
> and intercepts for nests.
>
> My approach to model building was as follows: Variables that had P ≤
> 0.20 on their own in an initial bivariate analysis were forced into the
> multivariable analysis. The general procedure for model selection involved
> starting from a maximum model based on the bivariate analyses and
> eliminating terms to achieve a simpler model that only retained the
> significant main effects and two-way interactions. The model was
> restricted by stepwise manual elimination of variables using the Akaike
> Information Criterion (AIC) as a measure of goodness-of-fit.
> Interactions were tested only between main effects which remained in the
> final model.
>
> My final model for hatching failure (without testing of interaction
> between main effects) is:
>
> model <- lmer(Hatching ~ HatchOrder + Year + (1|NestID), family=binomial,
> 1)
>
> I get the following output:
>
> best.model <- lmer(Hatching~HatchOrder+Year+(1|NestID), family=binomial,
> 1)
>
> Generalized linear mixed model fit by the Laplace approximation
> Formula: Hatching2 ~ HatchingOrder + Year1 + (1 | NestID)
>  Data: 1
>  AIC      BIC         logLik      deviance
>  167.8    185.3       -78.9        157.8
>
> Random effects:
> Groups Name              Variance     Std. Dev.
> NestID (Intercept)       1.9682        1.4029
>
> Number of obs: 247, groups: NestID, 120
>
> Fixed effects:
>              Estimate  Std. Error  z value  Pr(>|z|)
> (Intercept)   -5.4800      0.8329   -6.579  4.73e-11 ***
> HO_Second      1.6344      0.6841    2.389   0.01689 *
> HO_Third       3.3007      0.7162    4.609  4.05e-06 ***
> Year2006       2.1169      0.6741    3.140   0.00169 **
>
> So far, so good… but then I fit the same model incorporating interaction
> between the main effects as follows:
>
> interaction <-lmer(Hatching~HatchOrder+Year+HatchingOrder*Year+(1|NestID),
> family=binomial,1)
>
> And I get the following output:
>
> Data: 1
> AIC       BIC       logLik      deviance
> 157.8     182.3     -71.89      143.8
>
> Random effects:
> Groups Name              Variance         Std. Dev.
> NestID (Intercept)       155.22            12.459
>
> Number of obs: 247, groups: NestID, 120
>
> Fixed effects:
>                                 Estimate Std. Error z value Pr(>|z|)
> (Intercept)                     -13.6158     4.8287 -2.8198  0.00481 **
> HO_Second                       -23.1961 36249.1930 -0.0006  0.99949
> HO_Third                          5.6624     2.6823  2.1110  0.03477 *
> Year2006                         -0.9602     6.1245 -0.1568  0.87541
> HO_Second:Year2006               30.2249 36249.1931  0.0008  0.99933
> HO_Third:Year2006                10.5549     5.2232  2.0208  0.04331 *
>
>
> Correlation of Fixed Effects:
>             (Intr) HtchOS HtchOT Y12006 HOS:Y1
> HtchngOrdrS  0.000
> HtchngOrdrT -0.384  0.000
> Year12006   -0.788  0.000  0.303
> HtOS:Y12006  0.000 -1.000  0.000  0.000
> HtOT:Y12006  0.197  0.000 -0.514 -0.556  0.000
>
> Question 1: I am worried about the overly large values of the Estimate and
> Std. Error for "HO_Second" and "HO_Second*Year2006" from the second model
> (with interaction term included).
> So what may be causing such large values? Should I be concerned? If so,
> how can I solve the problem? Is this an over-fitting problem?
>
> Question 2: The Estimate for "Year2006" becomes negative in the second
> model. Any clue as to why this happens?
>
> Question 3: Should I stick with the simpler model 1, which does not assess
> the interaction?
>
> Thank you in advance for the help!
>
> Lucho
>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>



