[R-sig-ME] Model building problem?

Andrew Dolman andydolman at gmail.com
Sat Mar 13 23:58:13 CET 2010


Dear Luciano,

If you're going to use AIC to judge when to stop dropping terms, then
you should probably also use AIC, rather than p-values, to decide
which terms to drop. This means fitting a lot of models, but have you
looked at the function step()? It won't necessarily get you a nice
answer, but it does automate the process.
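
A minimal sketch of choosing which term to drop by AIC rather than by
p-value, using drop1() on an lmer fit. It assumes the lme4 package is
installed. The data are simulated and only borrow variable names from
the post. The model is fitted by maximum likelihood (REML = FALSE),
since AIC from REML fits is not generally comparable between models
with different fixed effects.

```r
library(lme4)

# Simulated data (NOT the real data set): 60 nests, 3 chicks per nest
set.seed(1)
n_nest <- 60
d <- data.frame(NestID = factor(rep(seq_len(n_nest), each = 3)))
d$HatchOrder <- factor(rep(c("First", "Second", "Third"), times = n_nest))
d$EggBreadth <- rnorm(nrow(d), mean = 50, sd = 2)
d$EggLength  <- rnorm(nrow(d), mean = 70, sd = 3)

# Hypothetical truth: a small hatching-order effect plus a nest effect
nest_effect <- rnorm(n_nest, mean = 0, sd = 0.03)
d$ELISA2 <- 0.5 + 0.03 * (d$HatchOrder == "Third") +
  nest_effect[as.integer(d$NestID)] + rnorm(nrow(d), mean = 0, sd = 0.05)

# Fit by ML so that AIC can be compared across fixed-effect structures
full <- lmer(ELISA2 ~ EggBreadth + EggLength + HatchOrder + (1 | NestID),
             data = d, REML = FALSE)

# AIC of the current model and of each model with one term removed
tab <- drop1(full)
print(tab)
```

The term whose removal gives the lowest AIC is the candidate to drop
next; if no removal lowers AIC, the reduction stops.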

Having said this, model selection is a mighty can of worms and
stepwise model building has some particularly juicy ones. If you have
time, have a read of Burnham and Anderson, Model Selection and
Multi-Model Inference.

Better than stepwise elimination is to choose a set of sensible
candidate models and fit them at the same time (including the most
basic "null" model, which here looks like the intercept only model).
Compare their AIC values, specifically the difference between each
model's AIC and the lowest AIC in the set; these are the delta-AIC
values. If several models all have low and similar delta-AIC values
(less than, say, 2) then you can't really choose between them. Maybe
multi-model inference can help then.
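
As a rough illustration of the delta-AIC arithmetic, here is a base-R
sketch using the seven AIC values from the output quoted below. The
model labels are made up for this example. All of those fits were made
by REML, and REML-based AIC is not generally comparable between models
with different fixed effects, so treat this as arithmetic only;
refitting with REML = FALSE would be the usual precaution.

```r
# AIC values copied from the quoted lmer output; the names are
# hypothetical labels, not anything lmer produces.
aic <- c(full      = -544.1,
         backward1 = -556.4,
         backward2 = -564.1,
         backward3 = -561.2,
         backward4 = -577.4,
         backward5 = -586.5,
         backward6 = -591.6)

# delta-AIC: each model's AIC minus the lowest AIC in the set
delta <- round(aic - min(aic), 1)

# Akaike weights (Burnham and Anderson): relative support within this set
weight <- exp(-delta / 2) / sum(exp(-delta / 2))

print(data.frame(AIC = aic, dAIC = delta, weight = round(weight, 3)))

# Models within about 2 AIC units of the best one
close_models <- names(delta)[delta < 2]
print(close_models)
```

With these numbers, the next-best model (backward5) sits 5.1 units
above the minimum, so only the intercept-only model falls inside the
"less than 2" band.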


One further thought: how much collinearity do you have between your
predictors? If they are correlated with each other then stepwise
model selection is always going to struggle.
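
A quick way to check is to look at the pairwise correlations and a
variance inflation factor. The sketch below uses simulated egg
measurements and assumes, purely for illustration, that volume is
computed from length and breadth; the formula and its constant are
hypothetical. In the quoted output, the standard error of EggBreadth
falls from about 0.047 to about 0.0045 once EggLength is dropped,
which is the kind of pattern collinearity tends to produce.

```r
# Simulated egg measurements (NOT the real data): volume is computed
# from length and breadth, so the three predictors are strongly related.
set.seed(42)
n <- 191
EggLength  <- rnorm(n, mean = 70, sd = 3)
EggBreadth <- rnorm(n, mean = 50, sd = 2)
EggVolume  <- 0.0005 * EggLength * EggBreadth^2   # hypothetical formula

eggs <- data.frame(EggLength, EggBreadth, EggVolume)

# Pairwise correlations between the predictors
cor_mat <- cor(eggs)
print(round(cor_mat, 2))

# Variance inflation factor for EggVolume: 1 / (1 - R^2) from
# regressing it on the other two predictors
r2  <- summary(lm(EggVolume ~ EggLength + EggBreadth, data = eggs))$r.squared
vif <- 1 / (1 - r2)
print(vif)
```

A large VIF means the predictor carries little information that the
others do not already supply, so its coefficient and p-value will
swing depending on which other terms are in the model.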


Andy.





On 13 March 2010 23:07, Luciano La Sala <lucianolasala at yahoo.com.ar> wrote:
> Hello everyone,
>
> I am building a model using the “lmer” function. I have IgG (continuous) as my outcome of interest, and the following variables as fixed effects: Egg Breadth (continuous), Egg Length (continuous), EggVolume (continuous), Clutch Size (three levels), and Hatching Order (three levels), plus random intercepts for NestID.
>
> In model selection, terms were eliminated from a maximum model (with random intercept) to achieve a simpler model that retained only the significant main effects and interactions, using the Akaike information criterion.
>
> At each step of model reduction, I look at the p-values of coefficients and decide which variable to eliminate next, re-fit the model and then I compare AIC values to decide whether the new model is a better fit for my data or not.
>
> To my dismay, the best model is the one containing only the random intercept.
>
> Stepwise variable elimination reduces AIC (see output) despite low p-values for the coefficients of the variables dropped! I would think that at least some of my variables (not just the random effect) should improve the model fit. It strikes me as very odd that the model with only random intercepts offers the best fit, given that the random-effect variance is close to zero (see output).
>
> Q1. Should I stop simplifying my model at Step 2 or 3, where all main effects have p < 0.05?
>
> Q2. However, AIC keeps dropping thereafter, regardless of the significant p-values of the main effects, until no main effect is left in the model. This baffles me!
>
> Q3. Last but not least… where am I going so wrong here?
>
> Thank you very much for whatever help you may give me!
>
>
> Here goes a summary of the outputs:
>
> FULL MODEL
>
> Linear mixed model fit by REML
>
> Formula: ELISA2~EggBreadth+EggLength+ClutchSize+HatchOrder+ EggVolume+(1|NestID)
>
>    AIC    BIC logLik deviance REMLdev
>  -544.1 -511.6  282.1   -632.2  -564.1
>
> Random effects:
>  Groups   Name        Variance   Std.Dev.
>  NestID   (Intercept) 0.00016440 0.012822
>  Residual             0.00207281 0.045528
>
> Number of obs: 191, groups: NestID, 111
>
> Fixed effects:
>                      Estimate Std. Error t value  Pr(>|t|)
> (Intercept)           3.545249   2.268083   1.563  0.1198
> EggBreadth           -0.066974   0.046930  -1.427  0.1553
> EggLength            -0.017986   0.016281  -1.105  0.2707
> ClutchSizeTwo-eggs    0.009885   0.011652   0.848  0.3974
> ClutchSizeThree-eggs -0.014039   0.011518  -1.219  0.2245
> HatchOrderSecond      0.015605   0.008245   1.893  0.0600
> HatchOrderThird       0.032599   0.011763   2.771  0.0062
> EggVolume             0.019498   0.014616   1.334  0.1839
>
>
>
>
>
>
> BACKWARD 1. Drop Clutch Size
>
> Linear mixed model fit by REML
> Formula: ELISA2~EggBreadth+EggLength+HatchOrder+EggVolume+(1|NestID)
>
>    AIC    BIC logLik deviance REMLdev
>  -556.4 -530.4  286.2   -625.6  -572.4
>
> Random effects:
>  Groups   Name        Variance   Std.Dev.
>  NestID   (Intercept) 0.00017555 0.013250
>  Residual             0.00211661 0.046007
> Number of obs: 191, groups: NestID, 111
>
> Fixed effects:
>                  Estimate Std. Error t value   Pr(>|t|)
> (Intercept)       3.089050   2.281486   1.354   0.1774
> EggBreadth       -0.057337   0.047197  -1.215   0.2260
> EggLength        -0.013941   0.016351  -0.853   0.3950
> HatchOrderSecond  0.014215   0.007875   1.805   0.0727
> HatchOrderThird   0.021879   0.010740   2.037   0.0431
> EggVolume         0.015693   0.014661   1.070   0.2858
>
>
> BACKWARD 2. Drop EggLength
>
> Linear mixed model fit by REML
>
> Formula: ELISA2 ~ EggBreadth + HatchOrder + EggVolume + (1 | NestID)
>
>    AIC    BIC logLik deviance REMLdev
>  -564.1 -541.3  289.1   -624.8  -578.1
>
> Random effects:
>  Groups   Name        Variance   Std.Dev.
>  NestID   (Intercept) 0.00015766 0.012556
>  Residual             0.00212966 0.046148
>
> Number of obs: 191, groups: NestID, 111
>
> Fixed effects:
>                  Estimate Std. Error t value   Pr(>|t|)
> (Intercept)       1.148186   0.148751   7.719   0.0000
> EggBreadth       -0.017284   0.004517  -3.826   0.0002
> HatchOrderSecond  0.014918   0.007848   1.901   0.0588
> HatchOrderThird   0.022059   0.010734   2.055   0.0413
> EggVolume         0.003230   0.001148   2.813   0.0054
>
>
> BACKWARD 3. Drop EggBreadth
>
> Linear mixed model fit by REML
>
> Formula: ELISA2 ~ EggLength + HatchOrder + EggVolume + (1 | NestID)
>
>    AIC    BIC logLik deviance REMLdev
>  -561.2 -538.5  287.6   -624.1  -575.2
>
> Random effects:
>  Groups   Name        Variance   Std.Dev.
>  NestID   (Intercept) 0.00015423 0.012419
>  Residual             0.00214197 0.046281
> Number of obs: 191, groups: NestID, 111
>
> Fixed effects:
>                   Estimate Std. Error t value   Pr(>|t|)
> (Intercept)       0.3196987  0.0912062   3.505   0.0006
> EggLength         0.0058330  0.0015671   3.722   0.0003
> HatchOrderSecond  0.0149907  0.0078835   1.902   0.0588
> HatchOrderThird   0.0219364  0.0107628   2.038   0.0429
> EggVolume        -0.0020977  0.0007405  -2.833   0.0051
>
>
> BACKWARD 4. Drop HatchOrder
>
> Formula: ELISA2 ~ EggBreadth + EggVolume + (1 | NestID)
>
> Linear mixed model fit by REML
>    AIC    BIC logLik deviance REMLdev
>  -577.4 -561.1  293.7   -618.9  -587.4
>
> Random effects:
>  Groups   Name        Variance   Std.Dev.
>  NestID   (Intercept) 0.00010214 0.010106
>  Residual             0.00222943 0.047217
>
> Number of obs: 191, groups: NestID, 111
>
> Fixed effects:
>             Estimate Std. Error t value   Pr(>|t|)
> (Intercept)  1.084503   0.146243   7.416   0.0000
> EggBreadth  -0.014484   0.004371  -3.314   0.0011
> EggVolume    0.002409   0.001099   2.193   0.0295
>
>
> BACKWARD 5. Drop EggVolume
>
> Formula: ELISA2 ~ EggBreadth + (1 | NestID)
>
> Linear mixed model fit by REML
>    AIC    BIC logLik deviance REMLdev
>  -586.5 -573.5  297.2   -614.1  -594.5
> Random effects:
>  Groups   Name        Variance   Std.Dev.
>  NestID   (Intercept) 0.00017172 0.013104
>  Residual             0.00221031 0.047014
> Number of obs: 191, groups: NestID, 111
>
> Fixed effects:
>             Estimate Std. Error t value   Pr(>|t|)
> (Intercept)  0.884482   0.115833   7.636   0.0000
> EggBreadth  -0.006443   0.002401  -2.683   0.0079
>
>
> BACKWARD 6. Drop Egg Breadth
>
> Formula: ELISA2 ~ 1 + (1|NestID)
>
> Linear mixed model fit by REML
>    AIC    BIC logLik deviance REMLdev
>  -591.6 -581.8  298.8     -607  -597.6
> Random effects:
>  Groups   Name        Variance  Std.Dev.
>  NestID   (Intercept) 0.0001917 0.013846
>  Residual             0.0022692 0.047636
>
> Number of obs: 191, groups: NestID, 111
>
> Fixed effects:
>            Estimate Std. Error t value   Pr(>|t|)
> (Intercept) 0.573809   0.003727   153.9   0



