[R-sig-ME] glmer: order of variables drastically influences SE

Guillaume Adeux gu|||@ume@|mon@@2 @end|ng |rom gm@||@com
Mon Aug 5 10:57:06 CEST 2019


Hello everyone,

Could anyone explain to me how the order of variables in the glmer function
could affect the standard error of the estimates? The difference is drastic
in my case (see the two model outputs at the bottom of the email)...
The data I used comes from a 3 way factorial experiment (tillage,
N=nitrogen, CC=cover crop species) on which an additional covariate was
measured (dry_bio_cover_m2).
My main objective is to identify the drivers of weed biomass in different
cover crop.
I came across the situation doing the following:

*1. fitting full model*
mod_full_CC=glmer(dry_bio_weeds_m2+0.001~block+year+scale(dry_bio_cover_m2)*tillage*N*CC+(1|block:tillage)+(1|block:tillage:N)+(1|block:tillage:N:CC)+(1|block:year)+(1|block:year:tillage)+(1|block:year:tillage:N)+(1|block:year:tillage:N:CC),family=gaussian(link="sqrt"),control=glmerControl(optimizer="nloptwrap",optCtrl=list(algorithm="NLOPT_LN_NELDERMEAD")),data=biomassCC_wo_C)

*2. identifying the most parcimonious model*
dred_CC=dredge(mod_full_CC,rank="AICc",fixed=c("block","year"))

The most parcimonious model resulting from dredge (i.e. get.models(dred_CC,
1)[[1]] ) was:
dry_bio_weeds_m2 + 0.001 ~ *CC + N + scale(dry_bio_cover_m2) +  tillage +
block + year *+
    (1 | block:tillage) + (1 | block:tillage:N) + (1 | block:tillage:N:CC)
+ (1 | block:year) + (1 | block:year:tillage) + (1 | block:year:tillage:N)
+ (1 | block:year:tillage:N:CC) +
   * CC:N + CC:scale(dry_bio_cover_m2) + CC:tillage +
scale(dry_bio_cover_m2):tillage*

If I refit this model in a different (more logical) order with the same
variables, as followed,
dry_bio_weeds_m2 + 0.001 ~* block + year + scale(dry_bio_cover_m2) +
tillage + N + CC + CC:N + CC:tillage + CC:scale(dry_bio_cover_m2) +
tillage:scale(dry_bio_cover_m2)* +
(1 | block:tillage) + (1 | block:tillage:N) + (1 | block:tillage:N:CC) + (1
| block:year) + (1 | block:year:tillage) + (1 | block:year:tillage:N) + (1
|  block:year:tillage:N:CC)
I obtain similar estimates but drastically different SE, which of course
impacts post-hoc analysis.


The condition number of both models is also slightly different (12.1 for
the model with the variables in the order returned by dredge and 14.3 for
the model with the "logical" order), although no severe collinearity was
detected.



Model with variables fitted in the order given by dredge:

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: gaussian  ( sqrt )
Formula: dry_bio_weeds_m2 + 0.001 ~ CC + N + scale(dry_bio_cover_m2) +
     tillage + block + year + (1 | block:tillage) + (1 |
block:tillage:N) +
    (1 | block:tillage:N:CC) + (1 | block:year) + (1 |
block:year:tillage) +      (1 | block:year:tillage:N) + (1 |
block:year:tillage:N:CC) +
    CC:N + CC:scale(dry_bio_cover_m2) + CC:tillage +
scale(dry_bio_cover_m2):tillage
   Data: biomassCC_wo_C
Control: glmerControl(optimizer = "nloptwrap", optCtrl =
list(algorithm = "NLOPT_LN_NELDERMEAD"))

     AIC      BIC   logLik deviance df.resid
  6294.5   6416.4  -3116.2   6232.5      347

Scaled residuals:
    Min      1Q  Median      3Q     Max
-4.7837 -0.5006 -0.0011  0.5075  4.7593

Random effects:
 Groups                  Name        Variance  Std.Dev.
 block:year:tillage:N:CC (Intercept) 3.061e+04 1.750e+02
 block:tillage:N:CC      (Intercept) 2.162e-05 4.650e-03
 block:year:tillage:N    (Intercept) 2.645e-05 5.143e-03
 block:tillage:N         (Intercept) 2.125e+01 4.610e+00
 block:year:tillage      (Intercept) 4.656e-03 6.823e-02
 block:year              (Intercept) 3.848e+02 1.962e+01
 block:tillage           (Intercept) 4.204e-04 2.050e-02
 Residual                            4.505e+03 6.712e+01
Number of obs: 378, groups:
block:year:tillage:N:CC, 192; block:tillage:N:CC, 96;
block:year:tillage:N, 64; block:tillage:N, 32; block:year:tillage, 16;
block:year, 8; block:tillage, 8

Fixed effects:
                                   Estimate Std. Error   t value Pr(>|z|)
(Intercept)                       13.829033   0.001353 10219.154  < 2e-16 ***
CCVv                              -0.064097   0.001353   -47.367  < 2e-16 ***
CCTs                               0.059189   0.001353    43.739  < 2e-16 ***
N.L                                5.506568   0.001310  4202.646  < 2e-16 ***
N.Q                               -0.387064   0.001310  -295.407  < 2e-16 ***
N.C                               -0.123713   0.001310   -94.417  < 2e-16 ***
scale(dry_bio_cover_m2)           -3.096732   0.001309 -2366.243  < 2e-16 ***
tillageRT                         -0.034498   0.001310   -26.329  < 2e-16 ***
blockR2                           -1.369223   0.464429    -2.948   0.0032 **
blockR3                           -0.897212   0.001353  -663.006  < 2e-16 ***
blockR4                            0.536253   0.464366     1.155   0.2482
year2014                           0.484093   0.350696     1.380   0.1675
CCVv:N.L                          -3.795768   0.652249    -5.820 5.90e-09 ***
CCTs:N.L                          -3.760854   0.652139    -5.767 8.07e-09 ***
CCVv:N.Q                          -0.630917   0.652215    -0.967   0.3334
CCTs:N.Q                          -0.258428   0.652139    -0.396   0.6919
CCVv:N.C                           0.857098   0.652113     1.314   0.1887
CCTs:N.C                           0.273846   0.652088     0.420   0.6745
CCVv:scale(dry_bio_cover_m2)      -2.834524   0.001310 -2163.350  < 2e-16 ***
CCTs:scale(dry_bio_cover_m2)      -2.564852   0.001401 -1831.105  < 2e-16 ***
CCVv:tillageRT                     4.446872   0.001401  3174.709  < 2e-16 ***
CCTs:tillageRT                     2.758362   0.001401  1969.252  < 2e-16 ***
scale(dry_bio_cover_m2):tillageRT  2.022118   0.001310  1543.318  < 2e-16 ***








Model with variables fitted in "logical" order:

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: gaussian  ( sqrt )
Formula: dry_bio_weeds_m2 + 0.001 ~ block + year + scale(dry_bio_cover_m2) +
    tillage + N + CC + CC:N + CC:tillage + CC:scale(dry_bio_cover_m2)
+      tillage:scale(dry_bio_cover_m2) + (1 | block:tillage) + (1 |
    block:tillage:N) + (1 | block:tillage:N:CC) + (1 | block:year) +
   (1 | block:year:tillage) + (1 | block:year:tillage:N) + (1 |
    block:year:tillage:N:CC)
   Data: biomassCC_wo_C
Control: glmerControl(optimizer = "nloptwrap", optCtrl =
list(algorithm = "NLOPT_LN_NELDERMEAD"))

     AIC      BIC   logLik deviance df.resid
  6294.5   6416.4  -3116.2   6232.5      347

Scaled residuals:
    Min      1Q  Median      3Q     Max
-4.7838 -0.5009 -0.0011  0.5075  4.7594

Random effects:
 Groups                  Name        Variance  Std.Dev.
 block:year:tillage:N:CC (Intercept) 3.061e+04 1.749e+02
 block:tillage:N:CC      (Intercept) 2.141e-05 4.627e-03
 block:year:tillage:N    (Intercept) 2.605e-05 5.104e-03
 block:tillage:N         (Intercept) 2.106e+01 4.589e+00
 block:year:tillage      (Intercept) 4.487e-03 6.699e-02
 block:year              (Intercept) 3.822e+02 1.955e+01
 block:tillage           (Intercept) 4.228e-04 2.056e-02
 Residual                            4.504e+03 6.711e+01
Number of obs: 378, groups:
block:year:tillage:N:CC, 192; block:tillage:N:CC, 96;
block:year:tillage:N, 64; block:tillage:N, 32; block:year:tillage, 16;
block:year, 8; block:tillage, 8

Fixed effects:
                                  Estimate Std. Error t value Pr(>|z|)
(Intercept)                       13.82995   42.86352   0.323  0.74696
blockR2                           -1.36672   40.78008  -0.034  0.97326
blockR3                           -0.89843   40.78048  -0.022  0.98242
blockR4                            0.53355   40.77879   0.013  0.98956
year2014                           0.48290   28.79334   0.017  0.98662
scale(dry_bio_cover_m2)           -3.09673    0.81313  -3.808  0.00014 ***
tillageRT                         -0.03537   43.77607  -0.001  0.99936
N.L                                5.50829   43.77552   0.126  0.89987
N.Q                               -0.38899   43.77300  -0.009  0.99291
N.C                               -0.12513   43.77337  -0.003  0.99772
CCVv                              -0.06447   43.74925  -0.001  0.99882
CCTs                               0.05444   43.75387   0.001  0.99901
N.L:CCVv                          -3.79754   61.86354  -0.061  0.95105
N.Q:CCVv                          -0.62820   61.86055  -0.010  0.99190
N.C:CCVv                           0.86704   61.85967   0.014  0.98882
N.L:CCTs                          -3.76680   61.86056  -0.061  0.95145
N.Q:CCTs                          -0.25653   61.85857  -0.004  0.99669
N.C:CCTs                           0.27781   61.85861   0.004  0.99642
tillageRT:CCVv                     4.44671   61.86316   0.072  0.94270
tillageRT:CCTs                     2.75968   61.86792   0.045  0.96442
scale(dry_bio_cover_m2):CCVv      -2.83780    0.93223  -3.044  0.00233 **
scale(dry_bio_cover_m2):CCTs      -2.56887    1.32152  -1.944  0.05191 .
scale(dry_bio_cover_m2):tillageRT  2.02139    0.85983   2.351  0.01873 *


I can share the data if need be. Thank you very much for your help.

Sincerely,

Guillaume ADEUX

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list