[R-sig-ME] Random effect 'overlap' with fixed effect

Thierry Onkelinx thierry.onkelinx at inbo.be
Wed Jan 18 13:30:10 CET 2017


Dear Chung-Huey,

You have 6 x 4 observations per site - year combination. That is sufficient
to use 1|site/year as random effect.
Site is an essential part of the design of your study. Therefore it should
be in the model. The site effect will take up information which is not
explained by the altitude. So if the altitude effect is smaller in
comparison to the site effect, then the site-to-site variation is more
important than the altitude effect.

Note that lots of zero's does not imply zero inflation see
https://rpubs.com/INBOstats/zeroinflation

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2017-01-18 5:59 GMT+01:00 Chung-Huey Wu <chung.huey.wu op gmail.com>:

> Hello all,
>
> This is my first post in r-sig-ME. I have benefited and learned a lot from
> the threads here. In this email I would like to ask for comments on the
> correct way to specify my random effect term (sampling site), which
> 'overlap' with the fixed effect of main interest (altitude).
>
> I am now using nlme, lme4, and glmmADMB to analyze altitudinal survey data
> of plant traits.
>
> Below is the structure of the data I am now have trouble analyzing:
> =======================================
> a) Research design: In every Jan, Apr, Jul, Oct from Jul-2012 to Oct-2015,
> we visited the 3 sites at low altitude and 3 sites at medium altitude,
> haphazardly selected 6 individual plants of the focal species, and measured
> traits and # herbivore for each individual.
>
> b) Hypothesis to test:  Does the trait/# herbivore vary across altitude and
> month? Are there interactions?
>
> c) Notations:
> alt (altitude): L (low) vs. M (medium); a 2-level factor.
> m (month): Jan, Apr, Jul, Oct; a 4-level factor.
> site (sampling site): 3 at low- and 3 at medium-altitude Taiwan; a 6-level
> factor with unique location names.
> y (year of survey): 2012-2015; a 4-level factor.
> --------
> C_count (# chewing herbivore):  discrete (count) response variable. Many
> zeros, mostly 1-3, a few large values (10,17,22).
> mT (physical leaf toughness): continuous response variable.
>
>
> d) My model structure and outputs:
> mod_T = lme(mT ~ m + alt + m:alt, random = ~ 1|site/y,
>                  weights = varIdent(form = ~1|alt),
>                  contrasts = c(list(m=contr.sum,alt=contr.sum)),
>                  na.action = "na.omit", data = dat))
> #### weights model form (1|alt) is AIC-selected to control
> for heteroskedasticity.
>
> Linear mixed-effects model fit by REML
>  Data: dat
>        AIC      BIC    logLik
>   3533.131 3591.853 -1752.565
>
> Random effects:
> Formula: ~1 | site
>              (Intercept)
> StdDev:    5.603294
>
> Formula: ~1 | y %in% site
>                (Intercept)  Residual
> StdDev:    4.142521  9.003203
>
> Variance function:
>  Structure: Different standard deviations per stratum
>  Formula: ~1 | m
>  Parameter estimates:
>      July       Oct       Jan       Apr
> 1.0000000 0.7836473 0.7696658 1.0022467
>
> Fixed effects: mT ~ m + alt + m:alt
>                   Value     Std.Error  DF    t-value      p-value
> (Intercept) 39.38305  3.590926  468 10.967380  0.0000
> mApr         7.57426   1.548235  468  4.892190   0.0000
> mJuly        8.44167   1.457696  468  5.791103   0.0000
> mOct         4.19561   1.300168  468  3.226973   0.0013
> altM          -2.72082   5.093589   4   -0.534166   0.6215
> mApr:altM   -2.31793  2.224683 468 -1.041916   0.2980
> mJuly:altM  -3.14538  2.098787 468 -1.498666   0.1346
> mOct:altM   -3.09390  1.880431 468 -1.645314   0.1006
>  Correlation:
>            (Intr) mApr   mJuly  mOct   altM   mApr:M mJly:M
> mApr       -0.160
> mJuly      -0.191  0.394
> mOct       -0.214  0.442  0.527
> altM       -0.705  0.113  0.135  0.151
> mApr:altM   0.111 -0.696 -0.274 -0.307 -0.171
> mJuly:altM  0.133 -0.274 -0.695 -0.366 -0.201  0.414
> mOct:altM   0.148 -0.305 -0.364 -0.691 -0.225  0.462  0.546
>
> Standardized Within-Group Residuals:
>         Min          Q1         Med          Q3         Max
> -2.49626272 -0.70206429 -0.03138199  0.59483874  3.00467177
>
> Number of Observations: 498
> Number of Groups:
>        site y %in% site
>           6          24
>
> ------------------------------------------------------------
> ------------------------------
> mod_C = glmmadmb(C_count ~ m + alt + m:alt + (1|site/y),
>                     family = "nbinom",
>                     zeroInflation = TRUE, data= dat)
> #### AIC(nbinom model) << AIC(poisson model), so use nbinom
>
> AIC: 686.6
>
> Coefficients:
>                    Estimate Std. Error  z value  Pr(>|z|)
> (Intercept)  -0.70537    0.47722   -1.48    0.1394
> mApr          0.00304    0.63689    0.00    0.9962
> mJuly         -1.24819    0.63544   -1.96   0.0495 *
> mOct         -0.36723    0.60261   -0.61    0.5423
> altM           -1.63873    0.73215   -2.24   0.0252 *
> mApr:altM   0.81522    0.94844    0.86    0.3900
> mJuly:altM  3.21364    0.88936    3.61    0.0003 ***
> MonthOct:AltM   1.21148    0.89989    1.35    0.1782
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Number of observations: total=499, Site=6, Site:Year=24
> Random effect variance(s):
> Group=Site
>             Variance StdDev
> (Intercept)  0.03713 0.1927
> Group=Site:Year
>             Variance StdDev
> (Intercept)   0.2289 0.4785
>
> Negative binomial dispersion parameter: 0.17887 (std. err.: 0.036541)
> Zero-inflation: 1.0132e-06  (std. err.:  0.00013482 )
>
> Log-likelihood: -331.319
> Warning message:
> In .local(x, sigma, ...) :
>   'sigma' and 'rdig' arguments are present for compatibility only: ignored
> ================================================
>
> My Questions:
> 1) Is it reasonable to include (1|site/y)? Since site effect is mixed up
> with altitude effect (my main focus), I am worried that adding 'site' as
> random effect would disturb the estimation of altitude effect (which seems
> strong based on plotted data), or even cause estimation problem?  (for
> example, in summary(mod_T), the DF of the 'altM' term is so small (4)
> compared to models without site random effect (~480, attached below for
> your reference)).
> 2) One step back, are lme() (for controlling heteroskedasticity) and
> glmmadmb() (for accounting for zero-inflation) the right choices for this
> analysis?
>
>
> I would appreciate your comments and help.
> Thank very much!
>
> Chung-Huey Wu
>
>
> ####################################################
> [WITHOUT site random effect]
> mod_T = lme(mT ~ m + alt + m:alt, random = ~ 1|y,
>                  weights = varIdent(form = ~1|alt),
>                  contrasts = c(list(m=contr.sum,alt=contr.sum)),
>                  na.action = "na.omit", data = dat))
>
> Linear mixed-effects model fit by REML
>  Data: dat
>        AIC      BIC   logLik
>   3672.361 3726.888 -1823.18
>
> Random effects:
>  Formula: ~1 | y
>         (Intercept) Residual
> StdDev:    3.515026 9.344131
>
> Variance function:
>  Structure: Different standard deviations per stratum
>  Formula: ~1 | m
>  Parameter estimates:
>      July       Oct       Jan       Apr
> 1.0000000 1.0896624 0.9214424 1.0804838
> Fixed effects: mT ~ m + alt + m:alt
>                Value Std.Error  DF   t-value p-value
> (Intercept) 39.65193  2.135613 487 18.567007  0.0000
> mApr         7.57426  1.805682 487  4.194680  0.0000
> mJuly        8.17279  1.638495 487  4.987985  0.0000
> mOct         3.92672  1.706423 487  2.301144  0.0218
> altM        -3.48711  1.709890 487 -2.039375  0.0420
> mApr:altM   -1.32758  2.588246 487 -0.512928  0.6082
> mJuly:altM  -2.37910  2.312807 487 -1.028662  0.3041
> mOct:altM   -2.32762  2.409046 487 -0.966198  0.3344
>  Correlation:
>            (Intr) mApr   mJuly  mOct   altM   mApr:M mJly:M
> mApr       -0.356
> mJuly      -0.421  0.464
> mOct       -0.404  0.446  0.526
> altM       -0.376  0.445  0.490  0.471
> mApr:altM   0.248 -0.698 -0.324 -0.311 -0.661
> mJuly:altM  0.278 -0.329 -0.682 -0.348 -0.739  0.488
> mOct:altM   0.267 -0.316 -0.348 -0.684 -0.710  0.469  0.525
>
> Standardized Within-Group Residuals:
>        Min         Q1        Med         Q3        Max
> -2.2019475 -0.7909630 -0.0548838  0.7254892  3.0912138
>
> Number of Observations: 498
> Number of Groups: 4
> ####################################################
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models op r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list