[R-sig-ME] mixed models with very few measurements by subject
Tim Richter-Heitmann
tr|chter @end|ng |rom un|-bremen@de
Fri Nov 8 17:03:33 CET 2019
Dear list,
sorry for bothering.
I was presented with this type of data:
Abundance = my response variable, 300 observations per species, no NAs
habitat_type = a fixed effect, a factor with 9 levels.
sample_location = a random effect, a factor with 150 levels. I assume
there is enough unmeasured variability to warrant treating it as a random factor.
landscape = another random factor with three levels, within which
sample_location is nested.
Note that not every sample_location or landscape contains all levels
of habitat_type.
Every sample_location was measured twice, one year apart. In principle,
the year can be coded as a factor as well, to account for temporal
variability. Initial analysis showed there is very little temporal
variability.
But then I am left with only one observation per location, and I read
(https://stats.stackexchange.com/questions/242821/how-will-random-effects-with-only-1-observation-affect-a-generalized-linear-mixe)
that in this case residual errors and random effects may be confounded.
Landscape has 50 observations, but only three groups, which I think is
also not a wise option, as per
https://stats.stackexchange.com/questions/37647/what-is-the-minimum-recommended-number-of-groups-for-a-random-effects-factor.
I am interested in Abundance ~ habitat_type and whether there are
differences in mean abundance. I first ignored the existence of
sample_location entirely:
mod <- aov(Abundance ~ habitat_type, data = D)
res <- glht(mod, linfct = mcp(habitat_type = "Tukey"), vcov = vcovHC)
And then I compared this to
amod <- lme(fixed = Abundance ~ habitat_type, data = D,
            random = ~ 1 | sample_location, method = "ML")
means <- emmeans(amod, ~ habitat_type)
There are very few differences between the two approaches. I also
ignored landscape at this stage.
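A minimal sketch of this comparison on simulated data may make the setup concrete. Everything in it is an assumption rather than the real data: the effect sizes, the landscape labels A/B/C, the habitat labels H1..H9, and in particular the choice to give each location a single habitat_type. It uses lm() in place of aov() and only needs nlme.

```r
library(nlme)  # recommended package, normally installed with R

set.seed(1)
n_loc <- 150

# One row per location: 3 landscapes x 50 locations, one habitat type each
# (assumed layout; labels and effect sizes are made up for illustration).
loc <- data.frame(
  sample_location = factor(seq_len(n_loc)),
  landscape       = factor(rep(c("A", "B", "C"), each = n_loc / 3)),
  habitat_type    = factor(sample(paste0("H", 1:9), n_loc, replace = TRUE))
)

# Two visits per location, one year apart -> 300 rows
D <- rbind(cbind(loc, year = "1"), cbind(loc, year = "2"))
D$year <- factor(D$year)

hab_eff <- seq(0, 2, length.out = 9)   # simulated habitat means
loc_eff <- rnorm(n_loc, sd = 1)        # simulated location-level deviations
D$Abundance <- 10 +
  hab_eff[as.integer(D$habitat_type)] +
  loc_eff[as.integer(D$sample_location)] +
  rnorm(nrow(D), sd = 1)

# Approach 1: ignore sample_location
m_fixed <- lm(Abundance ~ habitat_type, data = D)

# Approach 2: random intercept per sample_location
m_mixed <- lme(Abundance ~ habitat_type, random = ~ 1 | sample_location,
               data = D, method = "ML")

# Coefficients side by side, then the estimated variance components
print(cbind(lm = coef(m_fixed), lme = fixef(m_mixed)))
print(VarCorr(m_mixed))
```

In a balanced layout like this simulated one, where habitat_type is constant within a location and every location has two rows, the two sets of point estimates should coincide; what changes is the standard errors, which can be compared with summary(m_fixed) and summary(m_mixed).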
My Questions:
1. Are sample_location (many subjects, few observations) and landscape
(few groups, many observations) suitable candidates to be modelled as
random effects?
2. Can their nestedness save me, and how would I code
Landscape:sample_location?
3. Would it be better to code the locations as coordinates and check for
different correlation structures in gls?
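For question 2, one way the nesting might be written in nlme is with the landscape/sample_location grouping syntax. The sketch below runs on simulated data and is not a recommendation on whether three landscapes are enough to estimate a variance. The landscape labels, effect sizes, and the one-habitat-per-location layout are assumptions made only for illustration.

```r
library(nlme)

set.seed(2)
n_loc <- 150

# Assumed layout: 3 landscapes x 50 locations, one habitat type per location
loc <- data.frame(
  sample_location = factor(seq_len(n_loc)),
  landscape       = factor(rep(c("A", "B", "C"), each = n_loc / 3)),
  habitat_type    = factor(sample(paste0("H", 1:9), n_loc, replace = TRUE))
)
D <- rbind(loc, loc)  # two visits per location -> 300 rows

land_eff <- c(A = -1.5, B = 0, C = 1.5)  # made-up landscape deviations
loc_eff  <- rnorm(n_loc, sd = 1)         # made-up location deviations
D$Abundance <- 10 +
  unname(land_eff[as.character(D$landscape)]) +
  loc_eff[as.integer(D$sample_location)] +
  rnorm(nrow(D), sd = 1)

# Random intercepts for landscape and for sample_location within landscape
m_nested <- lme(Abundance ~ habitat_type,
                random = ~ 1 | landscape/sample_location,
                data = D, method = "ML")

print(VarCorr(m_nested))       # variance components at both levels
print(m_nested$dims$ngrps)     # number of groups per grouping level
```

Because the sample_location labels here are unique across landscapes, the inner grouping level still has 150 groups; the nesting only adds a landscape-level intercept on top of the location-level one.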
Thank you for your kind advice!
--
Dr. Tim Richter-Heitmann
University of Bremen
Microbial Ecophysiology Group (AG Friedrich)
FB02 - Biologie/Chemie
Leobener Straße (NW2 A2130)
D-28359 Bremen
Tel.: 0049(0)421 218-63062
Fax: 0049(0)421 218-63069