[R-sig-ME] mixed models with very few measurements by subject

Sat Nov 9 13:23:05 CET 2019

Dear Tim,

Abundance is probably a count variable. If so, consider using a
distribution that handles count variables (e.g. Poisson, negative binomial,
...).
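
As a rough sketch of what that could look like (the data below are simulated, and D and its column names simply mirror the names used in the original question), a Poisson mixed model can be fitted with glmer() from lme4:

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the real data (an assumption, not the poster's data):
# 150 locations in 3 landscapes, each location measured twice.
n_loc <- 150
loc <- data.frame(
  sample_location = factor(seq_len(n_loc)),
  landscape       = factor(rep(c("A", "B", "C"), each = n_loc / 3)),
  habitat_type    = factor(rep_len(paste0("H", 1:9), n_loc))
)
loc_effect <- rnorm(n_loc, sd = 0.5)          # location-level random intercepts
D <- loc[rep(seq_len(n_loc), each = 2), ]     # two measurements per location
D$Abundance <- rpois(
  nrow(D),
  lambda = exp(1 + loc_effect[as.integer(D$sample_location)])
)

# Poisson GLMM with a random intercept per sample_location
m_pois <- glmer(Abundance ~ habitat_type + (1 | sample_location),
                data = D, family = poisson)
summary(m_pois)

# If the counts turn out to be overdispersed, lme4::glmer.nb() accepts the
# same formula and fits a negative binomial alternative.
```

Whether Poisson or negative binomial is the better choice depends on the overdispersion in the real counts, which the simulated data cannot tell you.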

I'll use lme4 notation.

1. You can use either landscape + (1|sample_location) or
just (1|sample_location). The main difference is that the first model fits
the common landscape effect via the landscape variable, whereas in the
second model those effects are absorbed by (1|sample_location) (given that
sample_location is nested in landscape).
2. Since you have only three landscape classes, I recommend keeping
landscape as a fixed effect.
3. You could fit a variogram to the BLUPs of sample_location to see whether
there is spatial autocorrelation. How much you gain by taking spatial
autocorrelation into account depends on its strength.
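
The three points could be sketched roughly as follows. Everything here is illustrative: the data are simulated, the coordinate columns x and y are hypothetical (the original question mentions coordinates but gives no names), and the semivariogram is computed by hand in base R rather than with a dedicated geostatistics package.

```r
library(lme4)

set.seed(42)
# Simulated stand-in data (an assumption): 150 locations, 3 landscapes,
# two measurements per location, hypothetical coordinates x and y.
n_loc <- 150
loc <- data.frame(
  sample_location = factor(seq_len(n_loc)),
  landscape       = factor(rep(c("A", "B", "C"), each = n_loc / 3)),
  habitat_type    = factor(rep_len(paste0("H", 1:9), n_loc)),
  x = runif(n_loc, 0, 100),
  y = runif(n_loc, 0, 100)
)
loc_effect  <- rnorm(n_loc, sd = 0.5)
land_effect <- c(A = 0, B = 0.3, C = -0.3)
D <- loc[rep(seq_len(n_loc), each = 2), ]
eta <- 1 + land_effect[as.character(D$landscape)] +
  loc_effect[as.integer(D$sample_location)]
D$Abundance <- rpois(nrow(D), exp(eta))

# Points 1 and 2: landscape as a fixed effect plus a location random intercept
m1 <- glmer(Abundance ~ habitat_type + landscape + (1 | sample_location),
            data = D, family = poisson)
# Alternative: the landscape differences end up in the location intercepts
m2 <- glmer(Abundance ~ habitat_type + (1 | sample_location),
            data = D, family = poisson)

# Point 3: extract the BLUPs (conditional modes) of sample_location ...
re <- ranef(m1)$sample_location
loc$blup <- re[as.character(loc$sample_location), "(Intercept)"]

# ... and compute an empirical semivariogram from all pairs of locations.
d <- as.vector(dist(loc[, c("x", "y")]))   # pairwise distances
g <- as.vector(dist(loc$blup))^2 / 2       # pairwise semivariances
bins <- cut(d, breaks = seq(0, 50, by = 5))
vg <- data.frame(
  distance     = as.vector(tapply(d, bins, mean)),
  semivariance = as.vector(tapply(g, bins, mean)),
  n_pairs      = as.vector(table(bins))
)
print(vg)
```

A semivariance that stays roughly flat across distance classes suggests little spatial autocorrelation in the BLUPs; a clear rise at short distances suggests it may be worth modelling.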

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be


On Fri, 8 Nov 2019 at 17:04, Tim Richter-Heitmann <
trichter using uni-bremen.de> wrote:

> Dear list,
>
> sorry for bothering.
>
> I was presented with this type of data:
>
> Abundance = my response variable, 300 observations per species, no NAs
>
> habitat_type = a fixed effect, a factor with 9 levels.
>
> sample_location = a random effect, a factor with 150 levels. I assume
> there is enough unmeasured variability to warrant this as a random factor.
>
> landscape = another random factor with three levels, within which
> sample_location is nested.
>
> Notice that not every sample_location or landscape contains all levels
> of habitat_type.
>
> Every sample_location was measured twice, with an interval of 1 year
> in between. In principle, this can be coded as a factor as well, to
> account for temporal variability. Initial analysis showed there is very
> little temporal variability.
>
> But then I am left with only one observation per location, and I was
> reading
> (https://stats.stackexchange.com/questions/242821/how-will-random-effects-with-only-1-observation-affect-a-generalized-linear-mixe)
> that this way residual errors and random effects may be confounded.
> Landscape has 50 observations, but only three groups, which I think is
> also not a wise option, as per
> https://stats.stackexchange.com/questions/37647/what-is-the-minimum-recommended-number-of-groups-for-a-random-effects-factor
> .
>
> I am interested in Abundance ~ habitat_type and if there are differences
> in abundance means. I first totally ignored the existence of
> sample_location:
>
> mod <- aov(Abundance ~ habitat_type, data = D); res <- glht(mod,
> linfct = mcp(habitat_type = "Tukey"), vcov = vcovHC)
>
> And then I compared this to
>
>   amod <- lme(fixed=Abundance~habitat_type, data = D, random =
> ~1|sample_location , method="ML") ;  means <- emmeans(amod, ~habitat_type)
>
> There are very few differences between the two approaches. I also
> ignored landscape at this level.
>
> My Questions:
>
> 1. Are sample_location (many subjects, few observations) and landscape
> (few groups, many observations) suitable candidates to be modelled as a
> random effect?
>
> 2. Can their nestedness save me, and how would I code
> Landscape:sample_location?
>
> 3. Would it be better to code the locations as coordinates and check for
> different correlation structures in gls?
>
>
> Thank you for your kind advice!
>
>
> --
> Dr. Tim Richter-Heitmann
>
> University of Bremen
> Microbial Ecophysiology Group (AG Friedrich)
> FB02 - Biologie/Chemie
> Leobener Straße (NW2 A2130)
> D-28359 Bremen
> Tel.: 0049(0)421 218-63062
> Fax: 0049(0)421 218-63069
>