[R-sig-ME] mixed effect model question

Fri Jan 31 23:29:22 CET 2014

Thijs vanden Bergh <bergh.thijsvanden at ...> writes:

> 
> Dear list,
> 
> I have a number of questions concerning some mixed effects model that i am
> trying to fit on my data.
> 
> Basically the design of my experiment is as follows: we have three sites,
> low, mid and high (L, M and H; factor is called *Site*).
> At each of these three sites we have measured evapotranspiration  rates (
> *ET*; ET was measured using lysimeters, individuals were coded as *id*) for
> a number of days during summer (in total there are 15 days worth of data at
> L and M and 30 at H; factor is called *Date*).
> ET was measured for three vegetation types at all sites 
> (let say vegetation
> types are called A, B and C at all sites and the factor 
> is called *Class*).
> We simulated grazing in these three vegetation types by clipping (applied
> to half of the lysimeters of all vegetation types; factor is called
> *Treatment*).
> 
> The main aim of the excercise is to come to understand effects of
> elevation, treatment and vegetation type. Of particular interest also is
> the interaction between site and treatment as it would idicate weather the
> treatment effect depends on elevation.
> 
> One main problem is that elevation is not replicated. In the ideal case,
> site would have been a third random effect with sites at low mid and high
> elevation being replicated across different elevational gradients. One
> solution proposed to me is to take site as a random effect and then compare
> models with and without it. This will tell me that site does contribute
> significantly to the varibility in ET but it does not say anything about
> the direction of the effect. I was therefore tempted to keep site as a
> fixed effect, and then in the discussion state that effects of site are
> potentially confounded in other non altitude related phenomena (to then
> argue why this is likely or unlikely).

  I agree that you should keep site as a fixed effect.  Using site
as a random effect will work poorly in any case, for technical reasons
(it's hard to get an accurate and unbiased estimate of the among-site
variance when there are a small number of sites measured).  Since this
is a fundamental limitation of the experimental design (site effects
are confounded with elevation), it's just something you have to live
with.

> I thus started my analyses using Site, Treatment and Class as fixed
> effects. Because ET rates vary a great deal between days due to differences
> in weather conditions and due to the progressing of the season  i took id
> and date (factor) as a random effect.
> 
> something like:
> lme.v5.000    <-  lme((ET) ~ (site + class + treatment)^2,
>                     data = all.con.2009.sym,
> 
> random=list(ww=pdBlocked(list(pdIdent(~Date-1),pdIdent(~id-1)))),
>                     method="REML",
>                     na.action = na.exclude,
>                     weights=varstruc5)
> 
> the first question off course is, would this be a correct/good/defencible
> approach?
> What about including date as a random effect? is that indeed correct or
> would that not be necessary? It is highly significant.

  I think it's reasonable to treat date as random, although some
would argue against it on philosophical grounds.  If there is 'deterministic'
variation over the season (i.e. variation that could be modeled as
a linear or quadratic or spline curve), then you might want to include
date as *both* a fixed effect and a random effect; this won't change
the overall fit of the model, but will decompose among-date variation
into a smooth trend and variation around that trend.

> One further question i have is the following. There are some vegetation
> types that do not occur at all sites and / or that do not all receive the
> treatment. It seems that a mixed model does not converge on a design with
> missing levels and i was wondering if there is a workaround. I know that i
> can do pairwise comparisons using glht from the multcomp library on a model
> with one single explanatory variable (interaction(site, class, treatment)).
> Is there a way to probe the effect of class and treatment despite missing
> levels?

Recent versions of lme4 have methods, contributed by Rune Haubo, that
will automatically drop redundant columns of the fixed effect model
matrix (there is a 'check.rankX' option within lmerControl whose default
option is "message+drop.cols", i.e. drop redundant columns and generate
a message).  Alternatively you could construct your own fixed effect
model matrix, drop redundant columns yourself, cbind() your random
effects, and use those derived variables as the fixed effects in an
lme() call ...

  Ben Bolker