[R-sig-ME] lmer model specification problem

Ben Bolker bbolker at gmail.com
Tue Apr 12 15:40:09 CEST 2011


Ram H. Sharma <sharma.ram.h at ...> writes:

> 
> Hi Mixed model experts:
> 
> I am new to mixed model commodity. I am trying to specify a model using
> lmer in lme4 package. I am not sure if I am doing right, so I need your
> help....please......
> 
> Treatment / factor structure
> 
> Year:     level 1:3,  the whole experiment was repeated in three years,
> random factor

  It is going to be hard to estimate a random-effects variance from
just three levels.  You will most likely have to treat year as
a fixed effect and accept that you will not be able to generalize across
years reliably.

> village: level 1:2 # the level is much higher just three are shown as
> example, random factor

  I don't know what you mean by 1:2 here.  You say "just three are
shown as example", yet you only show two.  It looks from this as though
you only have two villages, in which case the comment above applies
(but even more strongly because you have only 2 rather than 3 levels).

> Farm : level 1:9 # the level is much higher just three are shown as example,
> random factor

  This makes sense.  (Perhaps the comment about "just three are shown
as example" was accidentally copied to 'village' above?)
> 
> Variety: 10 variety were grown (may or not be different at different years,
> farm, villages, some of them were repeated) (fixed effect)
> 
> Thus layout of treatment structure would like the follows for each year -
> Year[1]
> 
> Village[1]
>                 Farm[1]
>                                    Variety: 1, 2, 8, 9, 6, 5
>                 Farm[2]
>                                     Variety: 6, 8, 9, 10, 4
>                 Farm[3]
> 
>                                      Variety:  1, 2, 5, 6, 3, 7
> Village[2]
> 
>                  Farm[3]
>                                    Variety: 6, 8, 3, 4, 2
>                 Farm[4]
>                                     Variety: 3, 8,1, 10, 2
>                 Farm[5]
> 
>                                      Variety:  1, 2, 3, 4, 5, 6
> 
> I am interested in interactions as well as following is the model in my
> mind:
> 
> P*ijklm* = M+Yi +Vj +YVij +F(YV)k(ij) +Gl +GYli +GVlj + GYVlij + eijklm
> (Y is for year, V = village, G = Variety, F = Farm)
> 
> I tried the following model and command, am I right?
> 
> lmer( gryld ~ 1 + (1|year) + (1|village) + (1|year:village) +
> (Farm|year:village) + variety + (1|variety:year) + (1|variety:village) +
> (1|year:variety:village) , data= mbtrail)

  I think you are confused (as is quite common) about nesting and
crossing.  Let's assume for the moment that you don't have enough
data to estimate the interaction between "variety" and village/farm
(i.e. varieties are not grown in enough different villages and farms
to estimate whether their yields vary across villages and
farms).  In that case

  gryld ~ year+village+year:village+(1|farm:village:year)+variety

seems reasonable.  

  I have left out variety:year + variety:village + variety:year:village,
which you included in your model (see why, below).
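
  To make the nesting concrete, here is a minimal sketch in base R.  It
assumes a hypothetical layout (3 years, 2 villages, 9 farms per village,
with the farm labels 1..9 reused in each village); your real data may be
labelled differently.  It shows why the grouping factor has to be
farm:village (or farm:village:year) rather than the bare farm label.  The
lmer() call at the end only runs if lme4 is installed and a data frame
called 'mbtrail' (the name from your post) exists, and it assumes that
year, village, farm and variety are stored as factors.

```r
# Hypothetical layout: farm labels 1..9 are reused within each village.
d <- expand.grid(year    = factor(1:3),
                 village = factor(1:2),
                 farm    = factor(1:9))

# The bare label cannot tell farm 1 of village 1 from farm 1 of village 2.
n_labels <- nlevels(d$farm)

# Combining the label with village gives one level per physical farm.
n_farms <- nlevels(interaction(d$farm, d$village, drop = TRUE))

# Combining with year as well gives one level per farm-year combination,
# which is the grouping used by (1 | farm:village:year).
n_farm_years <- nlevels(interaction(d$farm, d$village, d$year, drop = TRUE))

print(c(labels = n_labels, farms = n_farms, farm_years = n_farm_years))

# Fit the model from the text only if lme4 and the data are available.
if (requireNamespace("lme4", quietly = TRUE) && exists("mbtrail")) {
  fit <- lme4::lmer(gryld ~ year + village + year:village + variety +
                      (1 | farm:village:year), data = mbtrail)
  print(summary(fit))
}
```

If your farms already carry unique labels across villages, (1|farm) and
(1|farm:village) define the same groups and the explicit nesting is harmless.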

A lot will depend on how much data you have.
If you have about 6 varieties per farm, 9 farms per village, 2 villages,
3 years, for a total of about 324 data points (from above it looks
like you may have either 5 or 6 varieties per farm per year), then
you will be limited to estimating approximately 15 to 30 parameters
(1 parameter per 10-20 data points). This means you will have to
think carefully about how to restrict your model.  In principle you
could say

  gryld ~ (year+village+variety)^3 + (year+village+variety|farm:village)

to find *all* of the interactions, but this will be far more than your
data can support.  The first model I suggested above has

  1 (intercept) + 2 (year) + 1 (village) + 2 (village:year) +
  9 (variety) + 1 (farm:village:year variance) = 16 parameters.

If something like the interaction of variety by year or variety by
village is very important to you, you could attempt to put it in, but
you will probably have to choose one or the other (variety:year = 18
additional parameters, variety:village = 9 additional parameters).
variety:year:village would add an additional 18 parameters on top of
this, for a total of 16 + 18 + 9 + 18 = 61.  Trying to fit a model with
61 parameters to a data set with 324 data points is not going to work
very well.
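
  If you want to check this bookkeeping yourself, the following sketch
counts the fixed-effect coefficients as the columns of the design matrix
that base R's model.matrix() builds.  It assumes a balanced grid with
3 years, 2 villages and 10 varieties (the level counts used above) and
adds 1 for the single random-effect variance; the residual variance is
not counted, as in the tally above.  With your unbalanced data some
interaction coefficients may not be estimable at all, so treat these
numbers as upper bounds.

```r
# Balanced grid of the fixed-effect factors; only the level counts matter.
grid <- expand.grid(year    = factor(1:3),
                    village = factor(1:2),
                    variety = factor(1:10))

# Number of fixed-effect coefficients = number of design-matrix columns.
n_fixed <- function(f) ncol(model.matrix(f, data = grid))

# Add 1 to each count for the farm-level random-intercept variance.
p_small <- n_fixed(~ year + village + year:village + variety) + 1
p_full  <- n_fixed(~ (year + village + variety)^3) + 1

# Approximate data size: varieties/farm x farms/village x villages x years.
n_obs <- 6 * 9 * 2 * 3

# Rule of thumb from the text: 1 parameter per 10-20 data points.
budget <- n_obs / c(20, 10)

print(c(p_small = p_small, p_full = p_full, n_obs = n_obs))
print(budget)
```

Comparing p_small and p_full against the budget shows how quickly the
interaction terms exhaust what the data can support.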

  Do not be tempted to throw everything in and use stepwise
approaches to discard terms that appear non-significant.
> 
> My doubt is on specially on year component? how can put that effectively?
> 
> Thank you for your time.
> 
> Ram H



