[R-sig-ME] [R] lme nesting/interaction advice

Tue May 13 13:54:20 CEST 2008

Hi Rolf,
..
On May 12, 2008, at 6:39 PM, Rolf Turner wrote:
>
> On 13/05/2008, at 4:09 AM, Douglas Bates wrote:
>
>> I'm entering this discussion late so I may be discussing issues that
>> have already been addressed.
>>
>> As I understand it, Federico, you began by describing a model for  
>> data
>> in which two factors have a fixed set of levels and one factor has an
>> extensible, or "random", set of levels and you wanted to fit a model
>> that you described as
>>
>> y ~ effect1 * effect2 * effect3
>>
>> The problem is that this specification is not complete.
>
>         At *last* (as Owl said to Rabbit) we're getting somewhere!!!
>
>         I always knew that there was some basic fundamental point
>         about this business that I (and I believe many others) were
>         simply missing.  But I could not for the life of me get anyone
>         to explain to me what that point was.  Or to put it another
>         way, I was never able to frame a question that would  
> illuminate
>         just what it was that I wasn't getting.
>
>         I now may be at a stage where I can start asking the right
>         questions.
>
>> An interaction of factors with fixed levels and a factor with random
>> levels can mean, in the lmer specification,
>>
>> lmer(y ~ effect1 * effect2 + (1| effect3) + (1|
>> effect1:effect2:effect3), ...)
>>
>> or
>>
>> lmer(y ~ effect1 * effect2 + (effect1*effect2 | effect3), ...)
>>
>> or other variations.  When you specify a random effect or an random
>> interaction term you must, either explicitly or implicitly, specify
>> the form of the variance-covariance matrix associated with those
>> random effects.
>>
>> The "advantage" that other software may provide for you is that it
>> chooses the model for you but that, of course, means that you only
>> have the one choice.
>
>         Now may I start asking what I hope are questions that will  
> lift
>         the fog a bit?
>
>         Let us for specificity consider a three-way model with two
>         fixed effects and one random effect from the good old  
> Rothamstead style
>         agricultural experiment context:  Suppose we have a number of
>         species/breeds of wheat (say) and a number of fertilizers.
>         These are fixed effects.  And we have a number of fields  
> (blocks)
>         --- a random effect.  Each breed-fertilizer combination is
>         applied a number of times in each field.  We ***assume*** that
>         that the field or block effect is homogeneous throughout.   
> This
>         may or may not be a ``good'' assumption, but it's not  
> completely
>         ridiculous and would often be made in practice.  And probably
>         *was* made at Rothamstead.  The response would be something  
> like
>         yield in bushels per acre.
>
>         The way that I would write the ``full'' model for this  
> setting,
>         in mathematical style is:
>
>         Y_ijkl = mu + alpha_i + beta_j + (alpha.beta)_ij + C_k +  
> (alpha.C)_ik
>                      + (beta.C)_jk + (alpha.beta.C)_ijk + E_ijkl
>
>         The alpha_i and beta_j are parameters corresponding to  
> breed and
> fertilizer
>         respectively; the C_k are random effects corresponding to  
> fields or
> blocks.
>         Any effect ``involving'' C is also random.
>
>         The assumptions made by the Package-Which-Must-Not-Be-Named  
> are (I
> think)
>         that
>
>                 C_k ~ N(0,sigma_C^2)
>                 (alpha.C)_ik ~ N(0,sigma_aC^2)
>                 (beta.C)jk ~ N(0,sigma_bC^2)
>                 (alpha.beta.C)_ijk ~ N(0,sigma_abC^2)
>                 E_ijkl ~ N(0,sigma^2)
>
>         and these random variables are *all independent*.
>
>         Ahhhhhhhh ... perhaps I'm on the way to answering my own  
> question.  Is
>         it this assumption of ``all independent'' which is  
> questionable?  It
>         seemed innocent enough when I first learned about this  
> stuff, lo these
>         many years ago.  But .... mmmmmaybe not!
>
>         To start with:  What would be the lmer syntax to fit the  
> foregoing
>         (possibly naive) model?  I am sorry, but I really cannot  
> get my head
>         around the syntax of lmer model specification, and I've tried.
Did you answer your own question, (or what am I missing something):
Y ~ a*b +(1|C) + (1|aC) + (1|bC) + (1|abC)  plus of course epsilon.ijkl

Regarding interdependence - yikes. I don't even know what that would  
mean. In part, it seems that would mean non-zero variance components  
for  (1|aC), (1|bC), and (1|abC). Beyond that, and beyond simple  
heterogenous variances, I am even more lost.

Hank
> I
>         really have.  Hard.  I know I must be starting from the  
> wrong place,
>         but I haven't a clue as to what the *right* place to start  
> from is.
>         And if I'm in that boat, I will wager Euros to pretzels  
> that there
>         are others in it.  I know that I'm not the brightest bulb  
> in the
>         chandelier, but I'm not the dullest either.
>
>         Having got there:  Presuming that I'm more-or-less on the  
> right track
>         in my foregoing conjecture that it's the over-simple  
> dependence
> structure
>         that is the problem with what's delivered by the Package- 
> Which-Must-
> Not-Be-Named,
>         how might one go about being less simple-minded?  I.e. what  
> might be
> some
>         more realistic dependence structures, and how would one  
> specify
> these in lmer?
>         And how would one assess whether the assumed dependence  
> structure gives
>         a reasonable fit to the data?
>
>> If you can describe how many variance components you think should be
>> estimated in your model and what they would represent then I think it
>> will be easier to describe how to fit the model.
>
>         How does this fit in with my conjecture (above) about what  
> I've been
>         missing all these years?  Does it fit?  How many variance  
> components
>         are there in the ``naive'' model?  It looks like 5 to  
> me ... but maybe
>         I'm totally out to lunch in what I think I'm understanding  
> at this
> stage.
>         (And besides --- there are three sorts of statistician;  
> those who
> can count,
>         and those who can't.)
>
>         Thank you for your indulgence.
>
>                  cheers,
>
>                         Rolf Turner
>
> ######################################################################
> Attention:\ This e-mail message is privileged and confid... 
> {{dropped:9}}
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

Dr. Hank Stevens, Associate Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/~stevenmh/
http://www.users.muohio.edu/harkesae/
http://www.cas.muohio.edu/ecology
http://www.muohio.edu/botany/
"E Pluribus Unum"

"I love deadlines. I love the whooshing noise they make as they go by."
                                             (Douglas Adams)

If you send an attachment, please try to send it in a format anyone  
can read, such as PDF, text, Open Document Format, HTML, or RTF.
Why? See:  http://www.gnu.org/philosophy/no-word-attachments.html