[R-sig-eco] nlme model specification

Thu May 22 16:56:40 CEST 2008

Matt,
I believe that is what you would be doing if you group by tree, e.g.:

library(lme4)  # provides lmer()
## vary only intercept
mdl1<-lmer(growth~diameter+vineload+(1|tree))
## vary intercept and slope for diameter
mdl2<-lmer(growth~diameter+vineload+(1+diameter|tree))
## vary intercept and slope for diameter and vineload
mdl3<-lmer(growth~diameter+vineload+(1+diameter+vineload|tree))

If the last two don't converge, and based on your lme results they 
probably won't, try the lmer2 function; it won't converge either, but it 
should tell you why.
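
For anyone who wants to try the grouping-by-tree idea without the real data, here is a minimal sketch on simulated data. It uses lme from nlme rather than lmer, simply because nlme ships with R. The data frame `dat`, the number of trees, the effect sizes, and the object names are all made up for illustration; only the variable names growth, diameter, vineload, and tree follow the models above.

```r
library(nlme)  # lme(); nlme ships with standard R installations

set.seed(42)
n_tree <- 40   # hypothetical number of trees
n_year <- 9    # nine annual growth intervals, as in the thread

# One row per tree x year combination
dat <- expand.grid(tree = factor(seq_len(n_tree)),
                   year = seq_len(n_year))
tree_idx <- as.integer(dat$tree)

# Simulated covariates: each tree covers only a narrow diameter range
start_diam <- runif(n_tree, min = 10, max = 150)
dat$diameter <- start_diam[tree_idx] + 0.4 * dat$year
dat$vineload <- sample(1:4, nrow(dat), replace = TRUE)

# Simulated response with a persistent tree-level effect (assumed values)
tree_eff <- rnorm(n_tree, sd = 0.5)
dat$growth <- 1 + 0.01 * dat$diameter - 0.2 * dat$vineload +
  tree_eff[tree_idx] + rnorm(nrow(dat), sd = 0.3)

# A random intercept per tree accounts for the repeated measurements
fit_tree <- lme(growth ~ diameter + vineload, random = ~ 1 | tree,
                data = dat, method = "ML")

summary(fit_tree)$tTable   # fixed-effect estimates, SEs, and DF
VarCorr(fit_tree)          # between-tree vs. residual variance
```

The random = ~ 1 | tree term here plays the same role as (1|tree) in mdl1 above; the slope variants would go in the same place, assuming they converge.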

-Chris

Landis, R Matthew wrote:
> Chris -
> I don't think you are wrong at all -- you're right on the mark.  I think this is the essential problem I am trying to solve, but the problem for me lies in your statement "a repeated measures model of some sort".  But which sort?  So far I have not been smart enough to figure out how to specify it properly.  Maybe I am far off the mark with my current approach?
>
> Matt
>
>   
>> -----Original Message-----
>> From: Christian A. Parker [mailto:cparker at pdx.edu]
>> Sent: Thursday, May 22, 2008 10:39 AM
>> To: Landis, R Matthew
>> Cc: 'r-sig-ecology at r-project.org'
>> Subject: Re: [R-sig-eco] nlme model specification
>>
>> Matthew
>> Please correct me if I am wrong (anyone), but because your observations
>> are not independent across your desired groups (years), your error terms
>> will be biased, which will then influence your significance tests. So
>> regardless of the factor that you are interested in, you would
>> still want to
>> account for the fact that all measurements were taken on the same trees
>> each year by doing a repeated measures model of some sort.
>> Hope this helps,
>> -Chris
>>
>> Landis, R Matthew wrote:
>>     
>>> Dear R-sig-eco:
>>>
>>> Many thanks to all of those who took the time to reply to my
>>> question.  The diversity of replies has made me go back and
>>> try to clarify my question.  Apologies for the length of the
>>> e-mail.  Thanks in advance to anyone willing to plow through
>>> this and understand it.  If you're ever in Middlebury I'll buy
>>> you a beer.
>>>
>>> To repeat, I have 300 trees, ranging in size from 10 - 150
>>> cm diameter (big trees).  To simplify my original question,
>>> let's say I want to understand the relationship between growth
>>> and two variables, diameter (continuous) and vine load
>>> (ordinal index from 1-4). I'd also like to know the relative
>>> importance of diameter vs. vine load, e.g. by partial R2.  If
>>> I had one year of data, this would be a simple regression.
>>>
>>> However, I have 9 years of annual measurements on the trees.
>>> It's as if I have the above analysis repeated 9 times.  There
>>> was no initial treatment, so I view these 9 years as a random
>>> sample of the years in the life of the tree, and unlike most
>>> examples of repeated measures I have read, the time effect is
>>> of no interest whatsoever. That is, I am not interested in
>>> viewing xyplot(growth ~ time|id).  I don't expect to see any
>>> consistent directional response to time.  In a way, it's as if
>>> the 9 years represent blocks, (except that it's the same 300
>>> trees in each block) -- this is why I view the yr as a random
>>> effect, and as the grouping variable.
>>>
>>> If I were to graph the data, I would use
>>> xyplot(growth ~ diameter|yr) to see what I am most interested in.
>>> Grouping by individual doesn't make sense to me here because each
>>> individual only represents a very small slice of the full
>>> range of measurements - e.g. over the ten years, each tree
>>> only grows from 10 cm - 14 cm, so I can't really estimate the
>>> growth vs. diameter relationship for each tree.
>>> xyplot(growth ~ diameter|id) would not be useful. This is why I
>>> don't consider the individual to be the grouping variable, but
>>> perhaps I am wrong on this.
>>>
>>> So, now, as before, I am back to
>>>
>>> fit <- lme(fixed = growth ~ diameter * vines, random = ~ 1|year)
>>>
>>> I'm expecting that this will estimate separate intercepts
>>> for each year.  Which is what I want (I would like to fit
>>> separate slopes by year too, but that model didn't converge).
>>>
>>> I guess what I'm most concerned about is whether the
>>> significance tests obtained for each term use the appropriate
>>> error term and the appropriate degrees of freedom.  I'm
>>> currently using something like the following command to test
>>> the effect of diameter
>>>
>>> anova(fit.full.model, update(fit.full.model, . ~ vines))
>>>
>>> But maybe I'm way off base there.
>>>
>>> Thanks very much!
>>>
>>> Matt Landis
>>>
>>>
>>>       
>>>> -----Original Message-----
>>>> From: r-sig-ecology-bounces at r-project.org
>>>> [mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of
>>>> Landis, R Matthew
>>>> Sent: Wednesday, May 21, 2008 1:55 PM
>>>> To: 'r-sig-ecology at r-project.org'
>>>> Subject: [R-sig-eco] nlme model specification
>>>>
>>>> Greetings R-eco folks,
>>>>
>>>> I'm trying to analyze a dataset on tree growth rates to see
>>>> which factors are important (and their relative importance
>>>> too, if I can get that), and I'm having some trouble figuring
>>>> out how to specify the model, despite having carefully read
>>>> Pinheiro and Bates, the help files for nlme, Crawley's book on
>>>> Statistics with S, MASS, and other books besides.
>>>>
>>>> The dataset consists of ~ 300 trees measured annually for 10
>>>> years.  So, I have 9 pseudo-replicated intervals over which to
>>>> assess growth (about 2700 rows in the dataset).  There are 5
>>>> different explanatory factors, which are a combination of
>>>> continuous variables and categorical factors.  Some of these
>>>> vary with time.  In the end, I would like to get both
>>>> coefficient estimates and partial R2 (or some other way of
>>>> ranking them) for each factor.  Unlike most time-series
>>>> examples in the books, I am not interested in how growth
>>>> varies with time, nor am I particularly interested in
>>>> interactions of explanatory factors with time.
>>>>
>>>> Based on this, I've convinced myself that I should specify the
>>>> model as:
>>>>
>>>> fit <- lme(fixed = growth ~ (x1 + x2 + x3 + x4 + x5)^2,
>>>>            random = ~1|year, method = 'ML')
>>>>
>>>> Year is clearly a random effect, and is the grouping variable
>>>> for the analysis.  Each of the other coefficients is "inner"
>>>> to this variable.  I'm ignoring individual tree as a grouping
>>>> factor, since I don't want to estimate separate coefficients
>>>> for each tree.  Does this sound like the correct way to do this?
>>>>
>>>> Thanks for any help.  Apologies if this is more of a
>>>> statistics question and less of an R question.
>>>>
>>>> Matt Landis
>>>>
>>>> ****************************************************
>>>> R. Matthew Landis, Ph.D.
>>>> Dept. Biology
>>>> Middlebury College
>>>> Middlebury, VT 05753
>>>>
>>>> tel.: 802.443.3484
>>>> **************************************************
>>>>
>>>>
>>>> _______________________________________________
>>>> R-sig-ecology mailing list
>>>> R-sig-ecology at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology