[R-sig-eco] nlme model specification

Fri May 23 17:27:13 CEST 2008

On Thu, May 22, 2008 at 6:45 PM, Simon Blomberg <s.blomberg1 at uq.edu.au> wrote:
> On Fri, 2008-05-23 at 10:25 +0930, Caroline Lehmann wrote:
>> Hello, I would suggest reading: Prior L. D., Brook B. W., Williams R. J., Werner P. A., Bradshaw C. J. A. & Bowman D. M. J. S. (2006) Environmental and allometric drivers of tree growth rates in a north Australian savanna. Forest Ecology and Management 234, 164-80.
>>
>> In this paper tree growth was analysed, and the repeated measurement of individuals was accounted for, using either glmm or lme (now in the lmer package).
>
> lme is in the nlme package. There is no lmer package. The lmer function
> is in the lme4 package. lme and lmer are different functions, with
> overlapping functionality (lmer is not a replacement for lme). There is
> a function called glmm in the repeated package, which fits Generalized
> Linear Mixed Models. lmer can also do Generalized Linear Mixed Models,
> as can several other functions in various other packages.
>
> gls in the nlme package does NOT handle random effects.
>
> All this may sound pedantic, but a certain precision of language is
> necessary to avoid confusion, especially for beginners faced with being
> overloaded with choices for analysis methods from R's extensive
> possibilities.

No, it's not pedantic, it's very helpful -- thank you.  I hope that if
I misspeak on the list someone will point out the error.  It's the
high quality of the information on the R lists that makes them so
useful.
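
As a minimal sketch of the distinction Simon draws (gls in nlme has no random effects, lme in nlme does), something like the following could be compared on simulated data. The data frame `d`, its columns (`tree`, `time`, `x1`, `growth`) and all simulation settings are assumptions made up for illustration, not Matt's data.

```r
library(nlme)

set.seed(42)
n_tree <- 30
n_time <- 9
# time varies fastest, so rows are ordered by tree and then by time
d <- expand.grid(time = 1:n_time, tree = factor(1:n_tree))
d$x1 <- rnorm(nrow(d))
tree_eff <- rnorm(n_tree, sd = 1)   # hypothetical tree-level intercepts
d$growth <- 2 + 0.5 * d$x1 + tree_eff[as.integer(d$tree)] +
  rnorm(nrow(d), sd = 0.5)

# gls: fixed effects only; within-tree dependence enters purely through
# the residual correlation structure (here AR(1) within each tree)
fit_gls <- gls(growth ~ x1, data = d,
               correlation = corAR1(form = ~ time | tree), method = "ML")

# lme: same fixed part, plus a random intercept for each tree
fit_lme <- lme(growth ~ x1, random = ~ 1 | tree, data = d, method = "ML")

VarCorr(fit_lme)   # variance components exist only for the lme fit
```

The gls fit has no variance component for tree at all; any tree-to-tree variation has to be absorbed by the correlation parameter, which is why the two functions are not interchangeable.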

best,

Kingsford Jones

>
> If Year is random and the same individuals (which are random) are
> measured across years, then you have a crossed random effects structure.
> lmer is best for that, although you can trick lme into doing it
> (instructions are in Pinheiro and Bates 2000). If the growth is not
> linear, I would consider transforming the data before considering a
> nonlinear mixed-effects model. Transformation is usually pretty good at
> solving that problem, and often stabilises the variance too. And it
> doesn't cost as much in terms of the number of parameters in the model
> and the difficulty in model fitting.
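
A rough sketch of the crossed structure described above, fitted with lmer on a log-transformed response. It assumes the lme4 package is installed; the data frame `d`, the covariate `x1` and the simulated effect sizes are hypothetical and only there to make the example run.

```r
library(lme4)

set.seed(1)
n_tree <- 30
n_year <- 9
d <- expand.grid(tree = factor(1:n_tree), year = factor(1:n_year))
d$x1 <- rnorm(nrow(d))
tree_eff <- rnorm(n_tree, sd = 0.4)   # hypothetical tree effects
year_eff <- rnorm(n_year, sd = 0.3)   # hypothetical year effects
# Multiplicative errors, so log(growth) is linear with roughly constant variance
d$growth <- exp(1 + 0.3 * d$x1 + tree_eff[as.integer(d$tree)] +
                year_eff[as.integer(d$year)] + rnorm(nrow(d), sd = 0.2))

# Every tree occurs in every year, so tree and year are crossed, not nested
fit <- lmer(log(growth) ~ x1 + (1 | tree) + (1 | year), data = d)

VarCorr(fit)   # separate variance components for tree, year and residual
```

The log transform adds no parameters to the model, which is the saving mentioned above relative to a nonlinear mixed-effects fit.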
>
> If the response variables are actually the same type of measurement
> (e.g. 3 different DBH measurements), then a repeated measures model is
> appropriate. If the response variables are of different types, then a
> much trickier multivariate approach would be necessary.
>
> HTH,
>
> Simon.
>
>
>> Models were compared and ranked using AICc. I would suggest modifying this to BIC since there are so many measurements.
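
For what it is worth, a small sketch of ranking two candidate models by AICc and BIC. The helper `aicc()`, the data frame `d` and the covariates `x1` and `x2` are made up for illustration, and taking n as the number of rows is itself an assumption (the effective sample size for grouped data is debatable). Both models are fitted by ML so that fits with different fixed effects are comparable.

```r
library(nlme)

set.seed(7)
d <- expand.grid(time = 1:9, tree = factor(1:30))
d$x1 <- rnorm(nrow(d))
d$x2 <- rnorm(nrow(d))               # hypothetical covariate with no true effect
tree_eff <- rnorm(30, sd = 1)
d$growth <- 2 + 0.5 * d$x1 + tree_eff[as.integer(d$tree)] +
  rnorm(nrow(d), sd = 0.5)

m1 <- lme(growth ~ x1,      random = ~ 1 | tree, data = d, method = "ML")
m2 <- lme(growth ~ x1 + x2, random = ~ 1 | tree, data = d, method = "ML")

# Small-sample corrected AIC; k = number of estimated parameters,
# n = number of observations (an assumption, see the note above)
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")
  n <- nobs(fit)
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

data.frame(model = c("m1", "m2"),
           AICc  = c(aicc(m1), aicc(m2)),
           BIC   = c(BIC(m1), BIC(m2)))
```

With 270 observations the AICc correction is tiny, whereas the BIC penalty of log(n) per parameter is noticeably heavier than the AIC penalty of 2.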
>>
>> Kind regards, Caroline
>>
>>
>> -----Original Message-----
>> From: r-sig-ecology-bounces at r-project.org [mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of Péter Sólymos
>> Sent: Friday, 23 May 2008 6:14 AM
>> To: r-sig-ecology at r-project.org
>> Subject: [R-sig-eco] Fwd: nlme model specification
>>
>> Dear List,
>> here is my response from today and yesterday to Matt's question, that was missed. I sent my first message from an unsubscribed e-mail address. Sorry for that.
>> Peter
>>
>>
>> Matt,
>> I am now absolutely confused. Am I right that you have 3 measurements per individual per year? In other words, you measured growth (in cm, or what is growth?), diameter and vine load. And I think you want to partial out the variation.
>> If so, your problem becomes a multivariate problem, since you have 3 response variables measured on the same individuals, plus some grouping variables (inds, yr). Then you can easily use multiple regression to partial out the variation. I have never seen a multivariate mixed model, so I think you don't have to try too hard.
>> Let me know if I was able to understand it.
>> Yours,
>> Peter
>> ps: I don't know why my letter did not get out to the list.
>>
>>
>> ---------- Forwarded message ----------
>> From: Péter Sólymos <Solymos.Peter at aotk.szie.hu>
>> Date: Wed, May 21, 2008 at 10:21 PM
>> Subject: Re: [R-sig-eco] nlme model specification
>> To: "Landis, R Matthew" <rlandis at middlebury.edu>, "r-sig-ecology at r-project.org"
>>
>>
>> Dear Matthew,
>>
>> I think that your case is a bit different from what you proposed, since (if I understand your letter correctly) you have repeated measures for the same 300 trees over 9 successive periods (resulting in 2700 measurements). So observations are not only biased by some spatial or temporal non-independence (as in the case of a wildlife survey), but essentially the subjects are the same. I mean that observations are not really grouped in time. I would prefer a model with the fixed model term as you wrote, a random factor like ~1 | tree.individuals, and an explicitly defined correlation structure with corAR1 or corARMA (or you can define groups for individuals within the correlation term).
>>
>> This can be done with gls in the nlme package, or glmmPQL in MASS.
>> Probably there are options in lme4 but I haven't tried those.
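
A minimal sketch of the model Peter describes (random intercept per tree plus an explicit AR(1) correlation within tree), written with lme since, as noted above, gls has no random effects. The simulated data frame `d`, the column names and the AR coefficient of 0.6 are assumptions for illustration only.

```r
library(nlme)

set.seed(3)
n_tree <- 30
n_time <- 9
# time varies fastest, so rows are ordered by tree and then by time
d <- expand.grid(time = 1:n_time, tree = factor(1:n_tree))
d$x1 <- rnorm(nrow(d))
tree_eff <- rnorm(n_tree, sd = 1)
# AR(1) residuals simulated separately for each tree, in the same row order
ar_err <- unlist(lapply(seq_len(n_tree), function(i)
  as.numeric(arima.sim(list(ar = 0.6), n = n_time, sd = 0.5))))
d$growth <- 2 + 0.5 * d$x1 + tree_eff[as.integer(d$tree)] + ar_err

# Random intercept for tree; residuals within a tree follow AR(1) in time
fit <- lme(growth ~ x1, random = ~ 1 | tree,
           correlation = corAR1(form = ~ time | tree), data = d)

# Estimated AR(1) parameter on its natural (-1, 1) scale
coef(fit$modelStruct$corStruct, unconstrained = FALSE)
```

A variance function (for example `weights = varPower()`) could be added to the same call if the residual spread changes with the fitted values; that is left out here to keep the sketch short.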
>>
>> The problem becomes more complicated if the growth is not linear, but
>>  follows an allometric relationship. In this case you should use the nlme
>>  function. Further, there might be problems with variance homogeneity;
>>  then you should define a variance function, too. These are all covered
>>  in the P-B book as far as I remember.
>>
>> Hope this helps, and sorry if I made some chaos instead of a clear-cut answer.
>>
>> Best,
>>
>> Peter
>>
>> --
>> Peter Solymos, PhD
>> Institute for Biology
>> Faculty of Veterinary Science
>> Szent Istvan University, Hungary
>> http://www.univet.hu/users/psolymos/personal/
>>
>> mefa R package
>> http://mefa.r-forge.r-project.org/
>>
>> On Wed, May 21, 2008 at 7:54 PM, Landis, R Matthew <rlandis at middlebury.edu> wrote:
>> > Greetings R-eco folks,
>> >
>> > I'm trying to analyze a dataset on tree growth rates to see which factors are important (and their relative importance too, if I can get that), and I'm having some trouble figuring out how to specify the model, despite having carefully read Pinheiro and Bates, the help files for nlme, Crawley's book on Statistics with S, MASS, and other books besides.
>> >
>> > The dataset consists of ~ 300 trees measured annually for 10 years.  So, I have 9 pseudo-replicated intervals over which to assess growth (about 2700 rows in the dataset).  There are 5 different explanatory factors, which are a combination of continuous variables and categorical factors.  Some of these vary with time.  In the end, I would like to get both coefficient estimates and partial R2 (or some other way of ranking them) for each factor.  Unlike most time-series examples in the books, I am not interested in how growth varies with time, nor am I particularly interested in interactions of explanatory factors with time.
>> >
>> > Based on this, I've convinced myself that I should specify the model as:
>> >
>> > fit <- lme(fixed = growth ~ (x1 + x2 + x3 + x4 + x5)^2,
>> >            random = ~ 1 | year, method = 'ML')
>> >
>> > Year is clearly a random effect, and is the grouping variable for the analysis.  Each of the other coefficients is "inner" to this variable.  I'm ignoring individual tree as a grouping factor, since I don't want to estimate separate coefficients for each tree.  Does this sound like the correct way to do this?
>> >
>> > Thanks for any help.  Apologies if this is more of a statistics question and less of an R question.
>> >
>> > Matt Landis
>> >
>> > ****************************************************
>> > R. Matthew Landis, Ph.D.
>> > Dept. Biology
>> > Middlebury College
>> > Middlebury, VT 05753
>> >
>> > tel.: 802.443.3484
>> > **************************************************
>> >
>> >
>> > _______________________________________________
>> > R-sig-ecology mailing list
>> > R-sig-ecology at r-project.org
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>> >
>>
> --
> Simon Blomberg, BSc (Hons), PhD, MAppStat.
> Lecturer and Consultant Statistician
> Faculty of Biological and Chemical Sciences
> The University of Queensland
> St. Lucia Queensland 4072
> Australia
> Room 320 Goddard Building (8)
> T: +61 7 3365 2506
> http://www.uq.edu.au/~uqsblomb
> email: S.Blomberg1_at_uq.edu.au
>
> Policies:
> 1.  I will NOT analyse your data for you.
> 2.  Your deadline is your problem.
>
> The combination of some data and an aching desire for
> an answer does not ensure that a reasonable answer can
> be extracted from a given body of data. - John Tukey.