# [R] repeated values, nlme, correlation structures

Spencer Graves spencer.graves at pdf.com
Sat Nov 19 23:08:52 CET 2005

```
You are concerned that, "using the mean of each age category as
variable leads to a loss of information regarding the variance on the
weight at each age and nestbox."  What information do you think you lose?

In particular, have you studied the residuals from your fit?  I would
guess that you have heteroscedasticity, with the variance of the
residuals probably increasing with age.  Plots of the absolute
residuals might help identify this.  Also, is the number of blue tits at
each age constant, or does it change, e.g., as some of the chicks die?

To try to assess how much information was lost (especially if some of
the chicks died), I might plot the weights in each nest box and connect
the dots manually, attempting to assign chick identity to the individual
numbers.  I might do it two different ways, one best fit, and another
"worst plausible".  Then I might try to fit models to these two
"augmented data sets" as if I had the true chick identity.  That would
let you evaluate what information you lost by using the averages AND give
you a reasonable shot at recovering that information.  If the results were
promising, I might generate more than two sets of assignments, involving

Bonne chance
Spencer Graves

Patrick Giraudoux wrote:

> Dear listers,
>
> My request of last week seems not to have drawn anyone's attention.
> I suppose it was not clear enough.
>
> I am coping with an observational study where people's aim was to fit
> a growth curve for a population of young blue tits. For logistical
> reasons, people were not able to number each individual, but they have a
> method to assess their age. Thus, nestboxes were visited occasionally,
> and the young were aged and weighed.
>
> This makes a multilevel data set, with two classification factors:
>
> - the nestbox (young shared the same parents and general feeding
> conditions)
> - age in each nestbox (animals from the same nestbox have been weighed
> over time, which likely leads to time correlation)
>
> Life would have been heaven if individuals had been numbered, so that
> the correlation structures implemented in the nlme package could be used
> easily. As mentioned above, this could not be the case. In a first
> approach, I actually used the mean weight of the young weighed at each
> age in each nest box as the response, and could get a nice fit with
> "nestbox" as random variable and corCAR1(form=~age|nestbox) as
> correlation structure.
>
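> # Logistic growth of weight (pds) on age, fitted to the per-age means
> modm0c <- nlme(pds ~ Asym / (1 + exp((xmid - age) / scal)),
>     fixed = list(Asym ~ 1, xmid ~ 1, scal ~ 1),
>     random = Asym + xmid ~ 1 | nestbox,   # random effects by nestbox
>     data = croispulm,
>     start = list(fixed = c(10, 5, 2.2)),
>     method = "ML",
>     correlation = corCAR1(form = ~ age | nestbox)
>     )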
>
> Assuming that I did not make some error in setting the model parameters
> (?), this approach is not fully satisfying, since using the mean of
> each age category as variable leads to a loss of information regarding
> the variance on the weight at each age and nestbox.
>
> My question is: is there a way to handle repeated values per group (here
> several young in an age category in each nestbox) in such a case?
>
> I would really appreciate an answer, even negative...
>
> Kind regards,
>
> Patrick


```
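
The residual check suggested in the reply can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data are simulated stand-ins (the real `croispulm` data are not shown in the thread), a plain `nls` fit on the nestbox-by-age means replaces the `nlme` model for brevity, and the names `d`, `cell`, `mean_pds`, `sd_pds`, `n` and `absres` are hypothetical. It keeps the per-cell spread and count that the mean alone discards, then looks at absolute residuals against age.

```r
# Simulated stand-in data: 8 nestboxes, 5 ages, 5 unnumbered chicks per visit.
# All numbers are made up for illustration only.
set.seed(1)
d <- expand.grid(chick = 1:5, age = seq(2, 14, by = 3), nestbox = factor(1:8))
mu <- 10 / (1 + exp((5 - d$age) / 2.2))                # logistic curve as in the post
d$pds <- mu + rnorm(nrow(d), sd = 0.1 + 0.05 * d$age)  # noise grows with age

# Per-nestbox, per-age summaries: the mean used in the fit, plus the
# standard deviation and count that averaging throws away.
cell <- aggregate(pds ~ age + nestbox, data = d,
                  FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))
cell <- do.call(data.frame, cell)                      # flatten the matrix column
names(cell) <- c("age", "nestbox", "mean_pds", "sd_pds", "n")

# Simplified fit on the cell means (nls here, not the nlme model from the post).
fit <- nls(mean_pds ~ Asym / (1 + exp((xmid - age) / scal)), data = cell,
           start = list(Asym = 10, xmid = 5, scal = 2.2))

# Absolute residuals by age: an upward trend hints at heteroscedasticity.
cell$absres <- abs(residuals(fit))
print(aggregate(absres ~ age, data = cell, FUN = mean))
if (interactive()) {
  plot(cell$age, cell$absres, xlab = "age", ylab = "|residual|")
}
```

If the absolute residuals do grow with age, one option worth trying in the `nlme` fit is a variance function such as `weights = varPower(form = ~ age)`; whether that helps for the blue tit data is untested here. The `n` column also answers the question of whether the number of chicks per age stays constant.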