[R] repeated values, nlme, correlation structures

Sun Nov 20 08:22:15 CET 2005

Spencer Graves a écrit :

>       You are concerned that, "using the mean of each age category as 
> variable leads to a loss of information regarding the variance on the 
> weight at each age and nestbox."  What information do you think you lose?

The variance  around the mean weight of each age category. This 
variation is a priori not considered in the model when using the mean 
only, and not each value used to compute the mean..

>
>       In particular, have you studied the residuals from your fit?  I 
> would guess that the you probably have heterscedasticity with the 
> variance of the residuals probably increasing with the age.  Plots of 
> the absolute residuals might help identify this.  

Yes, of course. At this stage using a  Continuous AR(1) as Correlation 
Structure, reduces considerably heteroscedasticity up to quasi-normal.

> Also, is the number of blue tits in each age constant, or does it 
> change, e.g., as some of the chicks die?

Yes, unfortunately, it may happen eventually.

>
>       To try to assess how much information I lost (especially if some 
> of the chicks died), I might plot the weights in each nest box and 
> connect the dots manually, attempting to assign chick identity to the 
> individual numbers.  I might do it two different ways, one best fit, 
> and another "worst plausible".  Then I might try to fit models to 
> these two "augmented data sets" as if I had the true chick identity.  
> Then comparing these fits with the one you already have should help 
> you evaluate what information you lost by using the averages AND give 
> you a reasonable shot at recovering that information.  If the results 
> were promising, I might generate more than two sets of assignments, 
> involving other people in that task.

OK, should not be that difficult (actually the data were given with 
pseudo-ID numbers on each chicks and I started with this... until I 
learned they were corresponding to nothing). I suppose one could go as 
far as possible with the "worst possible" with random assignements and 
permutations, and thus comparing the fits.

Many thanks for the hint. I was really wondering what may mean no answer 
on the list... Problem not clear enough, trivial solution or real 
trouble for statisticians with such data? Quite  scaring to a 
biologist...  Now, I am fixed.

> If the results were promising, I might generate more than two sets of 
> assignments, involving other people in that task. 

Of course if some capable mixed-effect models specialist is interested 
in having a look to the data set, I can send it off list.

Many thanks again, Spencer, I can stick on the track, now...

Best regards,

Patrick

>       Bon Chance
>       Spencer Graves
>
> Patrick Giraudoux wrote:
>
>> Dear listers,
>>
>> My request of last week seems not to have drawn someone's attention. 
>> Suppose it was not clear enough.
>>
>> I am coping with an observational study where people's aim was to fit 
>> growth curve for a population of young blue tits. For logistic 
>> reasons, people have not been capable to number each individual, but 
>> they have a method to assess their age. Thus, nestboxes were visited 
>> occasionnally, youngs aged and weighted.
>>
>> This makes a multilevel data set, with two classification factors:
>>
>> - the nestbox (youngs shared the same parents and general feeding 
>> conditions)
>> - age in each nestbox (animals from the same nestbox have been 
>> weighed along time, which likely leads to time correlation)
>>
>> Life would have been heaven if individuals were numbered, and thus 
>> nlme correlation structure implemented in the package be used easy. 
>> As mentioned above, this could not be the case. In a first approach, 
>> I actually used the mean weight of the youngs weighed at each age in 
>> nest boxes for the variable "age", and could get a nice fit with 
>> "nestbox" as random variable and corCAR1(form=~age|nestbox) as 
>> covariation structure.
>>
>> modm0c<-nlme(pds~Asym/(1+exp((xmid-age)/scal)),
>>     fixed=list(Asym~1,xmid~1,scal~1),
>>     random=Asym+xmid~1|nestbox,data=croispulm,
>>     start=list(fixed=c(10,5,2.2)),
>>     method="ML",
>>     corr=corCAR1(form=~age|nestbox)
>>     )
>>
>> Assuming that I did not commited some error in setting model 
>> parameters (?), this way of doing is not fully satisfying, since 
>> using the mean of each age category as variable  leads to a  loss of 
>> information regarding the variance on the weight at each age and 
>> nestbox.
>>
>> My question is: is there a way to handle repeated values per group 
>> (here several youngs in an age category in each nestbox) in such a case?
>>
>> I would really appreciate an answer, even negative...
>>
>> Kind regards,
>>
>> Patrick
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide! 
>> http://www.R-project.org/posting-guide.html
>
>