[R-sig-ME] Repeated measure comparisons: should the identity be a random or a fixed variable?

Wed Dec 1 12:30:55 CET 2010

Hi Billy,

There are some earlier posts regarding circular statistics and lmer
which you could check out, although I'm not sure anything was
resolved. Regarding the second suggestion of fitting an
observation-level random effect, I think it is important that this is
done as a matter of course. Heterogeneity in the expected response
across observations will exist even in the most carefully controlled
experiments and should be modeled. If it is not, the standard errors
of the fixed effects will be too small, and the variances associated
with other random effects may well be meaningless. For example, refit
the example GLMM in ?lmer with an observation-level random effect and
notice how different the conclusions are regarding the herd variance.
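
A minimal sketch of that refit, assuming a current lme4 in which the
binomial help-page example is fitted with glmer() to the cbpp data.
The names cbpp_obs, obs, gm1 and gm2 are chosen here for illustration
only:

```r
library(lme4)

# Help-page style binomial GLMM on the cbpp data shipped with lme4:
# herd is the only random effect.
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

# Work on a copy and add an observation-level factor with one level
# per row. 'obs' is an illustrative name, not a column of cbpp.
cbpp_obs <- cbpp
cbpp_obs$obs <- factor(seq_len(nrow(cbpp_obs)))

# Same model plus the observation-level random effect.
gm2 <- glmer(cbind(incidence, size - incidence) ~ period +
               (1 | herd) + (1 | obs),
             data = cbpp_obs, family = binomial)

# Print the variance components of both fits side by side.
print(VarCorr(gm1))
print(VarCorr(gm2))
```

Comparing the two VarCorr() printouts shows how much of the variation
previously attributed to herd is taken up by the observation-level
term.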

Cheers,

Jarrod

On 30 Nov 2010, at 20:26, Billy wrote:

> Hi Jarrod,
>
> Thanks for the reply.
> I followed your suggestion to treat month as a continuous variable, and
> the model selection result was totally different. In the first
> analysis, model 2 (an additive effect of x and month) was
> the most likely model, but after implementing your suggestion, the best
> model was model 1 (only the effect of x, with month as a
> random variable). However, I think that in this case (following your
> suggestion) there is an assumption that the third month, for example,
> is at a higher level than the first or second month, right? Maybe
> that's not the case in nature, and my goal is exactly to investigate
> whether the relationship between y and x changes from one month to
> another, not necessarily always increasing or always decreasing. In
> fact, the first sampled month is August and the last one is July of
> the next year. Assuming seasonal variation, maybe I should use a
> circular statistics approach to deal with this case; I don't know.
> Furthermore, I'm not sure I really understand your other suggestion.
>
> Thanks again, and sorry for any misunderstandings.
>
> Billy
>
> -- 
> Gustavo Requena
> PhD student - Laboratory of Arthropod Behavior and Evolution
> Universidade de São Paulo
> Correspondence address:
> a/c Glauco Machado
> Departamento de Ecologia - IBUSP
> Rua do Matão - Travessa 14 no 321 Cidade Universitária, São Paulo - SP, Brasil
> CEP 05508-900
> Phone number: 55 11 3091-7488
>
> http://ecologia.ib.usp.br/opilio/gustavo.html
>
>
>
> On Sat, Nov 27, 2010 at 2:26 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>> Hi Billy,
>>
>> I think your models look reasonable, although in models 2 and 3 you
>> may want to treat month as a continuous variable in the fixed part of
>> the model. Also, most count data are overdispersed with respect to the
>> Poisson, and so a model that does not account for this will be
>> anti-conservative in terms of standard errors etc. One way to deal
>> with this is to fit an additional random effect at the level of each
>> observation:
>>
>> my.data$resid <- as.factor(1:dim(my.data)[1])
>>
>> and fit (1|resid) in the model formula.
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>> Quoting Billy <billy.requena at gmail.com>:
>>
>>> Hello everybody!
>>>
>>> I'm relatively new to the mixed-models world and I'm facing a
>>> theoretical/philosophical problem.
>>> Let's go to my data collection.
>>>
>>> I want to compare the number of eggs laid by females (different
>>> individuals or the same, I have no idea) at time 1 and at time 2
>>> in the same location. Therefore, I have repeated measures by
>>> location and want to compare time 1 versus time 2. Given that I have
>>> count data, to minimize the overdispersion I have considered the
>>> Poisson distribution for the errors.
>>> Furthermore, I have collected these data throughout one year and I'm
>>> also interested in temporal variation among months.
>>>
>>> model0 <- glmer ( y ~ 1 + (1|location) + (1|month), family="poisson")
>>> model1 <- glmer ( y ~ x + (1|location) + (1|month), family="poisson")
>>>
>>> where y = number of eggs laid,
>>>          x = factor indicating the first or the second oviposition
>>>          location = factor giving the exact position in space
>>>          (just an identity for the oviposition site, responsible for
>>>          the repeated comparison)
>>>          month = factor giving the month when I collected the data
>>>
>>> Is that right? If I want a repeated comparison regarding the specific
>>> identity of oviposition sites, should this factor (location) be a
>>> random variable?
>>>
>>> Furthermore, in both examples above, I'm just considering temporal
>>> variation (among months) as a random effect. But I'm also interested
>>> in whether there is significant seasonal variation in the comparison
>>> (the difference could be larger during the warm season, or even
>>> nonexistent during the cold season). Then:
>>>
>>> model2 <- glmer ( y ~ x + month + (1|location), family="poisson")
>>> model3 <- glmer ( y ~ x * month + (1|location), family="poisson")
>>>
>>> Is that right too?
>>> Finally, I'll use a model selection approach to compare the different
>>> models and identify the one most likely to reproduce the data observed
>>> in nature.
>>> Thanks to everyone
>>>
>>> --
>>> Gustavo Requena
>>> PhD student - Laboratory of Arthropod Behavior and Evolution
>>> Universidade de São Paulo
>>> Correspondence adress:
>>> a/c Glauco Machado
>>> Departamento de Ecologia - IBUSP
>>> Rua do Matão - Travessa 14 no 321 Cidade Universitária, São Paulo  
>>> - SP,
>>> Brasil
>>> CEP 05508-900
>>> Phone number: 55 11 3091-7488
>>>
>>> http://ecologia.ib.usp.br/opilio/gustavo.html
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>
