[R-sig-ME] multiple nested random factors

Amanda Adams aadams26 at uwo.ca
Fri Feb 22 18:23:24 CET 2013


Thank you for the response Dr. Bolker.

On 22/02/2013 9:05 AM, Ben Bolker wrote:
> Amanda Adams <aadams26 at ...> writes:
>
>> I have been having a heck of a time figuring out how to estimate the
>> proportion of variance from several random factors. I have count data
>> on the number of bat calls recorded at 3 sites, on 6 detectors, over 12
>> nights. Detectors were at 2 heights.
>> If I understand nested factors correctly, Detectors are nested in Site
>> and Night is nested in Site.
>> Site/Detector and Site/Night are random
>> factors and Height is a fixed factor.
>    It's still not entirely clear to me from this description how
> your data are structured.  You have an average of about 249/12 ~ 21
> observations per night, so I'm going to assume you have 6 detectors
> *at each site*.  Detector will be nested in site (because it doesn't
> make any sense to analyze what happens at "detector number 1" unless
> the detectors are somehow arranged so that the set of (d1:site1,
> d1:site2, d1:site3, ...) has something in common).  You *may* want
> a night:site interaction (if you have enough data), but in principle
> you also want a site factor (probably fixed, since there are only
> three levels) and a night factor.  This would be
>
>    ~ height + f.Site + (1|f.Night/f.Site) + (1|f.Site:f.Detector)
>
>    It is quite likely that you will find some of these variance
> components estimated as zero ...
>    
Yes, I have 6 detectors at each site.
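
A minimal sketch of why the nesting has to be spelled out when detector labels are reused across sites: with labels 1 to 6 at every site, the bare detector factor has only 6 levels, while the site-by-detector interaction gives one level per physical detector. The data frame `toy` is made up for illustration (3 sites, 6 detectors each) and is not the bat data.

```r
# Hypothetical layout: 3 sites, detectors labelled 1-6 at each site
toy <- expand.grid(f.Detector = factor(1:6), f.Site = factor(1:3))

# The bare label cannot tell "detector 1 at site 1" from "detector 1 at site 2"
n_labels <- nlevels(toy$f.Detector)          # 6

# The interaction gives each physical detector its own level
toy$detector <- with(toy, interaction(f.Site, f.Detector, drop = TRUE))
n_detectors <- nlevels(toy$detector)         # 18
```

In a model formula, a term such as (1|f.Site:f.Detector) builds the same grouping on the fly.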
>
>> Also, the data are overdispersed, so I am transforming the number of
>> calls as log(Calls+1).
>    This makes no sense (sorry).  Poisson models must have a response
> variable that is a raw count value (integer).  How do you know the
> data are overdispersed before you fit a model ???  (Although I do see
> that you have widely varying values in your 'Calls' variable, so
> you may be right ...)
>
>    For various ways of handling overdispersion in GLMMs see
> http://glmm.wikidot.com/faq
I had tested for overdispersion with qcc.overdispersion.test in the qcc
package.  I had tried using an individual-level random effect to capture
overdispersion, but was not sure how to interpret the results once that was
included.
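
One rough way to check for overdispersion after a Poisson model has been fitted is to compare the sum of squared Pearson residuals with the residual degrees of freedom. The sketch below uses simulated counts and a plain glm() fit, not the bat data; the variable names and the negative binomial settings are assumptions chosen only to produce overdispersed counts.

```r
set.seed(1)

# Simulated, deliberately overdispersed counts (negative binomial)
n <- 200
height <- factor(rep(1:2, each = n / 2))
calls <- rnbinom(n, mu = 50, size = 0.5)

# Plain Poisson GLM on the raw integer counts
fit <- glm(calls ~ height, family = poisson)

# Pearson chi-square divided by residual degrees of freedom;
# values well above 1 suggest overdispersion
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion
```

The same ratio is sometimes computed for a fitted GLMM as a rough guide, although the residual degrees of freedom are less clearly defined there.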
>    I don't know if it's helpful, but Bolker et al. 2009 _Trends
> in Ecology and Evolution_ might be a citeable source for GLMMs.
> It doesn't really say anything specific about Poisson variables
> and why a Poisson model doesn't include a residual variance; for
> that you should probably cite (after reading!) a basic book
> on generalized linear models.
This paper has been very helpful and was the reason I was initially 
using glmer. Thanks! I will do some more reading.
>
>> 'data.frame':   249 obs. of  11 variables:
>>    $ Night     : int  1 3 5 11 12 1 3 5 11 12 ...
>>    $ Night2    : int  1 2 3 4 5 1 2 3 4 5 ...
>>    $ Site      : int  1 1 1 1 1 1 1 1 1 1 ...
>>    $ Species   : int  1 1 1 1 1 1 1 1 1 1 ...
>>    $ Detector  : int  1 1 1 1 1 2 2 2 2 2 ...
>>    $ Height    : int  1 1 1 1 1 2 2 2 2 2 ...
>>    $ Calls     : int  6 444 236 12 143 5 815 712 30 142 ...
>>    $ f.Night   : Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5 ...
>>    $ f.Site    : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
>>    $ f.Detector: Factor w/ 6 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ...
>>    $ f.Height  : Factor w/ 2 levels "1","2": 1 1 1 1 1 2 2 2 2 2 ...
>    By the way, you said you have three sites, but the data have four
> levels for f.Site?  Did you drop one site from the data and not
> use droplevels() ?
I do have four sites, but only include three for some of my analysis. Sorry.
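
A short sketch of the droplevels() point, on a made-up data frame rather than the real one: subsetting rows leaves the unused factor level in place until it is dropped explicitly. The names `toy`, `sub_raw` and `sub_clean` are hypothetical.

```r
# Hypothetical data with four sites, of which only three are analysed
toy <- data.frame(f.Site = factor(rep(1:4, each = 3)))

# Subsetting keeps the empty level "4" in the factor
sub_raw <- subset(toy, f.Site != "4")
nlevels(sub_raw$f.Site)      # 4

# droplevels() removes levels that no longer occur in the data
sub_clean <- droplevels(sub_raw)
nlevels(sub_clean$f.Site)    # 3
```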
>
>> I then coded for the nested variables:
>> data$detector <- with(data, factor(f.Site:f.Detector))
>> data$night <- with(data, factor(f.Site:f.Night))
>>
>> trans.log <- log(data$Calls+1)
>>
>> model <- glmer(round(trans.log, digits = 0) ~ f.Height + (1|night) +
>>       (1|detector) + (1|f.Site), data = data, family = poisson)
>>
>> I am uncertain on a couple things. Are my nested variables correct? Can
>> I correct for overdispersion with a transformation?
>>
>> I was also wondering if there is a reference explaining why there is no
>> residual variance term for the Poisson distribution. I saw the
>> explanation on a forum, but was hoping there was something I could cite.
>>
>> Any help or advice would be appreciated.
>> Thank you!
>> Amanda
I applied the individual-level random effect, but how do I interpret the 
proportion of variation from each factor once it is included?

 > model <- glmer(Calls ~ f.Height + f.Site + (1|f.Site/f.Night) +
+ (1|f.Site:f.Detector), data = data, family = poisson)
 >
 > data$ID <- 1:nrow(data)
 > model1 <- glmer(Calls ~ f.Height + f.Site + (1|f.Night/f.Site) +
+ (1|f.Site:f.Detector) + (1|ID), data = data, family = poisson)
Number of levels of a grouping factor for the random effects
is *equal* to n, the number of observations
 >
 > anova(model, model1)
Data: data
Models:
model: Calls ~ f.Height + f.Site + (1 | f.Site/f.Night) + (1 | f.Site:f.Detector)
model1: Calls ~ f.Height + f.Site + (1 | f.Night/f.Site) + (1 | f.Site:f.Detector) +
model1:     (1 | ID)
        Df   AIC   BIC   logLik Chisq Chi Df Pr(>Chisq)
model   8 49163 49191 -24573.4
model1  9  1615  1647   -798.6 47550      1  < 2.2e-16 ***

 > model1
Generalized linear mixed model fit by the Laplace approximation
Formula: Calls ~ f.Height + f.Site + (1 | f.Night/f.Site) + (1 | f.Site:f.Detector) + (1 | ID)
    Data: data
   AIC  BIC logLik deviance
  1615 1647 -798.6     1597
Random effects:
  Groups            Name        Variance Std.Dev.
  ID                (Intercept) 1.07827  1.03840
  f.Site:f.Night    (Intercept) 1.90958  1.38187
  f.Site:f.Detector (Intercept) 2.32948  1.52626
  f.Night           (Intercept) 0.65313  0.80817
Number of obs: 249, groups: ID, 249; f.Site:f.Night, 47; f.Site:f.Detector, 24; f.Night, 12

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.59535    0.86051   3.016 0.002561 **
f.Height2   -0.05362    0.64015  -0.084 0.933245
f.Site2      1.01975    1.07455   0.949 0.342619
f.Site3      0.73546    1.08115   0.680 0.496343
f.Site4      4.15381    1.07196   3.875 0.000107 ***

Does this mean: Site has a significant effect on bat activity and, of the
variation captured by the random effects,
39% can be explained by detector placement within sites
32% by an interaction between Site and Night
11% by temporal effects (night)
18% by individual variation
Does the individual variation essentially mean the variation not
explained by temporal and spatial effects?
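
As a rough check on shares like these, each variance component can be divided by the sum of all four components, so that the shares total 100%. The sketch below copies the variances from the model1 output; the vector names `vc` and `share` are made up here. These shares are on the log (link) scale and cover only the random effects, so they leave out whatever the fixed effects (Height, Site) account for.

```r
# Variance components copied from the model1 output
vc <- c("ID"                = 1.07827,
        "f.Site:f.Night"    = 1.90958,
        "f.Site:f.Detector" = 2.32948,
        "f.Night"           = 0.65313)

# Share of the total random-effect variance, in percent
share <- round(100 * vc / sum(vc), 1)
share
# ID 18.1, f.Site:f.Night 32.0, f.Site:f.Detector 39.0, f.Night 10.9
```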


