[R-sig-ME] multiple nested random factors
Amanda Adams
aadams26 at uwo.ca
Fri Feb 22 18:23:24 CET 2013
Thank you for the response Dr. Bolker.
On 22/02/2013 9:05 AM, Ben Bolker wrote:
> Amanda Adams <aadams26 at ...> writes:
>
>> I have been having a heck of a time figuring out how to estimate the
>> proportion of variance from several random factors. I have count data:
>> the number of bat calls recorded at 3 sites, on 6 detectors, over 12
>> nights. Detectors were at 2 heights.
>> If I understand nested factors correctly, Detectors are nested in Site
>> and Night is nested in Site.
>> Site/Detector and Site/Night are random
>> factors and Height is a fixed factor.
> It's still not entirely clear to me from this description how
> your data are structured. You have an average of about 249/12 ~ 21
> observations per night, so I'm going to assume you have 6 detectors
> *at each site*. Detector will be nested in site (because it doesn't
> make any sense to analyze what happens at "detector number 1" unless
> the detectors are somehow arranged so that the set of (d1:site1,
> d1:site2, d1:site3, ...) has something in common). You *may* want
> a night:site interaction (if you have enough data), but in principle
> you also want a site factor (probably fixed, since there are only
> three levels) and a night factor. This would be
>
> ~ height + f.Site + (1|f.Night/f.Site) + (1|f.Site:f.Detector)
>
> It is quite likely that you will find some of these variance
> components estimated as zero ...
>
Yes, I have 6 detectors at each site.
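
Because detector numbers 1-6 are reused at every site, it is the site:detector combination that identifies a physical detector. The following is a minimal base-R sketch of that idea, assuming a made-up balanced layout of 4 sites with 6 detectors each (the `layout` data frame and the `n_labels`/`n_units` names are illustrative, not the real data):

```r
# Hypothetical layout: 4 sites, detectors numbered 1-6 at each site.
layout <- expand.grid(f.Detector = factor(1:6), f.Site = factor(1:4))

# Detector numbers alone give 6 levels, although there are 24 physical units.
n_labels <- nlevels(layout$f.Detector)

# The site:detector interaction gives each physical detector its own level.
layout$detector <- with(layout, factor(f.Site:f.Detector))
n_units <- nlevels(layout$detector)

print(c(labels = n_labels, units = n_units))
```

A grouping factor built this way is what a term such as (1|f.Site:f.Detector) refers to.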
>
>> Also, data is overdispersed so I am transforming number of calls as
>> log(Calls+1).
> This makes no sense (sorry). Poisson models must have a response
> variable that is a raw count value (integer). How do you know the
> data are overdispersed before you fit a model ??? (Although I do see
> that you have widely varying values in your 'Calls' variable, so
> you may be right ...)
>
> For various ways of handling overdispersion in GLMMs see
> http://glmm.wikidot.com/faq
I had tested for overdispersion with qcc.overdispersion.test in the qcc
package. I had also tried using an individual-level random effect to
capture overdispersion, but was not sure how to interpret the results
once that was included.
> I don't know if it's helpful, but Bolker et al. 2009 _Trends
> in Ecology and Evolution_ might be a citeable source for GLMMs.
> It doesn't really say anything specific about Poisson variables
> and why a Poisson model doesn't include a residual variance; for
> that you should probably cite (after reading!) a basic book
> on generalized linear models.
This paper has been very helpful and was the reason I was initially
using glmer. Thanks! I will do some more reading.
>
>> 'data.frame': 249 obs. of 11 variables:
>> $ Night : int 1 3 5 11 12 1 3 5 11 12 ...
>> $ Night2 : int 1 2 3 4 5 1 2 3 4 5 ...
>> $ Site : int 1 1 1 1 1 1 1 1 1 1 ...
>> $ Species : int 1 1 1 1 1 1 1 1 1 1 ...
>> $ Detector : int 1 1 1 1 1 2 2 2 2 2 ...
>> $ Height : int 1 1 1 1 1 2 2 2 2 2 ...
>> $ Calls : int 6 444 236 12 143 5 815 712 30 142 ...
>> $ f.Night : Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5 ...
>> $ f.Site : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ...
>> $ f.Detector: Factor w/ 6 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 2 ...
>> $ f.Height : Factor w/ 2 levels "1","2": 1 1 1 1 1 2 2 2 2 2 ...
> By the way, you said you have three sites, but the data have four
> levels for f.Site? Did you drop one site from the data and not
> use droplevels() ?
I do have four sites, but only include three in some of my analyses. Sorry.
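
For reference, a small base-R sketch of what droplevels() does after a site is excluded. The `f.Site` vector here is a made-up stand-in, not the real column:

```r
# Hypothetical site factor with four declared levels.
f.Site <- factor(c(1, 1, 2, 3, 4, 4), levels = 1:4)

# Subsetting keeps the unused level "4" in the factor's level set.
three_sites <- f.Site[f.Site != "4"]
n_before <- nlevels(three_sites)

# droplevels() discards levels that no longer occur in the data.
three_sites <- droplevels(three_sites)
n_after <- nlevels(three_sites)

print(c(before = n_before, after = n_after))
```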
>
>> I then coded for the nested variables:
>> data$detector <- with(data, factor(f.Site:f.Detector))
>> data$night <- with(data, factor(f.Site:f.Night))
>>
>> trans.log <- log(data$Calls+1)
>>
>> model <- glmer(round(trans.log,digits=0)~ f.Height + (1|night) +
>> (1|detector) +
>> (1|f.Site) , data = data, family=poisson)
>>
>> I am uncertain on a couple things. Are my nested variables correct? Can
>> I correct for overdispersion with a transformation?
>>
>> I was also wondering if there is a reference explaining why there is no
>> residual variance term for the Poisson distribution. I saw the
>> explanation on a forum, but was hoping there was something I could cite.
>>
>> Any help or advice would be appreciated.
>> Thank you!
>> Amanda
I applied the individual-level random effect, but how do I interpret the
proportion of variation from each factor once it is included?
> model <- glmer(Calls ~ f.Height + f.Site + (1|f.Site/f.Night) +
+     (1|f.Site:f.Detector), data = data, family = poisson)
>
> data$ID <- 1:nrow(data)
> model1 <- glmer(Calls ~ f.Height + f.Site + (1|f.Night/f.Site) +
+     (1|f.Site:f.Detector) + (1|ID), data = data, family = poisson)
Number of levels of a grouping factor for the random effects
is *equal* to n, the number of observations
>
> anova(model, model1)
Data: data
Models:
model: Calls ~ f.Height + f.Site + (1 | f.Site/f.Night) + (1 |
f.Site:f.Detector)
model1: Calls ~ f.Height + f.Site + (1 | f.Night/f.Site) + (1 |
f.Site:f.Detector) +
model1: (1 | ID)
       Df   AIC   BIC   logLik Chisq Chi Df Pr(>Chisq)
model   8 49163 49191 -24573.4
model1  9  1615  1647   -798.6 47550      1  < 2.2e-16 ***
> model1
Generalized linear mixed model fit by the Laplace approximation
Formula: Calls ~ f.Height + f.Site + (1 | f.Night/f.Site) +
(1|f.Site:f.Detector) + (1 | ID)
Data: data
  AIC  BIC logLik deviance
 1615 1647 -798.6     1597
Random effects:
 Groups            Name        Variance Std.Dev.
 ID                (Intercept) 1.07827  1.03840
 f.Site:f.Night    (Intercept) 1.90958  1.38187
 f.Site:f.Detector (Intercept) 2.32948  1.52626
 f.Night           (Intercept) 0.65313  0.80817
Number of obs: 249, groups: ID, 249; f.Site:f.Night, 47;
f.Site:f.Detector, 24; f.Night, 12
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.59535    0.86051   3.016 0.002561 **
f.Height2   -0.05362    0.64015  -0.084 0.933245
f.Site2      1.01975    1.07455   0.949 0.342619
f.Site3      0.73546    1.08115   0.680 0.496343
f.Site4      4.15381    1.07196   3.875 0.000107 ***
Does this mean that Site has a significant effect on bat activity, and that
the variation in bat activity levels is explained as follows?
44% by detector placement within sites
36% by an interaction between Site and Night
12% by temporal effects (Night)
20% by individual variation
Does the individual variation essentially mean the variation not
explained by the temporal and spatial effects?
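
One way to turn the printed variance components into percentages, as a minimal sketch, is to divide each by their sum. This assumes the shares are taken over the random-effect variances on the log (link) scale only, ignoring the fixed effects and the Poisson sampling variation, which may or may not be the quantity wanted. Shares computed this way add to 100%, whereas the four figures listed above add to 112%, so they were presumably obtained another way.

```r
# Random-effect variances copied from the model1 printout above.
vc <- c("f.Site:f.Detector" = 2.32948,
        "f.Site:f.Night"    = 1.90958,
        "ID"                = 1.07827,
        "f.Night"           = 0.65313)

# Share of the total random-effect variance (log scale), in percent.
share <- 100 * vc / sum(vc)

print(round(share, 1))
```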