[R-sig-ME] lme4 glmer general help wanted - code included
Ben Gillespie
nebstah at hotmail.com
Thu Dec 6 22:23:23 CET 2012
Hi guys,
I'm very new to R and have been teaching myself over the past few months -
it's a great tool and I'm hoping to use it to analyse my PhD data.
As I'm a bit of a newb, I'd really appreciate any feedback and/or guidance
with regards to the following questions that relate to generalized linear
mixed modelling (or, at least, I think they do!)(if there is a 'better',
more appropriate way that I could attempt to answer my questions, please let me know).
I've spent a lot of time researching this approach on the internet, but can't
seem to find any directly applicable examples.
Thanks in advance, and, if you need any further information, please let me know.
# My experiment:
# I have 1 site on 3 different rivers (independent)(sites 1,2 and 3).
# I visit each site 2 times (time 1 and 2).
# On each visit, I take 5x replicate insect samples and calculate total abundance for each replicate.
# Site 1 is in an area called "yellow" and sites 2 and 3 are in an area called "blue".
# My data frame:
data=data.frame(
site=c(rep(1,10),rep(2,10),rep(3,10)),
replicate=c(rep(1:5,6)),
time=c(rep(1,5),rep(2,5),rep(1,5),rep(2,5),rep(1,5),rep(2,5)),
abundance=c(1,2,1,2,1,2,1,2,1,2,30,32,30,32,30,32,30,32,30,32,30,31,33,32,31,31,33,32,31,32),
sitetype=c(rep("yellow",10),rep("blue",20))
)
data$site=factor(data$site)
data$replicate=factor(data$replicate)
data$time=factor(data$time)
data
# Initial remarks:
# As each replicate (1-5) was taken from within each site (1-3) on both sampling times (1-2),
# I figure that 'replicate' should be treated as nested within 'site' and that both should be treated as random factors?
# First question: Is there is difference in abundance between sites?
# Second question: Is there is difference in abundance between sitetypes (blue or yellow)?
#If my 'initial remarks' statement is correct (please tell me if not), then I think a generalized linear mixed model is appropriate and would be something along these lines:
# Fitting the model:
require(lme4)
glmm1=glmer(abundance~time+sitetype+(1|site/replicate),family="poisson",data=data) #I chose to use poisson as abundance is count data... is this recommended?
summary(glmm1)
#Output:
################################################################
Generalized linear mixed model fit by the Laplace approximation
Formula: abundance ~ time + sitetype + (1 | site/replicate)
Data: data
AIC BIC logLik deviance
12.31 19.31 -1.153 2.306
Random effects:
Groups Name Variance Std.Dev.
replicate:site (Intercept) 0 0
site (Intercept) 0 0
Number of obs: 30, groups: replicate:site, 15; site, 3
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 3.43579 0.05641 60.91 <2e-16 ***
time2 0.01560 0.07900 0.20 0.843
sitetypeyellow -3.03815 0.26127 -11.63 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) time2
time2 -0.706
sitetypyllw -0.108 0.000
################################################################
# Inferences:
#I'm unsure how to assess the variance and std dev scores for site... some guidance here would be appreciated....i.e. how do I answer my original question: Is there is difference in abundance between sites?
#There is no statistically significant difference between the two time periods (P=>0.05)
#Using the above output, the model suggests that there is a statistically significant difference between site types (p<0.05)
# Further questions:
#1 Are the above inferences correct?
#2 I have read about overdispersion.... how would I test for this in this example?
#3 How could I build an interaction term into the model and answer the following: "Is there a statistically significant site*time interaction?"
#4 Finally, are there any obvious steps or things I should be doing in order to get a 'robust' or 'correct' answer from this problem? i.e. further tests... alternative models and comparisons...
Thanks again,
Rob
More information about the R-sig-mixed-models
mailing list