[R-sig-ME] lme4 glmer general help wanted - code included

Fri Dec 7 14:34:17 CET 2012

Hi Thierry,

One final question:

In my actual dataset, length(coef(model)) = 16 rather than 4 as in the example below.

When calling glht(), do you know how best to specify the linfct matrix if I just want to test my site-related hypothesis: is there a difference in abundance between sites?

Many thanks,
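
For the linfct question above, one possible approach is to build the contrast matrix by coefficient name instead of by position, so it scales to a model with 16 coefficients. The sketch below is only an illustration on the simplified data from the original post: the object names (dat, cf, site_cf) are made up here, and the pattern "^site" assumes the site coefficients are the only ones starting with "site" (in a model that also contains sitetype, the pattern would need tightening). If the real model includes interactions with site, these rows only test the site effect at the reference level of the other factors.

```r
# Simplified data from the original post (3 sites x 2 visits x 5 replicates)
dat <- data.frame(
  site = factor(rep(1:3, each = 10)),
  time = factor(rep(rep(1:2, each = 5), times = 3)),
  abundance = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
                30, 32, 30, 32, 30, 32, 30, 32, 30, 32,
                30, 31, 33, 32, 31, 31, 33, 32, 31, 32)
)
model <- glm(abundance ~ site + time, data = dat, family = quasipoisson)

# All-zero matrix with one column per coefficient, named after the coefficients
cf <- names(coef(model))
site_cf <- grep("^site", cf, value = TRUE)  # assumption: only site terms match
K <- matrix(0, nrow = length(site_cf), ncol = length(cf),
            dimnames = list(paste(site_cf, "vs reference"), cf))

# Put a 1 in the column of each site coefficient: one row per site vs reference
K[cbind(seq_along(site_cf), match(site_cf, cf))] <- 1
print(K)

if (requireNamespace("multcomp", quietly = TRUE)) {
  # Each row of K is tested as a separate linear hypothesis
  print(summary(multcomp::glht(model, linfct = K)))
  # Alternative: all pairwise site comparisons without building K by hand
  print(summary(multcomp::glht(model, linfct = multcomp::mcp(site = "Tukey"))))
}
```

The by-name construction avoids counting positions in a long coefficient vector; mcp() is the shortcut when all pairwise comparisons of one factor are wanted.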

----------------------------------------
> From: nebstah at hotmail.com
> To: thierry.onkelinx at inbo.be; r-sig-mixed-models at r-project.org
> Subject: RE: [R-sig-ME] lme4 glmer general help wanted - code included
> Date: Fri, 7 Dec 2012 12:45:02 +0000
>
> Thanks again.
>
> No the data were not real; I simplified them for the list.
>
> Thanks for your assistance,
>
> ----------------------------------------
> > From: Thierry.ONKELINX at inbo.be
> > To: nebstah at hotmail.com; r-sig-mixed-models at r-project.org
> > Subject: RE: [R-sig-ME] lme4 glmer general help wanted - code included
> > Date: Fri, 7 Dec 2012 11:21:51 +0000
> >
> > Have a look at http://glmm.wikidot.com/faq and look for (should I treat factor xxx as fixed or random).
> >
> > Did you post your real data? If so, there is hardly any difference among the replicates. In that case I would aggregate the replicates prior to analysis.
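
A minimal sketch of what aggregating the replicates could look like, using the simplified data from the original post. The message does not say how to aggregate; summing the five replicate counts per site and visit is an assumption made here so that the response stays a count. The names dat, agg and model_agg are illustrative only.

```r
# Simplified data from the original post
dat <- data.frame(
  site = factor(rep(1:3, each = 10)),
  time = factor(rep(rep(1:2, each = 5), times = 3)),
  sitetype = rep(c("yellow", "blue"), times = c(10, 20)),
  abundance = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
                30, 32, 30, 32, 30, 32, 30, 32, 30, 32,
                30, 31, 33, 32, 31, 31, 33, 32, 31, 32)
)

# Assumption: sum the five replicates within each site x visit combination
agg <- aggregate(abundance ~ site + time + sitetype, data = dat, FUN = sum)
print(agg)

# The aggregated counts can then be analysed with a plain GLM
model_agg <- glm(abundance ~ site + time, data = agg, family = quasipoisson)
```

With 3 sites and 2 visits this leaves only 6 observations, which is worth keeping in mind when interpreting any test based on it.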
> >
> > ir. Thierry Onkelinx
> > Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
> > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> > Kliniekstraat 25
> > 1070 Anderlecht
> > Belgium
> > + 32 2 525 02 51
> > + 32 54 43 61 85
> > Thierry.Onkelinx at inbo.be
> > www.inbo.be
> >
> > To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
> > ~ Sir Ronald Aylmer Fisher
> >
> > The plural of anecdote is not data.
> > ~ Roger Brinner
> >
> > The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
> > ~ John Tukey
> >
> >
> > -----Original message-----
> > From: Rob Bodin [mailto:nebstah at hotmail.com]
> > Sent: Friday 7 December 2012 11:07
> > To: ONKELINX, Thierry; r-sig-mixed-models at r-project.org
> > Subject: RE: [R-sig-ME] lme4 glmer general help wanted - code included
> >
> > Thanks Thierry, that's very useful.
> >
> > Just to clarify, I was under the impression that as my sites were sampled through time (and thus individual samples were not independent of each other), I would need to treat them as random factors - is this not the case?
> >
> > Thanks again,
> >
> > ----------------------------------------
> > > From: Thierry.ONKELINX at inbo.be
> > > To: nebstah at hotmail.com; r-sig-mixed-models at r-project.org
> > > Subject: RE: [R-sig-ME] lme4 glmer general help wanted - code included
> > > Date: Fri, 7 Dec 2012 09:49:14 +0000
> > >
> > > Dear Ben,
> > >
> > > Your dataset is too small for a mixed model. Just use a plain glm and glht() to test your hypotheses.
> > >
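> > > # Plain GLM with site and time as fixed effects; quasipoisson estimates a dispersion parameter
> > > model <- glm(abundance ~ site + time, data = data, family = quasipoisson)
> > > summary(model)
> > >
> > > library(multcomp)
> > > # Columns of K follow the order of coef(model): (Intercept), site2, site3, time2
> > > K <- rbind(
> > >   SiteType = c(0, 0.5, 0.5, 0), # mean of sites 2 and 3 (blue) vs site 1 (yellow)
> > >   Site23 = c(0, 1, -1, 0)       # site 2 vs site 3
> > > )
> > > summary(glht(model, K))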
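
To make the link between the rows of K and the model coefficients explicit, the following sketch labels the columns of K with the coefficient names and computes the two contrast estimates by hand. It uses only base R and the simplified data from the original post; dat and est are illustrative names, and the by-hand product is only meant to show what is being tested, not to replace glht(), which also supplies standard errors and adjusted p-values.

```r
# Simplified data from the original post
dat <- data.frame(
  site = factor(rep(1:3, each = 10)),
  time = factor(rep(rep(1:2, each = 5), times = 3)),
  abundance = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
                30, 32, 30, 32, 30, 32, 30, 32, 30, 32,
                30, 31, 33, 32, 31, 31, 33, 32, 31, 32)
)
model <- glm(abundance ~ site + time, data = dat, family = quasipoisson)

K <- rbind(
  SiteType = c(0, 0.5, 0.5, 0),
  Site23   = c(0, 1, -1, 0)
)
# Label the columns so it is visible which coefficient each weight applies to
colnames(K) <- names(coef(model))
print(K)

# Contrast estimates on the log scale: each row of K times the coefficients
est <- drop(K %*% coef(model))
print(est)
```

Because site 1 is the reference level, the SiteType row is the average log-scale difference of sites 2 and 3 from site 1, and the Site23 row is the log-scale difference between sites 2 and 3.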
> > >
> > > Furthermore I would recommend that you do some reading (e.g. Mixed Effects Models and Extensions in Ecology with R by Zuur et al 2009) and get some local statistical advice.
> > >
> > > Best regards,
> > >
> > > -----Original message-----
> > > From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On behalf of Ben Gillespie
> > > Sent: Thursday 6 December 2012 22:23
> > > To: r-sig-mixed-models at r-project.org
> > > Subject: [R-sig-ME] lme4 glmer general help wanted - code included
> > >
> > > Hi guys,
> > >
> > > I'm very new to R and have been teaching myself over the past few months - it's a great tool and I'm hoping to use it to analyse my PhD data.
> > > As I'm a bit of a newb, I'd really appreciate any feedback and/or guidance with regards to the following questions, which relate to generalized linear mixed modelling (or, at least, I think they do!). If there is a 'better', more appropriate way that I could attempt to answer my questions, please let me know.
> > >
> > > I've spent a lot of time researching this approach on the internet, but can't seem to find any directly applicable examples.
> > >
> > > Thanks in advance, and, if you need any further information, please let me know.
> > >
> > > # My experiment:
> > > # I have 1 site on 3 different rivers (independent) (sites 1, 2 and 3).
> > > # I visit each site 2 times (time 1 and 2).
> > > # On each visit, I take 5x replicate insect samples and calculate total abundance for each replicate.
> > > # Site 1 is in an area called "yellow" and sites 2 and 3 are in an area called "blue".
> > >
> > > # My data frame:
> > >
> > >
> > > data=data.frame(
> > >   site=c(rep(1,10),rep(2,10),rep(3,10)),
> > >   replicate=c(rep(1:5,6)),
> > >   time=c(rep(1,5),rep(2,5),rep(1,5),rep(2,5),rep(1,5),rep(2,5)),
> > >   abundance=c(1,2,1,2,1,2,1,2,1,2,30,32,30,32,30,32,30,32,30,32,
> > >               30,31,33,32,31,31,33,32,31,32),
> > >   sitetype=c(rep("yellow",10),rep("blue",20))
> > > )
> > >
> > > data$site=factor(data$site)
> > > data$replicate=factor(data$replicate)
> > > data$time=factor(data$time)
> > >
> > > data
> > >
> > >
> > > # Initial remarks:
> > > # As each replicate (1-5) was taken from within each site (1-3) on both sampling times (1-2),
> > > # I figure that 'replicate' should be treated as nested within 'site' and that both should be treated as random factors?
> > >
> > > # First question: Is there a difference in abundance between sites?
> > > # Second question: Is there a difference in abundance between sitetypes (blue or yellow)?
> > >
> > > #If my 'initial remarks' statement is correct (please tell me if not), then I think a generalized linear mixed model is appropriate and would be something along these lines:
> > >
> > > # Fitting the model:
> > >
> > > library(lme4)
> > > # I chose to use poisson as abundance is count data... is this recommended?
> > > glmm1=glmer(abundance~time+sitetype+(1|site/replicate),family="poisson",data=data)
> > > summary(glmm1)
> > > #Output:
> > >
> > > ################################################################
> > > Generalized linear mixed model fit by the Laplace approximation
> > > Formula: abundance ~ time + sitetype + (1 | site/replicate)
> > >    Data: data
> > >    AIC   BIC logLik deviance
> > >  12.31 19.31 -1.153    2.306
> > > Random effects:
> > >  Groups         Name        Variance Std.Dev.
> > >  replicate:site (Intercept) 0        0
> > >  site           (Intercept) 0        0
> > > Number of obs: 30, groups: replicate:site, 15; site, 3
> > >
> > > Fixed effects:
> > >                Estimate Std. Error z value Pr(>|z|)
> > > (Intercept)     3.43579    0.05641   60.91   <2e-16 ***
> > > time2           0.01560    0.07900    0.20    0.843
> > > sitetypeyellow -3.03815    0.26127  -11.63   <2e-16 ***
> > > ---
> > > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> > >
> > > Correlation of Fixed Effects:
> > >             (Intr) time2
> > > time2       -0.706
> > > sitetypyllw -0.108  0.000
> > > ################################################################
> > >
> > > # Inferences:
> > >
> > > # I'm unsure how to assess the variance and std dev scores for site... some guidance here would be appreciated, i.e. how do I answer my original question: Is there a difference in abundance between sites?
> > > # There is no statistically significant difference between the two time periods (p > 0.05).
> > > # Using the above output, the model suggests that there is a statistically significant difference between site types (p < 0.05).
> > >
> > > # Further questions:
> > >
> > > #1 Are the above inferences correct?
> > > #2 I have read about overdispersion.... how would I test for this in this example?
> > > #3 How could I build an interaction term into the model and answer the following: "Is there a statistically significant site*time interaction?"
> > > #4 Finally, are there any obvious steps or things I should be doing in order to get a 'robust' or 'correct' answer from this problem? i.e. further tests... alternative models and comparisons...
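
Questions #2 and #3 can be illustrated with a rough sketch on the simplified data. This uses a plain Poisson GLM with site and time as fixed effects rather than the glmer model above, which is an assumption made to keep the example in base R; the names dat, pois, pois_int and dispersion are illustrative. The dispersion statistic is the common rule-of-thumb ratio of the summed squared Pearson residuals to the residual degrees of freedom, not a formal test.

```r
# Simplified data from the original post
dat <- data.frame(
  site = factor(rep(1:3, each = 10)),
  time = factor(rep(rep(1:2, each = 5), times = 3)),
  abundance = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
                30, 32, 30, 32, 30, 32, 30, 32, 30, 32,
                30, 31, 33, 32, 31, 31, 33, 32, 31, 32)
)

# #2: rough overdispersion check for a Poisson GLM
pois <- glm(abundance ~ site + time, data = dat, family = poisson)
dispersion <- sum(residuals(pois, type = "pearson")^2) / df.residual(pois)
print(dispersion)  # values well above 1 point towards overdispersion

# #3: add the site-by-time interaction and compare the nested models
pois_int <- update(pois, . ~ . + site:time)
print(anova(pois, pois_int, test = "Chisq"))
```

The comparison of the two nested fits is a likelihood-ratio test of the interaction as a whole. If a quasipoisson family is used instead, test = "F" is the usual choice in anova().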
> > >
> > >
> > > Thanks again,
> > >
> > > Rob
> > > _______________________________________________
> > > R-sig-mixed-models at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > * * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
> > > The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.