[R-sig-ME] A sledgehammer to crack a nut?

Thierry Onkelinx thierry.onkelinx at inbo.be
Thu Sep 15 11:02:52 CEST 2016


Dear Quentin,

I suggested using E1 * E2 + E3 + (1|R1) because it allows the same model
fit as E1 * E2 + (1|E3) + (1|R1). If interactions between E3 and E1 or E2
make sense, then you should add them.

Use likelihood ratio tests (LRT) like this:

anova(glmm1, glmm2)
anova(glmm2, glmm3)
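
For illustration only, here is a minimal sketch of those two comparisons,
followed by the same test done "by hand". It assumes the lme4 package is
installed. The data frame dat is simulated to mimic the design (30 trees,
3 samples per tree, 2 years) and is not the real data; the effect sizes and
the names dat, trees, u, lr_stat, df_diff and p_value are made up for the
example.

```r
library(lme4)

set.seed(1)

# Simulated stand-in data (NOT the data from this thread): 30 trees,
# 3 samples per tree in each of 2 years, so 180 rows in total.
trees <- data.frame(
  R1 = factor(1:30),
  E1 = rep(c("no", "yes"), each = 15),
  E2 = rep(rep(c("wood", "farmland", "urban"), each = 5), times = 2),
  u  = rnorm(30, sd = 0.3)  # assumed tree-level random intercepts
)
dat <- trees[rep(1:30, each = 6), ]
dat$E3 <- factor(rep(rep(c("year1", "year2"), each = 3), times = 30))
dat$response <- rpois(nrow(dat), exp(1.5 - 0.5 * (dat$E1 == "yes") + dat$u))

# Nested models: each one drops a single term from the previous one.
glmm1 <- glmer(response ~ E1 * E2 + E3 + (1 | R1), data = dat, family = poisson)
glmm2 <- glmer(response ~ E1 + E2 + E3 + (1 | R1), data = dat, family = poisson)
glmm3 <- glmer(response ~ E1 + E2 + (1 | R1), data = dat, family = poisson)

lrt_12 <- anova(glmm1, glmm2)  # tests the E1:E2 interaction
lrt_23 <- anova(glmm2, glmm3)  # tests the main effect of E3

# The first comparison "by hand": twice the difference in log-likelihood,
# compared with a chi-squared distribution whose degrees of freedom equal
# the difference in the number of estimated parameters.
lr_stat <- 2 * (as.numeric(logLik(glmm1)) - as.numeric(logLik(glmm2)))
df_diff <- attr(logLik(glmm1), "df") - attr(logLik(glmm2), "df")
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
```

Each anova() call compares exactly two nested models, so every test
corresponds to dropping one term.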

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-09-14 14:21 GMT+02:00 Quentin Schorpp <quentin.schorpp at thuenen.de>:

> Hello,
> thank you very much for your answers,
>
> @ Thierry,
>
> In the Model
> glmer(response ~ E1 * E2 + E3 + (1|R1), data, family = poisson)
>
> can I say "I analyze E3 as a main effect"?
>
> Let's assume the following effect: for habitat "wood", the number of pupae
> was significantly reduced when exposed to blue tits, but when I look at
> the data I see that the effect was very strong in the first year and not
> visible at all in the second year. Could I be deceived by the analysis
> above, in that it says there was a strong difference between
> exposed/not exposed for wood and that overall the numbers of pupae were
> similar in both years?
> Or is it simply not necessary for the analysis to tell me about that: the
> assumed "strong" effect would be predicted as "moderate", and it is up to
> me to lay it on the line and present figures to tell the audience that
> there is also a 50/50 chance of having an effect at all?
>
> Would that be the right way to perform a likelihood ratio test for the
> above analysis?
>
> glmm1 <- glmer(response ~ E1 * E2 + E3 + (1|R1), data, family = poisson)
> glmm2 <- glmer(response ~ E1 + E2 + E3 + (1|R1), data, family = poisson)
> glmm3 <- glmer(response ~ E1 + E2 + (1|R1), data, family = poisson)
> glmm4 <- glmer(response ~ E2 + E3 + (1|R1), data, family = poisson)
> glmm5 <- glmer(response ~ E1 + E3 + (1|R1), data, family = poisson)
> glmm0 <- glmer(response ~ 1 + (1|R1), data, family = poisson)
>
> anova(glmm1, glmm2, glmm3, glmm4, glmm5, glmm0) ?
>
> I can't find a tutorial on "how to perform a likelihood ratio test by hand"
> and the afex package does not work on my computer.
>
> @John
> That was very interesting: repeated measures ANOVA with only two repeated
> measurements "devolves" into a non-repeated measures analysis,
> and combining the values of both measurements, followed by a non-repeated
> measures ANOVA, could solve the problem.
> The first part of your answer focuses on experiments with only two repeated
> measures, when these measures were taken within the same experimental
> unit, right? Then you explain methods for reducing both measurements to
> a single value as the outcome variable. Would taking the average of both
> repeated measurements be an option, too?
> I did not understand the difference between
> b) change = group + pre
> and
> d) post = group + pre
>
> since when group is defined as a) change (post-pre) = group, then group +
> pre = post - pre + pre = post  (= change in b) )
>
> Next, addition of 0,1,2 to the measurement value in value = group + time,
> wouldn't that mean to add two values of different units, i.e. counts and
> hours? Or is 0,1,2 without a unit?
>
> Thank you very much
>
> Kind regards,
> Quentin
>
> > Dear Quentin,
> >
> > Since your response variable contains counts, you can't use ANOVA, which
> > assumes residuals with a Gaussian distribution.
> >
> > Year is conceptually a random effect. But with only two levels you get
> > into numerical problems. Hence it is better to add it to the fixed
> > effects.
> >
> > So I'd go for
> >
> > glmer(response ~ E1 * E2 + E3 + (1|R1), data, family = poisson)
> > glmer.nb(response ~ E1 * E2 + E3 + (1|R1), data)
> >
> > Note that E1 * E2 * E3 is much more complex than E1 * E2 + (1|E3) in
> > terms of model fit.
> >
> > Best regards,
> >
> >
> > ir. Thierry Onkelinx
> >
> > 2016-09-13 16:00 GMT+02:00 John Sorkin <jsorkin at grecc.umaryland.edu>:
> >
> >> Quentin,
> >>
> >> A general comment.
> >>
> >> Accounting for repeated measures taken from the same observational unit
> >> is needed only when three or more measurements
> >> have been obtained. When there are only two measurements one can either
> >> model change (i.e. post-pre) or post alone without
> >> any use of repeated measures theory or software. In fact, if one uses
> >> repeated measures ANOVA with only two measurements,
> >> the analysis "devolves" into a non-repeated measures analysis. When we
> >> wish to model two measurements the model can be
> >> specified in many ways including:
> >> change (post-pre) = group
> >> change = group + pre
> >> post = group (this should be used with care, as it assumes that the pre
> >> value is the same in all experimental groups)
> >> post = group + pre
> >>
> >> You will note that all the models listed above have at most a single
> >> value of the outcome of interest on the right side
> >> of the equals sign; further, there is no indication of the time the
> >> observation was obtained on the right side of the equals
> >> sign. If you need to have two or more values of the outcome of interest
> >> on the right side of the equals sign, and thus
> >> need a variable to indicate the time at which the observation was
> >> obtained, you need to use repeated measures techniques
> >> and repeated measures analyses. For example if there are three
> >> measurements obtained from each observational unit,
> >> you would need a model something like the following:
> >> value = group + time, where time might equal 0,  1, and 2.
> >>
> >> John
> >>
> >>
> >> John David Sorkin M.D., Ph.D.
> >> Professor of Medicine
> >> Chief, Biostatistics and Informatics
> >> University of Maryland School of Medicine Division of Gerontology and
> >> Geriatric Medicine
> >> Baltimore VA Medical Center
> >> 10 North Greene Street
> >> GRECC (BT/18/GR)
> >> Baltimore, MD 21201-1524
> >> (Phone) 410-605-7119
> >> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> >> >>> "Quentin Schorpp" <quentin.schorpp at thuenen.de> 09/13/16 9:18 AM >>>
> >> Hello,
> >>
> >> I have had trouble with the term "repeated measurements" since I started
> >> to use statistics. During my time as a scientist I never saw an
> >> experiment where time-repeated measurements were NOT involved. Normally
> >> there are either before/after measurements, time series to investigate
> >> the development of a measurement variable, or the repetition of a certain
> >> investigation in consecutive years. Therefore I wonder why most people
> >> start by learning basic statistics while repeated measurement is always
> >> declared the "hard stuff" for self-training in the future.
> >>
> >> Now I've got data from an experiment repeated in 2 consecutive years.
> >>
> >> The measurements are from trees; there are five trees exposed/not
> >> exposed at each habitat (5x2x3 = 30 trees). From each tree three samples
> >> were taken (i.e. n=3 pseudoreplicates). Considering the repetition in the
> >> two years there are n=6 pseudoreplicates, right? And total n = 6 x 30 =
> >> 180.
> >> Summary: 10 trees at each of three habitats, either exposed or not
> >> exposed to blue tits. Each tree was measured three times. The whole
> >> experiment was repeated two times. Balanced sample design.
> >>
> >> The response variable is count data (larvae and pupae of a moth).
> >> The explanatory variables are: E1) exposure to blue tits (factor,
> >> yes/no); E2) the type of habitat (wood, farmland, urban); and E3) the
> >> year in which the experiment was conducted.
> >>
> >> The random variables are R1) the tree (factor, ID 1-30) [and R2) the
> >> year in which the experiment was conducted]
> >>
> >> In my opinion, a quite simple study design. Now, I am interested in (all
> >> the possible ways of) analysing the following hypotheses:
> >> H1 = blue tits reduce the number of larvae on the trees
> >> H0 = there is no difference in the number of pupae/larvae between trees
> >> exposed to blue tits and trees not exposed
> >> Additionally I am interested in the influence of habitat type on H1 and
> >> H0.
> >> I learned that the best way to solve problems with repeated measurements
> >> is to use mixed effects models.
> >>
> >> My model:
> >> lmer(response ~ E1 * E2 + (1|E3) + (1|R1), data)
> >> and if I'm interested in differences according to the years:
> >> lmer(response ~ E1 * E2 * E3 + (1|R1), data)
> >>
> >> Questions:
> >> Is that right, or is it better to use two ANOVAs, one for each
> >> consecutive year, on the means for the trees, just because everybody can
> >> understand it?
> >> What would be the analysis of choice if the residuals are not normally
> >> distributed or are heteroscedastic? Or: do non-parametric tests not need
> >> to consider random effects?
> >>
> >> Kind regards,
> >> Quentin
> >>
> >> _______________________________________________
> >> R-sig-mixed-models at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >>
> >
>
>
>
