[R-sig-ME] A sledgehammer to crack a nut?

Wed Sep 14 14:21:26 CEST 2016

Hello,
thank you very much for your answers,

@ Thierry,

In the Model
glmer(response ~ E1 * E2 + E3 + (1|R1), data, family = poisson)

can I say "I analyze E3 as a main effect"?

Let's assume the following effect: for habitat "wood", the number of pupae
was significantly lower when exposed to blue tits, but when I look at
the data I see that the effect was very strong in the first year and not
visible at all in the second year. Could I be deceived by the analysis
above, in that it says: there was a strong difference between
exposed/not exposed for wood, and overall the numbers of pupae were
similar in both years?
Or is it simply not necessary for the analysis to tell me about that: the
assumed "strong" effect would be estimated as "moderate", and it is up to
me to lay it on the line and present figures to tell the audience that
there is also a 50/50 chance of having an effect at all?

Would that be the right way to perform a likelihood ratio test for the
above analysis?

glmm1 <- glmer(response ~ E1 * E2 + E3 + (1|R1), data, family = poisson)
glmm2 <- glmer(response ~ E1 + E2 + E3 + (1|R1), data, family = poisson)
glmm3 <- glmer(response ~ E1 + E2 + (1|R1), data, family = poisson)
glmm4 <- glmer(response ~ E2 + E3 + (1|R1), data, family = poisson)
glmm5 <- glmer(response ~ E1 + E3 + (1|R1), data, family = poisson)
glmm0 <- glmer(response ~ 1 + (1|R1), data, family = poisson)

anova(glmm1, glmm2, glmm3, glmm4, glmm5, glmm0) ?

I can't find a tutorial on "how to perform a likelihood ratio test by hand",
and the afex package does not work on my computer.
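
A minimal sketch of a likelihood ratio test done by hand, assuming lme4 is
installed. The data frame d is simulated here purely as a stand-in for the
real data (30 trees x 2 years x 3 samples = 180 rows), and the helper lrt()
and the model names m_full, m_noint and m_noE3 are hypothetical. Each test
compares one pair of nested models; non-nested fits (e.g. a model without
E1 versus one without E2) cannot be compared this way.

```r
library(lme4)

set.seed(1)
# Simulated stand-in data: 5 trees per exposure x habitat cell = 30 trees
trees <- expand.grid(rep = 1:5,
                     E1  = c("no", "yes"),
                     E2  = c("wood", "farmland", "urban"))
trees$R1 <- factor(seq_len(nrow(trees)))
# No shared columns, so merge() returns the Cartesian product (30 x 6 rows)
d <- merge(trees, expand.grid(E3 = c("year1", "year2"), sample = 1:3))

tree_eff   <- rnorm(30, sd = 0.3)           # tree-level random intercepts
eta        <- log(5) - 0.5 * (d$E1 == "yes") + tree_eff[as.integer(d$R1)]
d$response <- rpois(nrow(d), exp(eta))      # Poisson counts

m_full  <- glmer(response ~ E1 * E2 + E3 + (1 | R1), data = d, family = poisson)
m_noint <- glmer(response ~ E1 + E2 + E3 + (1 | R1), data = d, family = poisson)
m_noE3  <- glmer(response ~ E1 * E2 + (1 | R1),      data = d, family = poisson)

# LRT by hand: 2 * (logLik of larger model - logLik of nested smaller model),
# referred to a chi-square with df = difference in number of parameters
lrt <- function(big, small) {
  stat <- as.numeric(2 * (logLik(big) - logLik(small)))
  df   <- attr(logLik(big), "df") - attr(logLik(small), "df")
  c(chisq = stat, df = df, p = pchisq(stat, df, lower.tail = FALSE))
}

lrt(m_full, m_noint)    # tests the E1:E2 interaction
lrt(m_full, m_noE3)     # tests the main effect of E3 (year)
anova(m_full, m_noint)  # same statistic via lme4's anova method
```

Dropping exactly one term per comparison keeps each pair nested, which is
what the chi-square reference distribution relies on.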

@John
That was very interesting: repeated measures ANOVA with only two repeated
measurements "devolves" into a non-repeated measures analysis,
and combining the values of both measurements, followed by a non-repeated
measures ANOVA, could solve the problem.
The first part of your answer focuses on experiments with only two repeated
measures, when these measures were taken within the same experimental
unit, right? Then you explain methods to reduce both measurements to
a single value as the outcome variable. Would taking the average of both
repeated measurements be an option, too?
I did not understand the difference between
b) change = group + pre
and
d) post = group + pre

since, if group is defined as in a) change (post - pre) = group, then group +
pre = post - pre + pre = post (= change in b)).
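
One way to read the two specifications, assuming both are ordinary linear
models in which "=" means "is regressed on" rather than an algebraic
identity, so that group is a treatment indicator and not the change itself.
The symbols \beta_0, \beta_1, \beta_2 and \varepsilon_i are notation
introduced here for illustration and do not appear in the thread:

```latex
\begin{align*}
\text{(b)}\quad \text{post}_i - \text{pre}_i
  &= \beta_0 + \beta_1\,\text{group}_i + \beta_2\,\text{pre}_i + \varepsilon_i \\
\Longleftrightarrow\quad \text{post}_i
  &= \beta_0 + \beta_1\,\text{group}_i + (1 + \beta_2)\,\text{pre}_i + \varepsilon_i
  \quad\text{(d)}
\end{align*}
```

Under this reading, b) and d) give the same estimated group effect
\beta_1; only the coefficient of pre differs, by exactly 1.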

Next, adding 0, 1, 2 to the measurement value in value = group + time:
wouldn't that mean adding two values with different units, i.e. counts and
hours? Or is 0, 1, 2 without a unit?
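
A sketch of how the units could work out, assuming value = group + time is
shorthand for a linear model with estimated coefficients (the symbols below
are illustrative notation, not from the thread):

```latex
\[
\text{value}_{ij} = \beta_0 + \beta_g\,\text{group}_i + \beta_t\,\text{time}_{ij} + \varepsilon_{ij}
\]
```

Here time is never added to the counts directly. It is multiplied by
\beta_t, which carries units of "counts per time unit", so every term on
the right-hand side is on the scale of the response. With a Poisson model
and log link the same holds on the log scale.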

Thank you very much

Kind regards,
Quentin

> Dear Quentin,
>
> Since your response variable contains counts, you can't use ANOVA which
> assumes residuals with a Gaussian distribution.
>
> Year is conceptually a random effect. But with only two levels you run
> into numerical problems. Hence it is better to add it to the fixed effects.
>
> So I'd go for
>
> glmer(response ~ E1 * E2 + E3 + (1|R1), data, family = poisson)
> glmer.nb(response ~ E1 * E2 + E3 + (1|R1), data)
>
> Note that E1 * E2 * E3 is much more complex than E1 * E2 + (1|E3) in
> terms of model fit.
>
> Best regards,
>
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> 2016-09-13 16:00 GMT+02:00 John Sorkin <jsorkin at grecc.umaryland.edu>:
>
>> Quentin,
>>
>> A general comment.
>>
>> Accounting for repeated measures taken from the same observational unit
>> is needed only when three or more measurements
>> have been obtained. When there are only two measurements one can either
>> model change (i.e. post-pre) or post alone without
>> any use of repeated measures theory or software. In fact, if one uses
>> repeated measures ANOVA with only two measurements,
>> the analysis "devolves" into a non-repeated measures analysis. When we
>> wish to model two measurements the model can be
>> specified in many ways including:
>> change (post-pre) = group
>> change = group + pre
>> post = group (this should be used with care as it assumes that the pre
>> value is the same in all experimental groups)
>> post = group + pre
>>
>> You will note that all the models listed above have at most a single value
>> of the outcome of interest on the right side
>> of the equals sign; further, there is no indication of the time the
>> observation was obtained on the right side of the equals
>> sign. If you need to have two or more values of the outcome of interest
>> on the right side of the equals sign, and thus
>> need a variable to indicate the time at which the observation was
>> obtained, you need to use repeated measures techniques
>> and repeated measures analyses. For example if there are three
>> measurements obtained from each observational unit,
>> you would need a model something like the following:
>> value = group + time, where time might equal 0,  1, and 2.
>>
>> John
>>
>>
>> John David Sorkin M.D., Ph.D.
>> Professor of Medicine
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology and
>> Geriatric Medicine
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>> >>> "Quentin Schorpp" <quentin.schorpp at thuenen.de> 09/13/16 9:18 AM >>>
>> Hello,
>>
>> I have had trouble with the term "repeated measurements" ever since I
>> started using statistics. During my time as a scientist I never saw an
>> experiment where time-repeated measurements were NOT involved. Normally
>> there are either before/after measurements, time series to investigate the
>> development of a measurement variable, or the repetition of a certain
>> investigation in consecutive years. Therefore I keep wondering why most
>> people start by learning basic statistics while repeated measurements are
>> always declared the "hard stuff" for self-training in the future.
>>
>> Now I've got data from an experiment repeatedly conducted in 2
>> consecutive years.
>>
>> The measurements are from trees; there are five trees exposed/not exposed
>> at each habitat (5x2x3 = 30 trees). From each tree three samples were
>> taken (i.e. n=3 pseudoreplicates). Considering the repetition in the two
>> years there are n=6 pseudoreplicates, right? And total n = 6 x 30 = 180.
>> Summary: 10 trees at each of three habitats, either exposed or not exposed
>> to blue tits. Each tree was measured three times. The whole experiment was
>> conducted twice. Balanced sample design.
>>
>> The response variable is count data (of larvae and pupae of a moth).
>> The explanatory variables are: E1) exposure to blue tits (factor,
>> yes/no); E2) the type of habitat (wood, farmland, urban) and E3) the
>> year in which the experiment was conducted.
>>
>> The random variables are R1) the tree (factor, ID 1-30) [and R2) the
>> year in which the experiment was conducted]
>>
>> In my opinion, a quite simple study design. Now, I am interested in (all
>> the possible ways of) analysing the following hypotheses:
>> H1 = blue tits reduce the number of larvae on the trees
>> H0 = there is no difference in the number of pupae/larvae between trees
>> exposed to blue tits and trees not exposed
>> Additionally I am interested in the influence of habitat type on H1 and
>> H0.
>>
>> I learned that the best way to solve problems with repeated measurements
>> is to use mixed effects models.
>>
>> My model:
>> lmer(response ~ E1 * E2 + (1|E3) + (1|R1), data)
>> and if I'm interested in differences according to the years:
>> lmer(response ~ E1 * E2 * E3 + (1|R1), data)
>>
>> Questions:
>> Is that right, or is it better to use two ANOVAs, one for each
>> consecutive year, on the means for the trees, just because everybody
>> can understand it?
>> What would be the analysis of choice if the residuals are not normally
>> distributed or are heteroscedastic? Or: do non-parametric tests not need
>> to consider random effects?
>>
>> Kind regards,
>> Quentin
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models