[R-sig-ME] A sledgehammer to crack a nut?

Tue Sep 13 16:00:48 CEST 2016

Quentin,

A general comment.

Accounting for repeated measures taken from the same observational unit
is needed only when three or more measurements have been obtained. When
there are only two measurements, one can either model change (i.e.
post - pre) or post alone, without any use of repeated measures theory
or software. In fact, if one uses repeated measures ANOVA when there are
only two measurements, the analysis "devolves" into a non-repeated
measures analysis. When we wish to model two measurements, the model can
be specified in many ways, including:
change (post - pre) = group
change = group + pre
post = group (this should be used with care, as it assumes that the pre
value is the same in all experimental groups)
post = group + pre
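
As a rough illustration, the four specifications listed above could be
fitted with ordinary lm() on a data set with one row per unit. The data
below are simulated, and the column names (group, pre, post, change) and
effect sizes are assumptions made only for this sketch.

```r
# Simulated two-measurement data; all names and numbers are illustrative.
set.seed(1)
n <- 40
d <- data.frame(
  group = factor(rep(c("control", "treated"), each = n / 2)),
  pre   = rnorm(n, mean = 50, sd = 10)
)
d$post   <- d$pre + ifelse(d$group == "treated", 5, 0) + rnorm(n, sd = 5)
d$change <- d$post - d$pre

# One row per unit, so plain lm() is enough for each specification.
m1 <- lm(change ~ group,       data = d)  # change = group
m2 <- lm(change ~ group + pre, data = d)  # change = group + pre
m3 <- lm(post   ~ group,       data = d)  # post = group
m4 <- lm(post   ~ group + pre, data = d)  # post = group + pre

# m2 and m4 give the same group estimate: subtracting pre from the
# response only shifts the coefficient on pre by exactly 1.
coef(m2)["grouptreated"]
coef(m4)["grouptreated"]
```

In this sketch the second and fourth models differ only in how the
coefficient on pre is expressed, which is why their group estimates
coincide.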

You will note that all the models listed above have at most a single
value of the outcome of interest on the right side of the equals sign;
further, there is no indication on the right side of the equals sign of
the time at which the observation was obtained. If you need to have two
or more values of the outcome of interest on the right side of the
equals sign, and thus need a variable to indicate the time at which the
observation was obtained, you need to use repeated measures techniques
and repeated measures analyses. For example, if there are three
measurements obtained from each observational unit, you would need a
model something like the following:
value = group + time, where time might equal 0, 1, and 2.
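
One possible way to fit a model of this form is a mixed model with a
random intercept per observational unit. This is only a sketch: the
random-intercept term (1 | id), the use of lme4 (assumed to be
installed), and the simulated data and names (id, group, time, value)
are assumptions, not something specified above.

```r
library(lme4)  # assumed to be installed

# Simulated long-format data: 30 units, each measured at time 0, 1, 2.
set.seed(2)
n_id <- 30
d <- expand.grid(id = factor(seq_len(n_id)), time = 0:2)
d$group <- factor(ifelse(as.integer(d$id) <= n_id / 2, "control", "treated"))

u <- rnorm(n_id, sd = 2)  # unit-level deviations shared across time points
d$value <- 10 + 1.5 * d$time + ifelse(d$group == "treated", 2, 0) +
  u[as.integer(d$id)] + rnorm(nrow(d), sd = 1)

# value = group + time, with a random intercept for the repeated unit.
fit <- lmer(value ~ group + time + (1 | id), data = d)
fixef(fit)
```

Here the data are in long format (one row per unit and time point), so
time appears as an explicit variable on the right side of the formula.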

John 

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and
Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 
>>> "Quentin Schorpp" <quentin.schorpp at thuenen.de> 09/13/16 9:18 AM >>>
Hello,

I have had trouble with the term "repeated measurements" ever since I
started to use statistics. During my time as a scientist I have never
seen an experiment where time-repeated measurements are NOT involved.
Normally there are either before/after measurements, time series to
investigate the development of a measured variable, or the repetition of
a certain investigation in consecutive years. Therefore I wonder why
most people start by learning basic statistics, while repeated
measurements are always declared the "hard stuff" for self-training in
the future.

Now I’ve got data from an experiment conducted in two consecutive years.

The measurements are from trees: there are five trees exposed and five
not exposed at each of three habitats (5 x 2 x 3 = 30 trees). From each
tree three samples were taken (i.e. n = 3 pseudoreplicates). Considering
the repetition in the two years there are n = 6 pseudoreplicates per
tree, right? And total n = 6 x 30 = 180.
Summary: 10 trees at each of three habitats, either exposed or not
exposed to blue tits. Each tree was measured three times. The whole
experiment was conducted twice. Balanced sample design.
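
The counts in this design can be checked by enumerating it with
expand.grid(). This is a sketch only: the column and level names are
made up for illustration, and it assumes the same 30 trees are sampled
in both years.

```r
# Enumerate the balanced design; every name here is illustrative.
design <- expand.grid(
  sample   = 1:3,                          # samples per tree per year
  year     = c("year1", "year2"),          # two consecutive years
  tree_rep = 1:5,                          # trees per exposure x habitat cell
  exposure = c("exposed", "not_exposed"),
  habitat  = c("wood", "farmland", "urban")
)

# A tree is identified by its habitat, exposure and within-cell number.
design$tree <- interaction(design$habitat, design$exposure,
                           design$tree_rep, drop = TRUE)

nlevels(design$tree)       # 30 distinct trees
nrow(design)               # 180 observations in total
table(table(design$tree))  # every tree contributes 6 observations
```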

The response variable is count data (larvae and pupae of a moth).
The explanatory variables are: E1) exposure to blue tits (factor,
yes/no); E2) the type of habitat (wood, farmland, urban); and E3) the
year in which the experiment was conducted.

The random variables are R1) the tree (factor, ID 1-30) [and R2) the
year in which the experiment was conducted].

In my opinion, a quite simple study design. Now, I am interested in (all
the possible ways of) analysing the following hypotheses:
H1 = blue tits reduce the number of larvae on the trees
H0 = there is no difference in the number of pupae/larvae between trees
exposed to blue tits and trees not exposed
Additionally, I am interested in the influence of habitat type on H1 and
H0.

I learned that the best way to solve problems with repeated measurements
is to use mixed effects models.

My model:
lmer(response ~ E1 * E2 + (1|E3) + (1|R1), data)
and if I’m interested in differences according to the years:
lmer(response ~ E1 * E2 * E3 + (1|R1), data)
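
To see the two calls above run end to end, here is a sketch on simulated
counts laid out like the design described earlier. The simulated effect
sizes, the data frame name dat, and the helper columns sample and rep
are assumptions for illustration; lme4 is assumed to be installed.

```r
library(lme4)  # assumed to be installed

# Simulated stand-in for the real data: 30 trees x 2 years x 3 samples.
set.seed(3)
dat <- expand.grid(
  sample = 1:3,
  E3  = factor(c("year1", "year2")),           # year
  rep = 1:5,                                   # tree number within a cell
  E1  = factor(c("no", "yes")),                # exposure to blue tits
  E2  = factor(c("wood", "farmland", "urban")) # habitat
)
dat$R1 <- interaction(dat$E2, dat$E1, dat$rep, drop = TRUE)  # tree ID

# Counts with a made-up exposure effect and tree-to-tree variation.
tree_eff <- rnorm(nlevels(dat$R1), sd = 0.3)
mu <- exp(2 - 0.5 * (dat$E1 == "yes") + tree_eff[as.integer(dat$R1)])
dat$response <- rpois(nrow(dat), mu)

# Year as a random intercept, then year as a fixed factor.
m_year_random <- lmer(response ~ E1 * E2 + (1 | E3) + (1 | R1), data = dat)
m_year_fixed  <- lmer(response ~ E1 * E2 * E3 + (1 | R1), data = dat)

fixef(m_year_random)
fixef(m_year_fixed)
```

With only two years, the variance for (1 | E3) is estimated from two
levels, so a singular-fit message from the first call would not be
surprising.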

Questions:
Is that right, or is it better to use two ANOVAs, one for each year, on
the tree means, just because everybody can understand them?
What would be the analysis of choice if the residuals are not normally
distributed or are heteroscedastic? Or: do non-parametric tests not need
to consider random effects?

Kind regards,
Quentin

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models