[R-sig-ME] within-group averaging
j.hadfield at ed.ac.uk
Wed Nov 28 18:53:44 CET 2012
This is not really a specific question for mixed models but I was
hoping someone might know the answer anyway.
To make things simple, imagine you have a chain of causality x->y->z
with some error at each step: x ~ N(0,1), y ~ N(x,1) and z ~ N(y, 1).
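
As a minimal sketch, the chain could be simulated as follows. The sample size n and the seed are assumptions made for illustration only; neither is specified in the description.

```r
set.seed(1)            # assumed seed, only so the sketch is reproducible
n <- 1000              # assumed sample size
x <- rnorm(n, 0, 1)    # x ~ N(0,1)
y <- rnorm(n, x, 1)    # y ~ N(x,1): the mean is the individual's x
z <- rnorm(n, y, 1)    # z ~ N(y,1): the mean is the individual's y

# x influences z only through y; the marginal variances are 1, 2 and 3
c(var_x = var(x), var_y = var(y), var_z = var(z))
```
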
Observations are made on individuals who are grouped by (this is the
important bit) the interval into which their y values fall.
z is observed for each individual in a group. Although all x values
are observed, it is not possible to say which individual within a group
each value belongs to. Therefore, xbar is a vector the same length as
x in which each individual has the mean x value of its group.
We also have a treatment which does not have a causal effect on x, y
or z, but is associated with extreme values of x.
Both lm(z~x+treatment) and lm(z~xbar+treatment) give an average
treatment effect of zero and uniform p-values as expected.
However, imagine that for individuals in the treated group x values
can be assigned, so that xbar2 takes the value of xbar for the
non-treated individuals and of x for the treated individuals. In this
case lm(z~xbar2+treatment) provides strong evidence for a treatment
effect.
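
The three models could be set up roughly as in the sketch below. The treatment rule is an assumption: all that is stated is that treatment is associated with extreme values of x and has no causal effect, so a logistic rule that favours high x is used here purely for illustration. The seed and sample size are assumptions as well.

```r
set.seed(1)                                      # assumed seed
n <- 1000                                        # assumed sample size
x <- rnorm(n); y <- rnorm(n, x); z <- rnorm(n, y)

# ASSUMPTION: treatment is more likely at high x; no effect on x, y or z
treatment <- rbinom(n, 1, plogis(2 * x - 2))

cuty  <- cut(y, 10)                              # 10 groups defined by y
xbar  <- as.vector(tapply(x, cuty, mean)[cuty])  # group mean of x for everyone
xbar2 <- ifelse(treatment == 1, x, xbar)         # own x if treated, else xbar

m1 <- lm(z ~ x + treatment)
m2 <- lm(z ~ xbar + treatment)
m3 <- lm(z ~ xbar2 + treatment)

# treatment estimate and p-value from each model
sapply(list(m1 = m1, m2 = m2, m3 = m3), function(m)
  coef(summary(m))["treatment", c("Estimate", "Pr(>|t|)")])
```

A single run only gives one draw per model; judging whether the p-values are uniform needs many replicates.
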
I had an idea why this would be the case (based on differences in
variances between xbar and x). However, the problem completely
disappears if the groups are defined by which interval of x they occur
in, rather than which interval of y, yet differences in variances
between xbar and x persist under this scenario.
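
One way to put the two grouping schemes side by side is sketched below. The helper model3_by() is hypothetical, and the treatment rule, seed and sample size are assumptions made for illustration, not part of the original setup.

```r
set.seed(1)                                    # assumed seed
n <- 1000                                      # assumed sample size
x <- rnorm(n); y <- rnorm(n, x); z <- rnorm(n, y)
treatment <- rbinom(n, 1, plogis(2 * x - 2))   # ASSUMPTION: favours high x

# hypothetical helper: group on `by`, build xbar and xbar2, fit
# lm(z ~ xbar2 + treatment) and return variances and the treatment term
model3_by <- function(by) {
  grp   <- cut(by, 10)
  xbar  <- as.vector(tapply(x, grp, mean)[grp])
  xbar2 <- ifelse(treatment == 1, x, xbar)
  fit   <- coef(summary(lm(z ~ xbar2 + treatment)))
  c(var_x    = var(x),
    var_xbar = var(xbar),
    estimate = fit["treatment", "Estimate"],
    p_value  = fit["treatment", "Pr(>|t|)"])
}

res <- rbind(groups_by_y = model3_by(y), groups_by_x = model3_by(x))
res
```

In both rows var_xbar is smaller than var_x, since group means cannot vary more than the raw values, so the variance difference is present under either grouping.
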
Some code is below. If anyone has any ideas what this type of problem
is called, why it occurs and if there are known solutions I would be
very glad to know.
cuty<-cut(y,10) # get 10 groups defined by y
xbar<-tapply(x, cuty, mean)[cuty]
# the treatment effect in model 3 (z~xbar2+treatment) is consistently
# negative and has a high type I error rate.
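
A rough way to estimate the type I error rate of each model is to repeat the simulation many times and count how often the treatment term is significant at the 5% level. This is only a sketch: the helper sim_p(), the treatment rule, the sample size and the number of replicates are all assumptions chosen for illustration.

```r
set.seed(1)                                      # assumed seed

# hypothetical helper: one simulated data set, returning the treatment
# p-value from each of the three models
sim_p <- function(n = 500) {                     # assumed sample size
  x <- rnorm(n); y <- rnorm(n, x); z <- rnorm(n, y)
  treatment <- rbinom(n, 1, plogis(2 * x - 2))   # ASSUMPTION: favours high x
  cuty  <- cut(y, 10)                            # 10 groups defined by y
  xbar  <- as.vector(tapply(x, cuty, mean)[cuty])
  xbar2 <- ifelse(treatment == 1, x, xbar)
  fits  <- list(m1 = lm(z ~ x + treatment),
                m2 = lm(z ~ xbar + treatment),
                m3 = lm(z ~ xbar2 + treatment))
  sapply(fits, function(m) coef(summary(m))["treatment", "Pr(>|t|)"])
}

pvals <- replicate(200, sim_p())                 # 3 x 200 matrix of p-values
rowMeans(pvals < 0.05)                           # rejection rate per model
```

With no true treatment effect, a well-behaved model should reject in roughly 5% of replicates.
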