[R-sig-ME] within-group averaging

Jarrod Hadfield j.hadfield at ed.ac.uk
Wed Nov 28 18:53:44 CET 2012


Hi,

This is not really a specific question for mixed models but I was  
hoping someone might know the answer anyway.

To make things simple, imagine you have a chain of causality x->y->z  
with some error at each step:  x ~ N(0,1),  y ~ N(x,1) and z ~ N(y, 1).

Observations are made on individuals who are grouped by (this is the
important bit) the interval their y value falls into.

z is observed for each individual in a group. Although all x values
are observed, it is not possible to say which individual within a group
each value belongs to. Therefore, xbar is a vector of the same length as
x in which each individual is assigned the mean x of its group.

We also have a treatment which does not have a causal effect on x, y
or z, but is associated with extreme (here, high) values of x.

Both  lm(z~x+treatment) and lm(z~xbar+treatment) give an average  
treatment effect of zero and uniform p-values as expected.

However, imagine that for individuals in the treated group the x values
can be assigned to individuals, so that xbar2 takes the value of xbar
for the non-treated individuals and of x for the treated individuals.
In this case lm(z~xbar2+treatment) provides strong evidence for a
treatment effect!

I had an idea why this would be the case (based on differences in  
variances between xbar and x). However, the problem completely  
disappears if the groups are defined by which interval of x they occur  
in, rather than which interval of y, yet differences in variances  
between xbar and x persist under this scenario.
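
A rough way to check the variance comparison is sketched here. It
assumes the same data-generating setup as in this post; cutx, xbar_y
and xbar_x are names introduced only for this illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

x <- rnorm(100)
y <- rnorm(100, x)

cuty <- cut(y, 10)  # groups defined by intervals of y
cutx <- cut(x, 10)  # groups defined by intervals of x

# group mean of x, expanded to one value per individual, under each grouping
xbar_y <- as.numeric(tapply(x, cuty, mean)[as.integer(cuty)])
xbar_x <- as.numeric(tapply(x, cutx, mean)[as.integer(cutx)])

# sample variances of the individual values and of the two group-mean vectors
vars <- c(x = var(x), xbar_y = var(xbar_y), xbar_x = var(xbar_x))
print(vars)
```

Both group-mean vectors have a smaller sample variance than x, since
averaging removes the within-group spread.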

Some code is below. If anyone has any ideas what this type of problem  
is called, why it occurs and if there are known solutions I would be  
very glad to know.

Cheers,

Jarrod

x<-rnorm(100)
y<-rnorm(100, x)
z<-rnorm(100, y)

treatment<-rbinom(100,1, plogis(x-2))

cuty<-cut(y,10) # get 10 groups defined by y

xbar<-tapply(x, cuty, mean)[cuty] # each individual gets its group's mean x

xbar2<-xbar
xbar2[which(treatment==1)]<-x[which(treatment==1)]

summary(lm(z~x+treatment))
summary(lm(z~xbar+treatment))
summary(lm(z~xbar2+treatment))

# the treatment effect in model 3 is consistently negative and has
# high type I error.
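
To check the type I error over many simulated data sets rather than
one, the code above could be wrapped in a function and replicated.
This is only a sketch: sim_once(), its group_by argument and nsim are
hypothetical names introduced here, and the 0.05 threshold is an
arbitrary choice.

```r
# One simulated data set; returns the treatment p-value from each model.
sim_once <- function(n = 100, group_by = c("y", "x")) {
  group_by <- match.arg(group_by)
  x <- rnorm(n)
  y <- rnorm(n, x)
  z <- rnorm(n, y)
  treatment <- rbinom(n, 1, plogis(x - 2))

  # 10 groups defined by intervals of y (as in the post) or of x
  grp <- cut(if (group_by == "y") y else x, 10)
  xbar <- as.numeric(tapply(x, grp, mean)[as.integer(grp)])
  xbar2 <- ifelse(treatment == 1, x, xbar)  # individual x only for treated

  # p-value of the treatment term; NA if the term could not be estimated
  pval <- function(fit) {
    co <- coef(summary(fit))
    if ("treatment" %in% rownames(co)) co["treatment", "Pr(>|t|)"] else NA_real_
  }

  c(x     = pval(lm(z ~ x + treatment)),
    xbar  = pval(lm(z ~ xbar + treatment)),
    xbar2 = pval(lm(z ~ xbar2 + treatment)))
}

set.seed(1)  # arbitrary seed, only for reproducibility
nsim <- 300
p_y <- replicate(nsim, sim_once(group_by = "y"))  # 3 x nsim matrix

# proportion of simulations with p < 0.05 for each model
rates <- rowMeans(p_y < 0.05, na.rm = TRUE)
print(rates)
```

Calling sim_once(group_by = "x") inside replicate() reruns the same
comparison with groups defined by intervals of x instead of y.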







