[R-sig-ME] within-group averaging
Jarrod Hadfield
j.hadfield at ed.ac.uk
Thu Nov 29 10:02:13 CET 2012
Hi Jonas,
Thanks for your reply. Initially, I thought it was solely to do with
inhomogeneity too, and I agree that it plays a role. However, if you
replace cuty<-cut(y,10) with cuty<-cut(x,10) (i.e. define groups by
bins of x rather than y) then the problems with bias and type-I error
rate disappear despite the same pattern of inhomogeneity existing. This
really surprised me. At the moment it is the bias that I would most
like to fix, rather than the high type-I errors.
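
The comparison can be checked with a small repeated simulation. What
follows is only an illustrative sketch built on the code from the
original post, not something that was run for this thread: the helper
names sim_once and summarise_sims, the seed, the 500 replicates and the
0.05 threshold are all assumptions made for the example.

```r
# Sketch: repeat the simulation from the original post many times, with
# groups defined by bins of y or bins of x, and summarise the treatment
# coefficient from lm(z ~ xbar2 + treatment).
set.seed(1)  # arbitrary seed, for reproducibility only

# sim_once is a hypothetical helper: one simulated data set, one fit.
sim_once <- function(n = 100, bin_by = c("y", "x")) {
  bin_by <- match.arg(bin_by)
  x <- rnorm(n)
  y <- rnorm(n, x)
  z <- rnorm(n, y)
  treatment <- rbinom(n, 1, plogis(x - 2))

  # 10 groups defined by intervals of y (or of x)
  groups <- cut(if (bin_by == "y") y else x, 10)

  # group mean of x for everyone, true x for treated individuals
  xbar <- as.vector(tapply(x, groups, mean)[groups])
  xbar2 <- xbar
  xbar2[treatment == 1] <- x[treatment == 1]

  fit <- summary(lm(z ~ xbar2 + treatment))$coefficients
  # guard against the (very unlikely) case of no treated individuals
  if (!"treatment" %in% rownames(fit)) {
    return(c(estimate = NA_real_, p = NA_real_))
  }
  c(estimate = unname(fit["treatment", "Estimate"]),
    p = unname(fit["treatment", "Pr(>|t|)"]))
}

# summarise_sims is a hypothetical helper: mean estimate and the share
# of replicates with p < 0.05 (the true treatment effect is zero).
summarise_sims <- function(bin_by, n_sims = 500) {
  res <- replicate(n_sims, sim_once(bin_by = bin_by))
  c(mean_estimate = mean(res["estimate", ], na.rm = TRUE),
    type1_rate = mean(res["p", ] < 0.05, na.rm = TRUE))
}

results <- rbind(bin_by_y = summarise_sims("y"),
                 bin_by_x = summarise_sims("x"))
print(results)
```

Since the treatment has no causal effect, a mean estimate near zero and
a rejection rate near 0.05 are what an unbiased analysis should give, so
the two rows can be compared directly against those values.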
Cheers,
Jarrod
Quoting Jonas Klasen <klasen at mpipz.mpg.de> on Wed, 28 Nov 2012 22:17:04 +0100:
> Hi Jarrod,
> I'm not completely sure if I got you right.
>
> It is a combination of variance inhomogeneity and the unbalanced
> treatment variable. In your case the bigger group has the smaller
> variance, which makes the overall variance small and the effects
> significant. If you switch the group sizes (small group with small
> variance and big group with large variance), the effect is no longer
> significant. The first two models differ in R^2, which is an
> indication of the variance inhomogeneity too.
>
> A plot:
> plot(z[which(treatment==1)], x[which(treatment==1)])
> plot(z[which(treatment==0)], xbar[which(treatment==0)])
> #or
> plot(z[which(treatment==0)], x[which(treatment==0)])
> plot(z[which(treatment==1)], xbar[which(treatment==1)])
>
> Regards
> Jonas
>
> Hi,
>
> This is not really a specific question for mixed models but I was
> hoping someone might know the answer anyway. To make things simple,
> imagine you have a chain of causality x->y->z with some error at each
> step: x ~ N(0,1), y ~ N(x,1) and z ~ N(y,1). Observations are made on
> individuals who are grouped by (this is the important bit) the
> intervals their y values fall between. z is observed for each
> individual in a group. Although all x values are observed, it is not
> possible to say which individual within a group the values belong to.
> Therefore, xbar is a vector the same length as x where each individual
> has the mean x value of its group. We also have a treatment which does
> not have a causal effect on x, y or z, but is associated with extreme
> values of x.
>
> Both lm(z~x+treatment) and lm(z~xbar+treatment) give an average
> treatment effect of zero and uniform p-values as expected. However,
> imagine for individuals in the treated group that x values can be
> assigned such that xbar2 takes on values of xbar for the non-treated
> individuals and x for the treated individuals. In this case
> lm(z~xbar2+treatment) provides strong evidence for a treatment effect!
>
> I had an idea why this would be the case (based on differences in
> variances between xbar and x). However, the problem completely
> disappears if the groups are defined by which interval of x they occur
> in, rather than which interval of y, yet differences in variances
> between xbar and x persist under this scenario.
>
> Some code is below. If anyone has any ideas what this type of problem
> is called, why it occurs and if there are known solutions I would be
> very glad to know.
>
> Cheers,
>
> Jarrod
>
> x<-rnorm(100)
> y<-rnorm(100, x)
> z<-rnorm(100, y)
> treatment<-rbinom(100,1, plogis(x-2))
> cuty<-cut(y,10)  # get 10 groups defined by y
> xbar<-tapply(x, cuty, mean)[cuty]
> xbar2<-xbar
> xbar2[which(treatment==1)]<-x[which(treatment==1)]
> summary(lm(z~x+treatment))
> summary(lm(z~xbar+treatment))
> summary(lm(z~xbar2+treatment))
> # the treatment effect in model 3 is consistently negative and has
> # high type I error.
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC0053
>
> ______________________________________________________
>
> Jonas Klasen
> PhD student
> Genome Plasticity and Computational Genetics
> Max Planck Institute for Plant Breeding Research
> ______________________________________________________
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.