[R-sig-ME] Group size in mixed/multilevel model: How to obtain weighted effects.

Henrik Singmann henrik.singmann at psychologie.uni-freiburg.de
Sun Aug 10 18:16:38 CEST 2014

Hi Jake,

I fully agree that what the model does is not necessarily wrong. It may well be that the groups should be treated equally. However, in my case each data point corresponds to one participant and each group to one study (with somewhat different characteristics). Hence, I would like to obtain the weighted estimate, treating each data point (rather than each group) equally.

Your idea of specifically modeling group size is quite appealing. I am, however, somewhat unsure how to interpret the estimate at the "grand mean of the group sizes". Is this the "weighted effect estimate" in which each data point (i.e., participant) is weighted equally? Given how the grand mean of the group sizes is computed (each participant contributes the size of their own group), this seems to be the case.
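
To make the two centering constants concrete, here is a minimal sketch with made-up group sizes (not the data from this thread; the names group and size are only for illustration). It shows that the average group size counts every group once, while the grand mean of the group sizes counts every observation once and is therefore pulled towards the large groups:

```r
# Hypothetical toy data: three groups with 2, 4, and 12 observations
group <- factor(rep(c("a", "b", "c"), times = c(2, 4, 12)))

# Size of the group each observation belongs to (one value per row)
size <- as.vector(table(group)[as.character(group)])

# Average group size: every group counts once
avg_size <- mean(table(group))    # (2 + 4 + 12) / 3

# Grand mean of the group sizes: every observation counts once
grand_size <- mean(size)          # (2*2 + 4*4 + 12*12) / 18

c(average = avg_size, grand = grand_size)
```

In general the grand mean equals sum(n^2) / sum(n) for group sizes n, so it can never be smaller than the average group size.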

Thanks for the pointer to the paper. Will read it carefully.


PS: For interested readers, this is how to obtain the estimate with "grand mean of group sizes":

dat <- within(dat, {
   size <- table(group)[paste(group)]
   ivC <- iv - mean(iv)
   sizeC <- size - mean(size)  # "mean(size)" is the grand mean of group sizes
})
summary(lmer(dv ~ ivC*sizeC + (ivC|group), data=dat))
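
As far as I understand it, the two ways of centering size give the same model and differ only in where the effect of the IV is evaluated. Below is a rough sketch on simulated data, with the caveat that everything in it is an assumption for illustration: the data are made up, and lm() stands in for lmer() to keep the example in base R. In this simplified setting the IV slope at one centering constant equals the slope at the other plus the interaction times the shift:

```r
# Hypothetical simulated data (not the data from this thread)
set.seed(1)
n    <- c(5, 20, 80)                 # made-up group sizes
size <- rep(n, times = n)            # group size repeated for each observation
iv   <- rnorm(sum(n))
# The effect of iv shrinks towards zero as group size grows
dv   <- 0.5 - (1 - size / 80) * iv + rnorm(sum(n))

c_avg   <- mean(n)      # average group size: each group counts once
c_grand <- mean(size)   # grand mean of group sizes: each observation counts once

# Same model, two different centering constants for size
fit_avg   <- lm(dv ~ iv * I(size - c_avg))
fit_grand <- lm(dv ~ iv * I(size - c_grand))

# iv slope at the grand mean = iv slope at the average size + interaction * shift
b <- coef(fit_avg)                   # 4th coefficient is the iv:size interaction
slope_shifted <- unname(b["iv"] + b[4] * (c_grand - c_avg))
slope_grand   <- unname(coef(fit_grand)["iv"])
all.equal(slope_shifted, slope_grand)
```

So the choice of centering constant does not change the fit, only which group size the reported effect of the IV refers to.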

Am 10.08.2014 01:13, schrieb Jake Westfall:
> Hi Henrik,
> It's not obvious to me that what the model is doing is necessarily "wrong." In your data, the effect of IV on DV depends on group size. One possibility is to model this relationship explicitly by including group size as a predictor in the model which interacts with IV.
> dat <- within(dat, {
>    size <- table(group)[paste(group)]
>    ivC <- iv - mean(iv)
>    sizeC <- size - mean(table(group))
> })
> summary(lmer(dv ~ ivC*sizeC + (ivC|group), data=dat))
> There is a size*IV interaction, t = 2.39. At the average group size (n=46) there is a negative effect of IV on DV, t = -2.82. At the grand mean of the group sizes (n=85; larger groups have more influence here), there is not a reliable relationship, t = -0.62.
> You could also try looking at separating out the within-group and between-group effects of the IV, as described by Bell & Jones (2014) and others.
> Bell, A., & Jones, K. (2014). Explaining fixed effects: Random effects modeling of time-series cross-sectional and panel data. Political Science Research and Methods.
> Jake
>> Date: Sat, 9 Aug 2014 20:12:14 +0200
>> From: henrik.singmann at psychologie.uni-freiburg.de
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] Group size in mixed/multilevel model: How to obtain weighted effects.
>> Dear list,
>> I have a relatively basic question regarding the influence of group size in a simple mixed model with 1 grouping factor (i.e., a one-level multilevel model). My data has one numerical DV, one numerical IV, and one grouping factor with 19 levels. Importantly, the number of observations in each group differs dramatically, from 5 to 136 (complete data and code is given below).
>> My problem is that it seems that (a) only the small groups show an effect and not the large groups and furthermore (b) the results of a simple mixed model do not take into account that the larger groups do not show an effect, but instead tend to reflect something like unweighted means (i.e., weighting the effect of each group identically).
>> In other words: When looking at the data the overall or weighted mean (i.e., not taking grouping into account) and overall correlation between DV and IV are basically 0: Mean = -0.2 and r = -0.04.
>> In contrast, unweighted means (i.e., same weight for each group) show rather strong effects: Mean = -1.1 and r = -.26. Now it seems that the mixed model points strongly towards the unweighted means although there are dramatic differences in group sizes. The estimated mean is -1.0 and the estimated effect of the IV is also substantial.
>> After removing the four smallest groups which amount to less than 4% of all data points and are basically the only ones showing a dramatic effect, the values become much more reasonable. Estimated mean = -0.5 and effect of IV also smaller.
>> My question is what to do in such a situation:
>> - Is this a good reason to remove the small groups?
>> - Is there a way to take group size into account like in a meta-analysis?
>> - Is there literature discussing this issue?
>> Thanks in advance,
>> Henrik
>> ###### Complete example code ######
>> require(lattice)  # for plot of data
>> require(lme4)
>> require(plyr)  # for unweighted means
>> dat <- read.table("http://pastebin.com/raw.php?i=KiQ1kkew", stringsAsFactors = TRUE)  # group must be a factor for levels() below
>> #plot data
>> dat_print <- within(dat, levels(group) <- paste0(levels(group), ", n= ", table(group)))
>> xyplot(dv ~ iv|group, dat_print, panel = function(x, y) {
>>            panel.xyplot(x, y)
>>            panel.abline(lm(y ~ x))
>>          })  # number is group size
>> # weighted means
>> cor(dat$dv, dat$iv)
>> mean(dat$dv)
>> # unweighted means:
>> mean(daply(dat, .(group), function(x) cor(x$dv, x$iv)))
>> mean(daply(dat, .(group), function(x) mean(x$dv)))
>> # full model:
>> summary(lmer(dv~I(scale(iv, scale=FALSE))+(iv|group), dat))
>> # model with small groups removed:
>> groups_exclude <- c("b", "c", "d", "s")
>> ndat <- dat[!(dat$group %in% groups_exclude),]
>> summary(lmer(dv~I(scale(iv, scale=FALSE))+(iv|group), ndat))
>> --
>> Dr. Henrik Singmann
>> PostDoc
>> Albert-Ludwigs-Universität Freiburg, Germany
>> http://www.psychologie.uni-freiburg.de/Members/singmann

Dr. Henrik Singmann
Albert-Ludwigs-Universität Freiburg, Germany
