# [R-sig-ME] lme4 with simulated hierarchical data with correlated errors

Ben Bolker bbolker at gmail.com
Tue Apr 1 16:13:38 CEST 2014

This is not trivial.

The authors use a completely different estimation approach (and
describe, but do not give code for, their procedure). In Appendix 2 the
authors describe their method for random-effects estimation, which uses
a "feasible GLS estimator" from Verbeek 2000 "A guide to modern
econometrics": "For more details on the computation of the weighting
matrix, see Verbeek (2000), Hsiao (1986) and Baltagi (2001). Several
other random-effects estimation procedures for model (1) are available
that include the iterative GLS (IGLS) approach, (restricted) maximum
likelihood (REML), or Bayesian procedures (see e.g. Goldstein, 1995;
Longford, 1993)."
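
For readers who want the flavour of such an estimator, here is a rough sketch of a standard Swamy-Arora-style feasible-GLS random-effects estimator (quasi-demeaning, as described in panel-econometrics texts such as Verbeek 2000). This is my illustration under the assumption of a balanced panel with a single regressor, not the exact procedure of Ebbes et al.; `fgls_re` is an invented name.

```r
## Swamy-Arora-style feasible GLS for y_it = b0 + b1*x_it + alfa_i + e_it,
## assuming a balanced panel (every group the same size) and one regressor.
fgls_re <- function(y, x, group) {
  group <- as.factor(group)
  G  <- nlevels(group)
  Tn <- length(y) / G                     # observations per group (balanced)
  ybar <- ave(y, group)                   # group means, expanded to full length
  xbar <- ave(x, group)
  ## 1. within (fixed-effects) regression estimates the idiosyncratic variance
  w <- lm(I(y - ybar) ~ 0 + I(x - xbar))
  sig2_e <- sum(resid(w)^2) / (length(y) - G - 1)
  ## 2. between regression on group means estimates sig2_alfa + sig2_e/Tn
  gm <- !duplicated(group)
  b <- lm(ybar[gm] ~ xbar[gm])
  sig2_a <- max(sum(resid(b)^2) / (G - 2) - sig2_e / Tn, 0)
  ## 3. GLS = OLS on quasi-demeaned data, with theta from the two variances
  theta <- 1 - sqrt(sig2_e / (sig2_e + Tn * sig2_a))
  coef(lm(I(y - theta * ybar) ~ 0 +
            I(rep(1 - theta, length(y))) + I(x - theta * xbar)))
}
```

The quasi-demeaning weight `theta` interpolates between pooled OLS (`theta = 0`) and the within estimator (`theta = 1`); the estimator is "feasible" because the weights come from estimated, not known, variance components.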

It would take considerable work (at least on my part! maybe it is easy,
or already well known, for someone else on the list) to work through and
understand the characteristics of these different estimation methods.

Maybe it's a good thing that REML as implemented by lme4 is less biased?

Ben Bolker

On 14-04-01 07:05 AM, Raluca Gui wrote:
> Hello,
>
> I simulate a multilevel model, with 2 levels, of the form:
>
> y_ij = beta_0 + beta_1*x_ij + alfa_i + eta_ij
>
> Individual-level units: i = 1,...,150
> Group-level units: j = 1,...,10
> The error terms are assumed to follow N(0,1).
>
>
> I want to compute the magnitude of the bias of the estimators when there is
> correlation between x_ij and alfa_i and between x_ij and eta_ij.
>
> I assign the following true values for the beta parameters: beta_0=10,
> beta_1=2. I run 250 replications.
>
> I study 2 cases, depending on the magnitude of the correlation:
>
> i)  corr(x_ij, alfa_i) = 0 , corr(x_ij, eta_ij) = 0.3
> ii)  corr(x_ij, alfa_i) = 0.3 , corr(x_ij, eta_ij) = 0.3
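
An aside on the simulation design: a sketch of one way to draw x_ij so that such target correlations hold exactly in the population, assuming alfa_i and eta_ij are independent standard normals (`make_x` is an illustrative name, not from the thread or the paper):

```r
## Mix independent N(0,1) components so that corr(x, alfa) = r1 and
## corr(x, eta) = r2 exactly, with Var(x) = 1; requires r1^2 + r2^2 <= 1.
make_x <- function(alfa, eta, r1, r2) {
  stopifnot(r1^2 + r2^2 <= 1)
  z <- rnorm(length(eta))                     # independent residual component
  r1 * alfa + r2 * eta + sqrt(1 - r1^2 - r2^2) * z
}
```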
>
>
> What I am doing is actually a replication of a paper by Ebbes et al.
> (2004). The problem is that I do not get the same magnitude of bias as
> the authors; mine is considerably smaller.
>
> Below are the results of the authors across 250 replications (means and
> std. deviations):
>
>
> |                |            | Case i      | Case ii     |
> |----------------|------------|-------------|-------------|
> | Fixed effects  | beta_0     | -           | -           |
> |                | beta_1     | 2.43 (0.04) | 2.42 (0.04) |
> | Random effects | beta_0     | 7.88 (0.20) | 5.79 (0.30) |
> |                | beta_1     | 2.42 (0.04) | 2.84 (0.06) |
> |                | sigma_alfa | 0.99 (0.13) | 0.00 (0.00) |
> |                | sigma_eta  | 0.90 (0.03) | 0.91 (0.04) |
>
>
> My code:
>
>
> ```r
> require(lme4)
> require(MASS)
> k <- 10          # total number of firms
> n <- 15          # number of employees within each firm
> j <- k*n         # total number of employees
> beta0 <- 10
> beta1 <- 2
> rho1 <- 0.0001   # this is Case i (corr with alfa_i approximately 0)
> rho2 <- 0.3
> beta_zero <- rep(NA, 250)
> beta_one  <- rep(NA, 250)
> for (i in 1:250) {
>   # generate level-2 data
>   c <- cbind(FID = sort(rep(1:k, n)), Fint = rnorm(j, 0, 1), Z = rnorm(j, 0, 1))
>   alfa_i <- rnorm(j, mean = 0, sd = 1)
>   eta_ij <- rnorm(j, mean = 0, sd = 1)
>   # generate level-1 data
>   b <- cbind(EID = 1:j, Eint = rnorm(j, 0, 1),
>              X = rnorm(j, mean = 0, sd = 1) + eta_ij + alfa_i)
>   m1dat <- data.frame(c, b)
>   m1dat <- within(m1dat, {
>     EID <- factor(EID)
>     FID <- factor(FID)
>   })
>   outcome <- data.frame(y_ij = beta0 + beta1*m1dat[, "X"] + rho1*alfa_i + rho2*eta_ij)
>   m1dat <- cbind(m1dat, outcome)
>   rand_eff <- lmer(y_ij ~ 1 + X + (1 | FID), data = m1dat)
>   beta_zero[i] <- fixef(rand_eff)[1]
>   beta_one[i]  <- fixef(rand_eff)[2]
> }
> means <- c(mean(beta_zero), mean(beta_one))
> ```
>
>
>
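
The `means` computed at the end of the quoted code are only half of the bias calculation. A minimal standalone sketch of turning replicate estimates into a bias and a Monte Carlo standard error (the replicate values below are toy stand-ins, not results from the paper):

```r
## Bias of a replicate estimator relative to the true value, plus the
## Monte Carlo standard error of the mean estimate.
bias_summary <- function(est, true) {
  c(bias = mean(est) - true, mc_se = sd(est) / sqrt(length(est)))
}

set.seed(1)
est <- rnorm(250, mean = 2.05, sd = 0.04)  # toy stand-in for 250 beta_1 estimates
bias_summary(est, true = 2)
```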