[R-sig-ME] Unbalanced design mixed models

Thu Apr 18 06:08:39 CEST 2013

Hi, I've wondered the same thing.  I hope others who can actually explain
*why* the estimator of the variance component is consistent even when
the groups are badly unbalanced will speak up. I can't
justify it myself, but I've stared pretty hard at some of the classic
articles on it. The last two below are classics, by a frequent
participant on this list :)  I expect that if you sharpen up your
question a bit, you could ask again and get very sharp answers.

Harville, D. A. (1977). Maximum Likelihood Approaches to Variance
Component Estimation and to Related Problems. Journal of the American
Statistical Association, 72(358), 320–338. doi:10.2307/2286796

Jennrich, R. I., & Schluchter, M. D. (1986). Unbalanced
Repeated-Measures Models with Structured Covariance Matrices.
Biometrics, 42(4), 805–820. doi:10.2307/2530695

Pinheiro, J. C., & Bates, D. M. (1995). Approximations to the
Log-Likelihood Function in the Nonlinear Mixed-Effects Model. Journal
of Computational and Graphical Statistics, 4(1), 12–35.
doi:10.2307/1390625

Lindstrom, M. J., & Bates, D. M. (1988). Newton-Raphson and EM
Algorithms for Linear Mixed-Effects Models for Repeated-Measures Data.
Journal of the American Statistical Association, 83(404), 1014–1022.
doi:10.2307/2290128
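
In the meantime, a small simulation may help show what lmer does with
many single-tree plots. This is only a rough sketch under stated
assumptions: the data are simulated (not PICE.MAR_tree), the "true"
values beta0, beta1, sd_plot and sd_resid are made up, and the plot
effects are drawn independently of tree age, which may not hold in
real data.

```r
library(lme4)

set.seed(1)

# Hypothetical unbalanced design: half the plots hold a single tree,
# the other half hold between 2 and 30 trees.
n_plots <- 200
trees_per_plot <- c(rep(1, n_plots / 2),
                    sample(2:30, n_plots / 2, replace = TRUE))
plot_name <- factor(rep(seq_len(n_plots), times = trees_per_plot))
n_trees <- length(plot_name)

# Assumed "true" parameter values, chosen only for illustration.
beta0 <- 2
beta1 <- 3
sd_plot <- 1.5
sd_resid <- 1

# Simulate ages, plot-level random intercepts, and diameters.
age_tree <- runif(n_trees, min = 10, max = 200)
plot_effect <- rnorm(n_plots, mean = 0, sd = sd_plot)
dbh_tree <- beta0 + beta1 * log(age_tree) +
  plot_effect[as.integer(plot_name)] +
  rnorm(n_trees, mean = 0, sd = sd_resid)
sim <- data.frame(dbh_tree, age_tree, plot_name)

# Same model form as in the question, plus a fit that ignores plots.
fm_sim <- lmer(dbh_tree ~ log(age_tree) + (1 | plot_name), data = sim)
fm_ols <- lm(dbh_tree ~ log(age_tree), data = sim)

fixef(fm_sim)    # fixed-effect estimates from the mixed model
coef(fm_ols)     # estimates when the plot factor is ignored
VarCorr(fm_sim)  # estimated plot and residual standard deviations
```

Single-tree plots cannot separate plot variance from residual variance
on their own, but they still contribute to the fixed-effect estimates,
so keeping them in the fit seems reasonable. In this setup both fits
should recover the slope reasonably well, because the simulated plot
effects are independent of age; the standard errors from lm would not
be trustworthy, though.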

pj

On Wed, Apr 17, 2013 at 1:14 PM, Boulanger, Yan
<Yan.Boulanger at rncan-nrcan.gc.ca> wrote:
> Hi folks,
>
> This seems a very (I mean very...) basic mixed model question, but I would like to hear your thoughts on it. I am starting from scratch with mixed models. I'm fitting this very simple mixed model:
>
> fm1 <- lmer(dbh_tree ~ log(age_tree) + (1 | plot_name), data = PICE.MAR_tree)
>
> where dbh_tree is the diameter at breast height (bh) of a tree, age_tree is the age of the tree at bh, plot_name is the plot where the tree was sampled, and PICE.MAR_tree is my dataset. This is not an experimental setup where a fixed number of trees was sampled per plot. Some plots have as many as 150 trees (very few...) whereas others have only 1... And that is the (well, one of the...) problem. How can I fit a mixed model where, in many cases (maybe 50%), there is only 1 tree per level of the random factor? So there is no variation within those levels of the random factor... I could drop the random factor, but of course this would lead to "partial" pseudoreplication. On the other hand, I could drop all plots with only one tree, but this would discard about half of the plots. Am I right in saying that the coefficients for the fixed effects are unbiased when the random factor is ignored? I'm not interested in CIs, "only" in the fixed-effect coefficients.
>
> Many thanks,
>
> Yan
>
> Yan Boulanger, Postdoctoral Visiting Fellow
> Ressources Naturelles Canada, Canadian Forest Service
> Centre de Foresterie des Laurentides
> 1055, rue du P.E.P.S.
> C.P. 10380, succ. Sainte-Foy
> Québec (Québec) Canada
> G1V 4C7
> Tel. : +1 418 649-6859
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

-- 
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu