[R-sig-ME] Semantics question regarding "random effect" vs. "random effects"

Tue Jul 23 23:32:35 CEST 2013

On Jul 23, 2013, at 07:36 , Jake Westfall wrote:

> Hi Jeremy,
> 
> Both seem basically acceptable to me, but I prefer the latter, that is, the plural form.
> 
> Let's say we have some data with clusters j = 1,2,...,N and our model is: y_ij = b_0 + a_0j + e_ij. If someone talks about the variance of the "random effect" (singular), I take this to mean that they are talking about var(a_0j). So I guess the singular usage here comes from the fact that "a_0j" constitutes one and only one term in the written model equation. Fair enough.
> 
> But I prefer to think of it as: we have N random effects -- a different value of a_0 for each cluster -- and we can speak about the variance of these N values. Of course, the "variance of the random effects" in the model output will not literally equal the sample variance of the observed a_0js. But in any case we are talking about the variance of some quantity of things of which there are presumably more than one.

I think that is a red herring. The random effects are random variables, each of which has a distribution and a variance. You really do not want to mix that up with the empirical variance of the (latent!) realized values of the random variables. For a start, you may not even have identical replicates from which to calculate a variance -- consider for instance the case where var(a_0j) depends on j. Even when there are such replicates, the variance estimate from the model is not an estimate of the variance of the realized values; rather, if you could obtain the realized values, their variance would estimate the variance of the random effect(s).
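
A rough illustration of that last point, as a minimal Python sketch that is not part of the original exchange. It assumes the simplest case, in which every a_0j is drawn independently from the same normal distribution; the function name realized_variances, the values sigma_a = 2 and 10 clusters, and the seed are all made up for the example. The variance of the random effect is a fixed model parameter, while the sample variance of the realized values changes from one draw to the next and only estimates that parameter.

```python
import random
import statistics

def realized_variances(n_clusters=10, sigma_a=2.0, n_reps=2000, seed=1):
    """Draw n_clusters intercepts a_0j ~ N(0, sigma_a^2), n_reps times,
    and return the sample variance of the realized values from each draw."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_reps):
        a = [rng.gauss(0.0, sigma_a) for _ in range(n_clusters)]
        out.append(statistics.variance(a))  # unbiased: divides by n - 1
    return out

variances = realized_variances()
true_var = 2.0 ** 2  # the variance of the random effect in the model

print("variance of the random effect (model parameter):", true_var)
print("mean of the realized sample variances:",
      round(statistics.mean(variances), 3))
print("smallest and largest realized sample variance:",
      round(min(variances), 3), round(max(variances), 3))
```

In a real analysis the realized a_0j are latent, so this comparison is only possible in a simulation where they are generated directly.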

To me, it is more a question of focus: Do we talk about one individual observation y_ij and its constituent deterministic and random effects, or do we talk about the entire collection of observations and random effects? In the former case, each observation contains one random effect of cluster, shared with all other members of the same cluster.

In the latter case, there are multiple cluster effects, namely one per cluster, so it makes sense to talk about the random effect_s_. However, you then ought also to talk about their variance_s_, at least until a modeling assumption of identical variances is stipulated.
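
The plural "variances" can be made concrete with another small Python sketch, again an illustration under stated assumptions rather than anything from the original thread. The per-cluster standard deviations in sigmas, the function name pooled_variance_of_realized, and the seed are hypothetical. When var(a_0j) depends on j, there is no single variance to report; with independent effects that share a common mean of zero, the pooled sample variance of the realized values averages out to the mean of the individual variances.

```python
import random
import statistics

def pooled_variance_of_realized(sigmas, n_reps=5000, seed=2):
    """Each cluster j has its own sd sigmas[j]. Draw one a_0j per cluster,
    take the sample variance across clusters, and average over n_reps draws."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_reps):
        a = [rng.gauss(0.0, s) for s in sigmas]
        pooled.append(statistics.variance(a))
    return statistics.mean(pooled)

sigmas = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]      # hypothetical per-cluster sds
per_cluster_vars = [s ** 2 for s in sigmas]  # one variance per random effect
avg_pooled = pooled_variance_of_realized(sigmas)

print("var(a_0j) for each cluster:", per_cluster_vars)
print("average of those variances:",
      round(statistics.mean(per_cluster_vars), 3))
print("mean pooled sample variance of realized values:",
      round(avg_pooled, 3))
```

Only once identical variances are assumed does the pooled quantity correspond to "the" variance of the random effect.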

My preferences tend towards the observation-centric view, I suppose. 

- Peter D.

> 
> My two cents,
> Jake
> 
> Date: Mon, 22 Jul 2013 22:00:38 -0700
> From: helixed2 at yahoo.com
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Semantics question regarding "random effect" vs. "random effects"
> 
> Imagine a basic two-level (varying intercept) mixed-effects model, such as test scores for students clustered in classrooms.  Packages like lme4 provide an estimate of the higher-level variance.
> 
> When reporting that higher-level variance, I have seen authors mention both:
> 
> 1. the variance of the "random effect"
> 
> or alternatively:
> 
> 2. the variance of the "random effects"
> 
> I know which seems right to me, but I'd be curious to hear opinions on the proper way to write up commentary on the higher-level variance.
> 
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com