[R-sig-ME] Does the “non-independent” data structure defined in mixed models follow the “independency” defined by probability theory?

Mon Sep 5 20:51:13 CEST 2016

On Mon, Sep 5, 2016 at 4:08 AM, Chen, Chun <chun.chen at wur.nl> wrote:
> Dear all,
>
> I am a bit puzzled by the definition of the “nested data” or “non-independent data” structure in mixed models.
>
> From the statistical point of view, independency means that the probabilities of selecting two observations do not influence each other. In this case, if I design an experiment where I purposely select two observations from the same group (or stratum), then later on we can say these two observations are dependent. However, suppose I am sampling with replacement and by coincidence I select one observation twice (e.g. I throw a die twice and by coincidence get a “6” each time). The probabilities of selecting these two observations are indeed not influencing each other, so they are independent.
>
> My questions are:
>
> What’s the definition of the “non-independent data” that is often referred to in mixed modeling? Is it the same concept as the “independency” defined by probability theory, which depends on how the observations are selected, rather than on how similar the observations look in the final sample?

   (You say "questions", but there really seems to be only one
question here.)

  Yes, mixed modeling defines grouping variables based on
experimental/observational design.  That is, grouping variables are
identifiers that are believed *a priori* to be associated with
non-independence of observations with the same identifier values.

  Ben Bolker
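
A rough simulation sketch of the point above, using only the Python
standard library. It assumes a simple random-intercept setup in which
each observation is a shared group-level deviation plus an independent
residual; the names simulate_pairs, pearson, group_sd and resid_sd are
made up for this illustration and do not come from any mixed-model
package. Under that assumption, the theoretical correlation between two
observations from the same group is
group_sd^2 / (group_sd^2 + resid_sd^2), and it drops to zero when there
is no shared group effect.

```python
import random


def simulate_pairs(n_groups, group_sd, resid_sd, seed):
    """Draw two observations per group: y = group effect + residual."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_groups):
        # Deviation shared by both observations in this group.
        u = rng.gauss(0.0, group_sd)
        first.append(u + rng.gauss(0.0, resid_sd))
        second.append(u + rng.gauss(0.0, resid_sd))
    return first, second


def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Grouped design: the two observations share a group effect, so they
# are correlated (theoretical value 0.5 with equal standard deviations).
r_grouped = pearson(*simulate_pairs(2000, 1.0, 1.0, seed=1))

# No shared group effect: the two observations are independent draws.
r_indep = pearson(*simulate_pairs(2000, 0.0, 1.0, seed=2))

print(f"same-group correlation:      {r_grouped:.2f}")
print(f"no-group-effect correlation: {r_indep:.2f}")
```

In this sketch the dependence comes from the design (the shared group
deviation), not from two values happening to look alike, which is the
sense in which a grouping variable is chosen a priori.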