[R-sig-ME] Does the “non-independent” data structure defined in mixed models follow the “independency” defined by probability theory?

Wed Sep 7 14:27:31 CEST 2016

Dear all,

I am a bit surprised by the previous follow-up to this question:

On 05/09/2016 at 10:08, Chen, Chun wrote:
> Dear all,
>
> I am a bit puzzled by the definition of the “nested data” or “non-independent data” structure in the mixed model.
>
> From the statistical point of view, independence is defined as the probabilities of selecting two observations not influencing each other. In this case, if I design an experiment where I purposely select two observations from the same group (or stratum), then later on we can say these two observations are dependent. However, suppose I am sampling with replacement and by coincidence I select one observation twice (e.g. throw a die twice and by coincidence get a “6” each time). The probabilities of selecting these two observations are indeed not influencing each other, and they are independent.
The assumptions are those of the model being fitted. Thus in a mixed
model one typically assumes that the _residual errors_ for each
observation are independent.
The "observations" are not independent, but the model does not assume
that the "observations" are independent in any relevant stochastic sense,
only that the residuals are.
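
To make the distinction between observations and residuals concrete, here is a
minimal simulation sketch. It assumes a hypothetical random-intercept model
y_ij = b_i + e_ij with two observations per group; the function names, the
standard deviations and the seed are illustrative choices, not anything taken
from the thread. A correlation near zero is only consistent with independence,
it does not prove it.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_pairs(n_groups=20000, sd_group=1.0, sd_resid=1.0, seed=1):
    """Simulate two observations per group under y_ij = b_i + e_ij.

    Returns the within-group correlation of the observations and of the
    residual errors (hypothetical model, illustrative parameter values).
    """
    rng = random.Random(seed)
    y1, y2, e1, e2 = [], [], [], []
    for _ in range(n_groups):
        b = rng.gauss(0.0, sd_group)   # group effect shared by both observations
        r1 = rng.gauss(0.0, sd_resid)  # residual error of the first observation
        r2 = rng.gauss(0.0, sd_resid)  # residual error, drawn separately
        y1.append(b + r1)
        y2.append(b + r2)
        e1.append(r1)
        e2.append(r2)
    return pearson(y1, y2), pearson(e1, e2)


corr_obs, corr_resid = simulate_pairs()
print("correlation between observations of the same group:", round(corr_obs, 3))
print("correlation between their residual errors:", round(corr_resid, 3))
```

With equal group and residual variances, the correlation between observations
should come out close to sd_group^2 / (sd_group^2 + sd_resid^2) = 0.5, while
the residual correlation should be close to 0.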

I don't see independence in probability as being defined as "not
influencing each other". In practice independence also means "not
affected by a common factor". A formal definition of independence of
several events is that, for every subset of these events, the joint
probability is the product of the probabilities of each event: see
e.g. Feller 1950, p.125. In the above example of sampling with
replacement, a single draw of a residual error affects two response
values, so according to the formal definition, the two residuals are
not independent. So sampling with replacement violates the assumption
of independence of residuals.
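
The product rule can be checked by exact enumeration. The sketch below is only
an illustration under assumed names (prob, satisfies_product_rule and the two
outcome lists are hypothetical): it compares two separate throws of a fair die
with a single throw whose value is recorded twice, the latter standing in for
one draw that affects two response values.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)


def prob(event, outcomes):
    """Probability of an event over a list of equally likely outcomes."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))


def first_is_6(o):
    return o[0] == 6


def second_is_6(o):
    return o[1] == 6


def both_are_6(o):
    return o[0] == 6 and o[1] == 6


def satisfies_product_rule(outcomes):
    """True if P(both are 6) equals P(first is 6) * P(second is 6)."""
    joint = prob(both_are_6, outcomes)
    return joint == prob(first_is_6, outcomes) * prob(second_is_6, outcomes)


# Scenario A: two separate throws, 36 equally likely pairs.
two_throws = list(product(FACES, FACES))

# Scenario B: one throw whose value is recorded twice, 6 equally likely pairs.
one_throw_twice = [(f, f) for f in FACES]

print("two separate throws:", satisfies_product_rule(two_throws))
print("one throw recorded twice:", satisfies_product_rule(one_throw_twice))
```

Checking one pair of events does not establish independence in general, but a
single failure of the product rule, as in scenario B, is enough to show
dependence.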

F.R.
>
> My questions are:
>
> What’s the definition of the “non-independent data” that is often referred to in mixed modeling? Is it the same concept as the “independency” defined by probability theory, which depends on how the observations are selected rather than on how alike the observations look in the final sample?
>
> Thanks
>
> Regards,
> Chun
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models