[R-sig-ME] Using lme4 with very limited number of observations makes sense?

Tue Apr 28 15:38:32 CEST 2015

Dear all,

My name is Massimiliano Iraci; I am a PhD student at the University of
Salento (Lecce, Italy) and University of Cologne (Germany). My PhD is on
Phonetics and Phonology and I am working on the kinematics of speech in
Parkinson's Disease.

I am very new to statistics and have recently been switching from SPSS to
R, working mainly with linear mixed models (lme4 package). Unfortunately,
I have some doubts about using this kind of model with my data.

In my field, data acquisition is quite complicated because of the
instruments, so I always end up with few data points from few subjects.
To increase the power of my data, I usually record several (5-7)
repetitions of each item of interest.

For instance, if I want to focus on the displacement of the lower lip
during the production of a bilabial speech gesture, I consider two
conditions (voiced/unvoiced) in two contexts (singleton/geminate). Thus I
have up to 7 repetitions of the same item x 2 conditions (voiced/unvoiced)
x 2 contexts (singleton/geminate) x 10 subjects (5 pathological + 5
controls). I fit the model as follows:
lip_displacement ~ PATvsCTR * condition * context + (1|repetitions) +
(1+condition|subject) + (1+context|subject)
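
For concreteness, the design just described can be sketched in base R.
This is only an illustration under stated assumptions: the subject
numbering, the group labels "PAT"/"CTR" and the assignment of subjects
1-5 to the pathological group are placeholders, not taken from the real
data. The formula is the one given above, stored as a plain formula
object without being fitted.

```r
# Fully balanced version of the design: 7 repetitions per item.
design <- expand.grid(
  repetitions = 1:7,
  condition   = c("voiced", "unvoiced"),
  context     = c("singleton", "geminate"),
  subject     = factor(1:10)           # hypothetical subject IDs
)

# Assumption: subjects 1-5 are patients, 6-10 are controls.
design$PATvsCTR <- ifelse(as.integer(design$subject) <= 5, "PAT", "CTR")

nrow(design)             # 7 * 2 * 2 * 10 = 280 rows at most
table(design$PATvsCTR)   # 140 rows per group

# With only 5 repetitions everywhere there would be 5 * 2 * 2 * 10 = 200.

# The model formula as written above; creating it needs only base R.
# With lme4 installed it would be passed to lme4::lmer() together with
# the real data frame.
model_formula <- lip_displacement ~ PATvsCTR * condition * context +
  (1 | repetitions) + (1 + condition | subject) + (1 + context | subject)
```

So the model is estimated from somewhere between 200 and 280
observations, all coming from 10 subjects.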

I must point out that:
- I do not always have 7 repetitions of every item: some subjects were
able to produce 5, some 6, some 7, and this differs from item to item (so
the number of "same items" generally ranges from 5 to 7);
- 'repetitions' is a variable giving the ordinal position of each
repetition in chronological recording order (so it ranges from 1 to 7).

This fit very often produces errors and warnings ("large eigenvalue
ratio", "degenerate Hessian with 1 negative eigenvalues", etc.), and when
I plot the residuals against the fitted values I clearly see stripes
caused by the repetitions.

Finally, my questions are:
- Could the repetitions be a problem for the model? Would it be better to
work with the average of the repetitions, so as to have only 1 value for
each item?
- If so, does it make sense to compare such a limited number of values
from such a limited number of subjects with this model?
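
To make the first question concrete, here is a minimal base R sketch of
what averaging over the repetitions would look like. The data are
simulated (random numbers, arbitrary seed, mean and standard deviation),
and the group assignment is again a placeholder; only the column names
follow the model above.

```r
set.seed(1)  # arbitrary seed, only to make the fake data reproducible

# One row per subject x condition x context cell.
cells <- expand.grid(
  subject   = factor(1:10),            # hypothetical subject IDs
  condition = c("voiced", "unvoiced"),
  context   = c("singleton", "geminate")
)
# Assumption: subjects 1-5 are patients, 6-10 are controls.
cells$PATvsCTR <- ifelse(as.integer(cells$subject) <= 5, "PAT", "CTR")

# Each cell gets 5, 6 or 7 repetitions, as in the unbalanced real data.
cells$n_rep <- sample(5:7, nrow(cells), replace = TRUE)

# Expand to one row per recorded repetition.
d <- cells[rep(seq_len(nrow(cells)), cells$n_rep), ]
d$repetitions <- sequence(cells$n_rep)   # 1..n within each cell
d$lip_displacement <- rnorm(nrow(d), mean = 10, sd = 2)  # fake values

# Average over repetitions: one value per subject and item.
avg <- aggregate(
  lip_displacement ~ subject + PATvsCTR + condition + context,
  data = d, FUN = mean
)

nrow(avg)  # 10 subjects * 2 conditions * 2 contexts = 40 rows
```

After averaging, only 40 values remain (4 per subject), which is the
"limited number of values" that the second question refers to.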

I am sorry for my limited knowledge of statistics. I would be really
grateful if you could help me shed some light on the problem. Thank you
very much in advance for your help.

I look forward to hearing from you.
Kind regards,

Massimiliano

===============================================================

Massimiliano Mario Iraci
PhD student

CRIL (Interdisciplinary Center for Research on Language) &
DReAM (Laboratory of Research Applied to Medicine)
University of Salento & Local Health Service (ASL Lecce)
c/o Vito Fazzi Hospital
Piazza Filippo Muratore - 73100 - Lecce (Italy)

web: http://www.cril.unisalento.it/en/staff_details.php?id=123
tel: 0039 - 0832 335008