[R-sig-ME] Multivariate multilevel analysis with nlme
Corentin Gonthier
corentin.gonthier at univ-savoie.fr
Wed Feb 12 13:25:37 CET 2014
Dear R and statistics experts,
I am currently trying to analyze a dataset which I think requires a
multivariate multilevel analysis. I am unsure both of the appropriate
model and of how to fit it with R. Any insight about any aspect of the
analysis (be it the multivariate part, the multilevel part, the R
implementation...) will be extremely appreciated - even a cursory glance
to tell me if something looks wrong would be a big help.
I come from a psychology background and I am new to multilevel modeling;
despite reading a lot of material about this type of analyses my
understanding of the math is simply too superficial to be certain about
what I am doing. I would really need your help to check whether my
analysis is correct: if you're reading this, you are definitely more
skillful than I am!
***
Our study is about whether the architectural design of a clinic will
influence symptoms of a pathology for permanent residents in this clinic.
We have collected data on 13 symptoms for 8 patients per clinic in 21
clinics.
Our hypothesis is that overall, architecture has an effect on
symptomatology. There is a clinic-level IV (architecture) and we also
control for a patient-level IV (medication). We expect a cross-level
interaction.
The 13 symptom DVs are correlated +.20 on average, which I think means a
multivariate analysis is appropriate. It also seems necessary to
mitigate the risk of type I errors.
I use nlme because as far as I understand, lmer doesn't have the option
to correlate residuals as necessary for a multivariate analysis.
***
To run the multivariate analysis with nlme I have standardized my DVs,
stacked the 13 DVs in a single column, and added a dummy variable to
flag which row corresponds to which symptom. This dummy variable is
categorical rather than continuous so as to prevent R from running a
growth-curve-like analysis.
The data looks like this:
Clinic Patient Symptom Score Medication Architecture
1 1 EP1 -0.12 1
3.2
1 1 EP2 0.11 1 3.2
1 1 EP3 0.13 1 3.2
1 2 EP1 0.56 4 3.2
1 2 EP2 0.67 4 3.2
1 2 EP3 0.23 4 3.2
2 3 EP1 -0.22 3 5.1
2 3 EP2 0.25 3 5.1
2 3 EP3 0.14 3 5.1
2 4 EP1 0.78 6 5.1
2 4 EP2 0.89 6 5.1
2 4 EP3 -0.11 6 5.1
***
- To run the analysis as multivariate, I use both "Symptom" and the
"Symptom:Architecture" interaction as IVs and I remove the intercept in
both the fixed and stochastic parts of the model. I do not include the
main effect of "Architecture" as an IV.
- The effect of patient-level variables should be the same within all
clinics, so there is no random effect for "Medication".
- I do not want to constrain equality between the effect of
"Architecture" on the different Symptoms.
- Due to the multivariate nature of the analysis, I also expect the
residuals to be correlated, with different correlations between the 13
different Symptoms; therefore I specify the covariance structure of
residuals as "corSymm" (non-zero but unstructured and not invariant
across Symptoms, if I get this correctly). This gives :
correlation = corSymm(form = ~1|Patient/Clinic)
- I also expect heteroscedasticity between the different symptoms (there
should be more variance on certain Symptoms), so I add the following
option :
weights = varIdent(form = ~1|Symptom).
***
The model I come up with in nlme is therefore the following :
model1 = lme(fixed = Score ~ Symptom + Medication:Symptom +
Architecture:symptom + Medication:Architecture:Symptom - 1,
+ random = ~ Symptom - 1 | Patient/clinic,
+ correlation = corSymm(form= ~ 1|Patient/Clinic),
+ weights=varIdent(form= ~ 1|Symptom)
+ method = "ML")
In order to test the effect of the architectural variable, I then
compare this model to the following constrained model, dropping all the
terms related to architecture:
model2 = lme(fixed = Score ~ Symptom + Medication:Symptom - 1,
+ random = ~ Symptom - 1 | Patient/clinic,
+ correlation = corSymm(form= ~ 1|Patient/Clinic),
+ weights=varIdent(form= ~ 1|Symptom)
+ method = "ML")
I compare the two models with the command
anova(model1, model2)
and look at the likelihood ratio test. I have read that this is not the
best way to test the significance of fixed effects, but I don't see
another way to run the analysis as multivariate.
Overall, does this look correct to you?
Thank you so much for your invaluable insight!
Best wishes,
--Corentin Gonthier
--
Ph.D. Student in cognitive neuroscience
Department of Psychology and NeuroCognition
University of Grenoble
France
More information about the R-sig-mixed-models
mailing list