[R-sig-ME] [R] Measuring correlations in repeated measures data

Ben Bolker bbolker at gmail.com
Wed Mar 2 03:29:47 CET 2011


On 11-03-01 06:25 PM, Brant Inman wrote:
> BEN,
> 
> Again, thanks for your input.  You are right that I was looking for
> the correlation matrix.  The issue is that I am not understanding why
> the answer given by Ista's solution is different than your solution
> to the same problem.  Ista suggested making the dataset into a wide
> dataset then calculating the correlations that way.
> 
> ###
> library(MEMSS)
> data(Orthodont)
> library(reshape)
> c.orth <- as.data.frame(cast(Orthodont, Subject + Sex ~ age,
>                              value="distance"))
> names(c.orth)[3:6] <- paste("age", names(c.orth)[3:6], sep="_")
> cor(c.orth[3:6])
> ###
> 
> Ista's approach gave a lower triangle of
> 
> 0.6256 0.7108  0.6349 0.5998  0.7593  0.7950
> 
> Your solution gave:
> 
> -0.099 0.021 -0.242 -0.298  0.184  0.262
> 
> 
> Why are these two answers different?
> 
> Brant

 [cc'ing back to r-sig-mixed]

  Our methods are calculating different things.  It's not clear which
one you want.
   Ista's method is calculating the *marginal* ('raw') correlations
between the observed values in different age categories.
   My method is calculating the correlations conditional on the model --
approximately, the correlations of the residuals from the model. (In this
case they might be exactly the same as the correlations of the
residuals; in more restricted correlation models such as corAR1
(autoregressive order-1), they're model-based.)
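
  A minimal sketch of the difference, assuming the Orthodont data
shipped with nlme (the same measurements as the MEMSS copy) and base
reshape() in place of the reshape package. The helper name wide_cor is
made up for illustration. The residual correlations in step 3 are only
a rough check and need not reproduce the corSymm estimates exactly.

```r
library(nlme)  # lme(), corSymm(), corMatrix(), and the Orthodont data

ortho <- as.data.frame(Orthodont)   # drop the groupedData classes
ortho$age0 <- ortho$age / 2 - 3     # time index 1..4, as in the earlier code

## hypothetical helper: one row per Subject, one column per age,
## then the correlation matrix of those columns
wide_cor <- function(d, v) {
  w <- reshape(d[, c("Subject", "age", v)],
               idvar = "Subject", timevar = "age", direction = "wide")
  cor(w[, -1])
}

## 1. marginal ('raw') correlations of the observed distances
marginal <- wide_cor(ortho, "distance")

## 2. model-based correlations: the corSymm estimates, expanded to a
##    full matrix (corMatrix returns one matrix per Subject; take the first)
fit  <- lme(distance ~ age, random = ~ 1 | Subject, data = ortho)
fit2 <- update(fit, correlation = corSymm(form = ~ age0 | Subject))
model_based <- corMatrix(fit2$modelStruct$corStruct)[[1]]

## 3. correlations of the within-subject residuals from the plain fit
ortho$res <- as.numeric(residuals(fit))
resid_cor <- wide_cor(ortho, "res")

round(marginal, 3)
round(model_based, 3)
round(resid_cor, 3)
```

  The marginal matrix includes the between-subject variation, which is
why its entries are large and positive. The other two are computed
after the random intercept has absorbed that variation.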

  Does that help?





>> On 11-02-28 11:59 AM, Brant Inman wrote:
>>> Ben,
>>> 
>>> Thanks for the response.  Your method generates an answer that
>>> is slightly different than what I was looking for.  In the
>>> Orthodont dataset there are 4 age groups (8, 10, 12, 14).  I
>>> would like to calculate the correlation of "distance" for all
>>> combinations of the categorical variable "age".  The anticipated
>>> output would therefore be a matrix with 4 columns and 4 rows and
>>> a diagonal of ones.
>>> 
>>> For example, in such a table I would be able to look at the mean
>>> within individual correlation coefficient for distance b/t ages 8
>>> and 10 or, alternatively, ages 10 and 14.  Is there a function in
>>> nlme or lme4 that does this?
>> 
>> Given the model below,
>> 
>> fit2$modelStruct$corStruct
>> 
>> produces
>> 
>> Correlation structure of class corSymm representing
>>  Correlation:
>>   1      2      3
>> 2 -0.099
>> 3  0.021 -0.242
>> 4 -0.298  0.184  0.262
>> 
>> (this is also shown at the end of summary(fit2))
>> 
>> This is the lower triangle of the (symmetric) correlation matrix;
>> the diagonal is 1 by definition.
>> 
>> Isn't that what you're looking for? (Sorry if I'm
>> misunderstanding.)
>> 
>> Ben
>> 
>>> 
>>> Brant
>>> 
>>> On Feb 28, 2011, at 02:24 AM, Ben Bolker <bbolker at gmail.com>
>>> wrote:
>>> 
>>>> Brant Inman <brant.inman at mac.com> writes:
>>>> 
>>>>> 
>>>>> R-helpers:
>>>>> 
>>>>> I would like to measure the correlation coefficient between
>>>>> the repeated measures of a single variable that is measured
>>>>> over time and is unbalanced. As an example, consider the
>>>>> Orthodont dataset from package nlme, where the model is:
>>>>> 
>>>>> fit <- lmer(distance ~ age + (1 | Subject), data=Orthodont)
>>>>> 
>>>>> I would like to measure the correlation b/t the variable
>>>>> "distance" at different ages such that I would have a matrix
>>>>> of correlation coefficients like the following:
>>>>> 
>>>>>       age08 age09 age10 age11 age12 age13 age14
>>>>> age08   1
>>>>> age09         1
>>>>> age10               1
>>>>> age11                     1
>>>>> age12                           1
>>>>> age13                                 1
>>>>> age14                                       1
>>>>> 
>>>>> The idea would be to demonstrate that the correlations b/t 
>>>>> repeated measures of the variable "distance" decrease as the
>>>>> time b/t measures increases. For example, one might expect the
>>>>> correlation coefficient b/t age08 and age09 to be higher
>>>>> than that between age08 and age14.
>>>>> 
>>>> 
>>>> This stuff is not currently possible in lmer/lme4 but is easy
>>>> in nlme:
>>>> 
>>>> library(nlme)
>>>> Orthodont$age0 <- Orthodont$age/2-3
>>>> ## later code requires a time index of consecutive integers
>>>> ## (which apparently must also start at 1, although not stated)
>>>> 
>>>> fit <- lme(distance~age,random=~1|Subject,data=Orthodont)
>>>> 
>>>> ## compute autocorrelation on the basis of lag only, plot
>>>> a <- ACF(fit)
>>>> plot(a,alpha=0.05)
>>>> 
>>>> fit2 <- update(fit, correlation=corSymm(form=~age0|Subject))
>>>> fit3 <- update(fit, correlation=corAR1(form=~age0|Subject))
>>>> 
>>>> AIC(fit,fit2,fit3)
>>>> ## at least on the basis of AIC, this extra complexity is
>>>> ## not warranted
>>>> 
>>>> anova(fit,fit2) ## likelihood ratio test
>>>> 
>>>> ______________________________________________ 
>>>> R-help at r-project.org <mailto:R-help at r-project.org> mailing
>>>> list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
>>>> read the posting guide 
>>>> http://www.R-project.org/posting-guide.html and provide
>>>> commented, minimal, self-contained, reproducible code.
>> 
>



