[R-sig-ME] corAR1 question and df

Nicolas Fanin nicolas.fanin at hotmail.fr
Thu Aug 3 21:05:18 CEST 2017


Hey everyone,


Thank you for your example, this is interesting.

I have a question related to corAR and the inclusion of Time as a fixed factor and its effect on the degree of freedom.


Here is a classic model:

model= lme (VarX ~ Tr*Eco, random = ~1|Subject,  correlation = corAR1 (form=~as.numeric(Time)|Subject))


                    nDF  dDF  F-value p-value
(Intercept)     1  3349  878.7115  <.0001
Tr                    6   149   23.2416  <.0001
Eco                 2    27     0.3317  0.7206
Tr:Eco           12   149    6.1074  <.0001


Denominator DF is of 149 and 27 respectively, and that's OK.

But if I use time as a fixed factor, it seems that it does not consider the autocorrelation structure.


model= lme (VarX ~ Tr*Eco*Time, random = ~1|Subject,  correlation = corAR1 (form=~as.numeric(Time)|Subject))


                     nDF dDF  F-value p-value
(Intercept)          1  3300 966.2113  <.0001
Tr                         6   149  23.3076  <.0001
Eco                       2    27   0.5112  0.6055
Time                    1  3300 584.3890  <.0001
Tr:Eco                  12   149   6.5223  <.0001
Tr:Time               6  3300  17.7468  <.0001
Eco:Time             2  3300  0.3548  0.7013
Tr:Eco:Time       12  3300   3.6850  <.0001


The explanation is likely logic, but I do not know if this is correct even though the results are similar for Tr and Eco.

If you have any idea or thoughts about this topic, I'd be greatful.


And good vacation for the lucky ones.






________________________________
De : R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> de la part de Ben Bolker <bbolker at gmail.com>
Envoyé : jeudi 3 août 2017 19:23
À : r-sig-mixed-models at r-project.org
Objet : Re: [R-sig-ME] corAR1 question


  Good point.  If you want to use non-adjacent time points you might try
corCAR1 (continuous-time first-order autoregressive ...)

On 17-08-03 01:40 PM, Ben Pelzer wrote:
> Dear list,
>
> After some searching  I think I know which coding of the time covariate
> corAR1 expects ... and where the strange results come from in case a
> coding system like 2, 4, 6, 8 is used.
>
> For anyone interested, here are my findings.
>
> The thing is that in Pinheiro and Bates, it is not very explicitly told
> (and for anywhone familiar with AR1 models, this might not be necessary
> at all...) how the time variable one uses has to coded, in case of corAR1.
>
> Suppose we have 4 time points. Then one would normally code: 1, 2, 3, 4
> (or 1, 2, 4 if a person has a missing for the third time point). One
> could also use 10,11,12,13 as time values, this would lead to exactly
> the same results. Note that adjacent integer values must be used, no
> gaps in between are allowed.
>
> However, with the "wave2" values 2, 4, 6, 8 for each person, there are
> no observations in the data for the intermediate timepoints 3, 5, 7! And
> this probably causes a problem and unexpected results like a different
> loglikelihood and a correlation phi1=0.
>
> My initial idea was that with the time intervals being twice as large
> for "wave2" ("wave2" is coded as 2, 4, 6, 8, whereas  "wave" was coded
> as 1, 2, 3, 4) than for "wave", the only difference between using
> "wave2" and "wave" in
>
> correlation=corAR1(form = ~ wave|id)
>
> versus
>
> correlation=corAR1(form = ~ wave|id)
>
> would be another value for the estimated correlation phi, but that both
> specification would have the same loglikelihood and fixed effects (only
> the intercept in my example). Now this is exactly what is found when
> running:
>
> correlation=corCAR1(form = ~ wave2|id)
>
> With this continuous-time corCAR1 specification, the loglikelihood is
> exactly equal to the one of the corAR1(form = ~wave|id) and what is
> more: the estimated phi value equals the square root of the phi value
> corAR1(form = ~wave|id). And this makes sense! If "wave3" would be coded
> as 3, 6, 9, 12
> then corCAR1(form= ~wave3|id) would again lead to the same loglikelihood
> as for corAR1(form=~wave|id) and the phi value now equals the third root
> of the phi in corAR1(form = ~wave|id).
>
> Kind regards,
>
> Ben.
>
>
>
>
> On 3-8-2017 14:07, Ben Pelzer wrote:
>> Sorry list, stupid me, again a mistake, I've been working too long I
>> guess....
>> Below, I'll give it a new try....
>>
>> The frequency table of "wave"  was again incorrect, as it should have
>> n=35 for each of its values 1, 2, 3, 4.
>>
>> Thansk for your patience.
>>
>> *-----------------------------
>>
>> Dear list,
>>
>> In dataframe "opposites" there are data from 35 individuals observed at
>> each of 4 consecutive timepoints, leading to 140 cases in total. The
>> time variable is called "wave", which takes values  1, 2, 3 and 4 for
>> each individual.
>>
>> With the folowing syntax, I've tried to estimate a model with the corr.
>> matrix of the residuals having a AR1 structure:
>>
>> opposites <-
>> read.table("https://stats.idre.ucla.edu/stat/r/examples/alda/data/opposites_pp.txt",header=TRUE,sep=",")
>>
>> library(nlme)
>>
>> table(opposites$wave)
>> auto1 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave|id),
>> method="REML")
>> summary(auto1)
>> corMatrix(auto1$modelStruct$corStruct)[[5]]
>>
>> This seems to work fine, and produces as output:
>>
>>> table(opposites$wave)
>>   1  2  3  4
>> 35 35 35 35
>>
>>> auto1 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave|id),
>> method="REML") > summary(auto1)
>> Generalized least squares fit by REML
>>    Model: opp ~ 1
>>    Data: opposites
>>         AIC      BIC    logLik
>>    1405.478 1414.282 -699.7392
>>
>> Correlation Structure: AR(1)
>>   Formula: ~wave | id
>>   Parameter estimate(s):
>>      Phi
>> 0.75515
>>
>> Coefficients:
>>                 Value Std.Error  t-value p-value
>> (Intercept) 205.0916  7.153622 28.66962       0
>>
>> Standardized residuals:
>>           Min           Q1          Med           Q3          Max
>> -2.561070599 -0.666429529 -0.001817216  0.518961088  2.081296002
>>
>> Residual standard error: 50.40533
>> Degrees of freedom: 140 total; 139 residual
>>
>>> corMatrix(auto1$modelStruct$corStruct)[[5]]
>>            [,1]      [,2]      [,3]      [,4]
>> [1,] 1.0000000 0.7551500 0.5702515 0.4306254
>> [2,] 0.7551500 1.0000000 0.7551500 0.5702515
>> [3,] 0.5702515 0.7551500 1.0000000 0.7551500
>> [4,] 0.4306254 0.5702515 0.7551500 1.0000000
>>
>>
>> Next, I multiplied the "wave" variable by 2 to try to understand which
>> influence this would have on the estimate of phi parameter. This
>> resulted in:
>>
>>> opposites$wave2 <- 2*opposites$wave > table(opposites$wave2)
>>   2  4  6  8
>> 35 35 35 35
>>
>>> auto2 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave2|id),
>> method="REML") > summary(auto2)
>> Generalized least squares fit by REML
>>    Model: opp ~ 1
>>    Data: opposites
>>       AIC      BIC    logLik
>>    1473.5 1482.303 -733.7498
>>
>> Correlation Structure: ARMA(1,0)
>>   Formula: ~wave2 | id
>>   Parameter estimate(s):
>> Phi1
>>     0
>>
>> Coefficients:
>>                 Value Std.Error  t-value p-value
>> (Intercept) 204.8143  3.940234 51.98023       0
>>
>> Standardized residuals:
>>           Min           Q1          Med           Q3          Max
>> -2.762981486 -0.714569461  0.003983449  0.567028639  2.256164210
>>
>> Residual standard error: 46.62148
>> Degrees of freedom: 140 total; 139 residual
>>> corMatrix(auto2$modelStruct$corStruct)[[5]]
>>       [,1] [,2] [,3] [,4]
>> [1,]    1    0    0    0
>> [2,]    0    1    0    0
>> [3,]    0    0    1    0
>> [4,]    0    0    0    1
>>
>> My question is about this second output with wave2 as the time
>> indicator. I do not see why the Phi1 estimate has now changed to zero,
>> as if there would be no residual correlation at all between the time
>> points. I must admit that I was not sure which value to expect for phi1.
>>
>> A related question is about using corAR1 in case the individuals in my
>> data would have been observed at different timepoints. E.g. person A at
>> day 1, 2, 3, 4 but person B at day 1 , 4, 5, 8. Can corAR1 still be used
>> given such unequally spaced data? Some documents on the internet say
>> that in such situation corCAR1 should be used? From Bates and Pinhiero's
>> book it seems to me that corCAR1 is applicable in case of continuously
>> code time points, like 1.7, 2.5 etc., while for integer coded time the
>> corAR1 method can still be used (this apart form the fact that corAR1
>> can also give negative phi values in contrast with corCAR1).
>>
[[elided Hotmail spam]]
>>
>> Ben Pelzer.
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list