[R-sig-ME] corAR1 question

Ben Bolker bbolker at gmail.com
Thu Aug 3 19:23:27 CEST 2017


  Good point.  If you want to use non-adjacent time points you might try
corCAR1 (continuous-time first-order autoregressive ...)

On 17-08-03 01:40 PM, Ben Pelzer wrote:
> Dear list,
> 
> After some searching  I think I know which coding of the time covariate
> corAR1 expects ... and where the strange results come from in case a
> coding system like 2, 4, 6, 8 is used.
> 
> For anyone interested, here are my findings.
> 
> The thing is that in Pinheiro and Bates, it is not very explicitly told
> (and for anywhone familiar with AR1 models, this might not be necessary
> at all...) how the time variable one uses has to coded, in case of corAR1.
> 
> Suppose we have 4 time points. Then one would normally code: 1, 2, 3, 4
> (or 1, 2, 4 if a person has a missing for the third time point). One
> could also use 10,11,12,13 as time values, this would lead to exactly
> the same results. Note that adjacent integer values must be used, no
> gaps in between are allowed.
> 
> However, with the "wave2" values 2, 4, 6, 8 for each person, there are
> no observations in the data for the intermediate timepoints 3, 5, 7! And
> this probably causes a problem and unexpected results like a different
> loglikelihood and a correlation phi1=0.
> 
> My initial idea was that with the time intervals being twice as large
> for "wave2" ("wave2" is coded as 2, 4, 6, 8, whereas  "wave" was coded
> as 1, 2, 3, 4) than for "wave", the only difference between using
> "wave2" and "wave" in
> 
> correlation=corAR1(form = ~ wave|id)
> 
> versus
> 
> correlation=corAR1(form = ~ wave|id)
> 
> would be another value for the estimated correlation phi, but that both
> specification would have the same loglikelihood and fixed effects (only
> the intercept in my example). Now this is exactly what is found when
> running:
> 
> correlation=corCAR1(form = ~ wave2|id)
> 
> With this continuous-time corCAR1 specification, the loglikelihood is
> exactly equal to the one of the corAR1(form = ~wave|id) and what is
> more: the estimated phi value equals the square root of the phi value
> corAR1(form = ~wave|id). And this makes sense! If "wave3" would be coded
> as 3, 6, 9, 12
> then corCAR1(form= ~wave3|id) would again lead to the same loglikelihood
> as for corAR1(form=~wave|id) and the phi value now equals the third root
> of the phi in corAR1(form = ~wave|id).
> 
> Kind regards,
> 
> Ben.
> 
> 
> 
> 
> On 3-8-2017 14:07, Ben Pelzer wrote:
>> Sorry list, stupid me, again a mistake, I've been working too long I
>> guess....
>> Below, I'll give it a new try....
>>
>> The frequency table of "wave"  was again incorrect, as it should have
>> n=35 for each of its values 1, 2, 3, 4.
>>
>> Thansk for your patience.
>>
>> *-----------------------------
>>
>> Dear list,
>>
>> In dataframe "opposites" there are data from 35 individuals observed at
>> each of 4 consecutive timepoints, leading to 140 cases in total. The
>> time variable is called "wave", which takes values  1, 2, 3 and 4 for
>> each individual.
>>
>> With the folowing syntax, I've tried to estimate a model with the corr.
>> matrix of the residuals having a AR1 structure:
>>
>> opposites <-
>> read.table("https://stats.idre.ucla.edu/stat/r/examples/alda/data/opposites_pp.txt",header=TRUE,sep=",")
>>
>> library(nlme)
>>
>> table(opposites$wave)
>> auto1 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave|id),
>> method="REML")
>> summary(auto1)
>> corMatrix(auto1$modelStruct$corStruct)[[5]]
>>
>> This seems to work fine, and produces as output:
>>
>>> table(opposites$wave)
>>   1  2  3  4
>> 35 35 35 35
>>
>>> auto1 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave|id), 
>> method="REML") > summary(auto1)
>> Generalized least squares fit by REML
>>    Model: opp ~ 1
>>    Data: opposites
>>         AIC      BIC    logLik
>>    1405.478 1414.282 -699.7392
>>
>> Correlation Structure: AR(1)
>>   Formula: ~wave | id
>>   Parameter estimate(s):
>>      Phi
>> 0.75515
>>
>> Coefficients:
>>                 Value Std.Error  t-value p-value
>> (Intercept) 205.0916  7.153622 28.66962       0
>>
>> Standardized residuals:
>>           Min           Q1          Med           Q3          Max
>> -2.561070599 -0.666429529 -0.001817216  0.518961088  2.081296002
>>
>> Residual standard error: 50.40533
>> Degrees of freedom: 140 total; 139 residual
>>
>>> corMatrix(auto1$modelStruct$corStruct)[[5]]
>>            [,1]      [,2]      [,3]      [,4]
>> [1,] 1.0000000 0.7551500 0.5702515 0.4306254
>> [2,] 0.7551500 1.0000000 0.7551500 0.5702515
>> [3,] 0.5702515 0.7551500 1.0000000 0.7551500
>> [4,] 0.4306254 0.5702515 0.7551500 1.0000000
>>
>>
>> Next, I multiplied the "wave" variable by 2 to try to understand which
>> influence this would have on the estimate of phi parameter. This
>> resulted in:
>>
>>> opposites$wave2 <- 2*opposites$wave > table(opposites$wave2)
>>   2  4  6  8
>> 35 35 35 35
>>
>>> auto2 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave2|id), 
>> method="REML") > summary(auto2)
>> Generalized least squares fit by REML
>>    Model: opp ~ 1
>>    Data: opposites
>>       AIC      BIC    logLik
>>    1473.5 1482.303 -733.7498
>>
>> Correlation Structure: ARMA(1,0)
>>   Formula: ~wave2 | id
>>   Parameter estimate(s):
>> Phi1
>>     0
>>
>> Coefficients:
>>                 Value Std.Error  t-value p-value
>> (Intercept) 204.8143  3.940234 51.98023       0
>>
>> Standardized residuals:
>>           Min           Q1          Med           Q3          Max
>> -2.762981486 -0.714569461  0.003983449  0.567028639  2.256164210
>>
>> Residual standard error: 46.62148
>> Degrees of freedom: 140 total; 139 residual
>>> corMatrix(auto2$modelStruct$corStruct)[[5]]
>>       [,1] [,2] [,3] [,4]
>> [1,]    1    0    0    0
>> [2,]    0    1    0    0
>> [3,]    0    0    1    0
>> [4,]    0    0    0    1
>>
>> My question is about this second output with wave2 as the time
>> indicator. I do not see why the Phi1 estimate has now changed to zero,
>> as if there would be no residual correlation at all between the time
>> points. I must admit that I was not sure which value to expect for phi1.
>>
>> A related question is about using corAR1 in case the individuals in my
>> data would have been observed at different timepoints. E.g. person A at
>> day 1, 2, 3, 4 but person B at day 1 , 4, 5, 8. Can corAR1 still be used
>> given such unequally spaced data? Some documents on the internet say
>> that in such situation corCAR1 should be used? From Bates and Pinhiero's
>> book it seems to me that corCAR1 is applicable in case of continuously
>> code time points, like 1.7, 2.5 etc., while for integer coded time the
>> corAR1 method can still be used (this apart form the fact that corAR1
>> can also give negative phi values in contrast with corCAR1).
>>
>> Thanks for any help!
>>
>> Ben Pelzer.
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



More information about the R-sig-mixed-models mailing list