[R-sig-ME] corAR1 question

Thu Aug 3 13:59:17 CEST 2017

Dear list,

There was a typo in my previous mail, regarding the frequency table of wave2 at the top. Here is my question once again. Sorry!!

Ben.

*--------------------

Dear list,

In dataframe "opposites" there are data from 35 individuals observed at
each of 4 consecutive timepoints, leading to 140 cases in total. The
time variable is called "wave", which takes values  1, 2, 3 and 4 for
each individual.

With the folowing syntax, I've tried to estimate a model with the corr.
matrix of the residuals having a AR1 structure:

opposites <-
read.table("https://stats.idre.ucla.edu/stat/r/examples/alda/data/opposites_pp.txt",header=TRUE,sep=",")
library(nlme)

table(opposites$wave)
auto1 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave|id),
method="REML")
summary(auto1)
corMatrix(auto1$modelStruct$corStruct)[[5]]

This seems to work fine, and produces as output:

> table(opposites$wave)
   1  3  5  4
35 35 35 35

> auto1 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave|id), 
method="REML") > summary(auto1)
Generalized least squares fit by REML
    Model: opp ~ 1
    Data: opposites
         AIC      BIC    logLik
    1405.478 1414.282 -699.7392

Correlation Structure: AR(1)
   Formula: ~wave | id
   Parameter estimate(s):
      Phi
0.75515

Coefficients:
                 Value Std.Error  t-value p-value
(Intercept) 205.0916  7.153622 28.66962       0

Standardized residuals:
           Min           Q1          Med           Q3          Max
-2.561070599 -0.666429529 -0.001817216  0.518961088  2.081296002

Residual standard error: 50.40533
Degrees of freedom: 140 total; 139 residual

> corMatrix(auto1$modelStruct$corStruct)[[5]]
            [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.7551500 0.5702515 0.4306254
[2,] 0.7551500 1.0000000 0.7551500 0.5702515
[3,] 0.5702515 0.7551500 1.0000000 0.7551500
[4,] 0.4306254 0.5702515 0.7551500 1.0000000

Next, I multiplied the "wave" variable by 2 to try to understand which
influence this would have on the estimate of phi parameter. This
resulted in:

> opposites$wave2 <- 2*opposites$wave > table(opposites$wave2)
   2  4  6  8
35 35 35 35
> auto2 <- gls(opp~1,opposites, correlation=corAR1(form = ~ wave2|id), 
method="REML") > summary(auto2)
Generalized least squares fit by REML
    Model: opp ~ 1
    Data: opposites
       AIC      BIC    logLik
    1473.5 1482.303 -733.7498

Correlation Structure: ARMA(1,0)
   Formula: ~wave2 | id
   Parameter estimate(s):
Phi1
     0

Coefficients:
                 Value Std.Error  t-value p-value
(Intercept) 204.8143  3.940234 51.98023       0

Standardized residuals:
           Min           Q1          Med           Q3          Max
-2.762981486 -0.714569461  0.003983449  0.567028639  2.256164210

Residual standard error: 46.62148
Degrees of freedom: 140 total; 139 residual
> corMatrix(auto2$modelStruct$corStruct)[[5]]
       [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1

My question is about this second output with wave2 as the time
indicator. I do not see why the Phi1 estimate has now changed to zero,
as if there would be no residual correlation at all between the time
points. I must admit that I was not sure which value to expect for phi1.

A related question is about using corAR1 in case the individuals in my
data would have been observed at different timepoints. E.g. person A at
day 1, 2, 3, 4 but person B at day 1 , 4, 5, 8. Can corAR1 still be used
given such unequally spaced data? Some documents on the internet say
that in such situation corCAR1 should be used? From Bates and Pinhiero's
book it seems to me that corCAR1 is applicable in case of continuously
code time points, like 1.7, 2.5 etc., while for integer coded time the
corAR1 method can still be used (this apart form the fact that corAR1
can also give negative phi values in contrast with corCAR1).

Thanks for any help!

Ben Pelzer.