[R] Coefficient of determination in a regression model with AR(1) residuals

Thu Apr 24 07:57:53 CEST 2008

Dear R-users,

I used lm() to fit a standard linear regression model to a given data
set, which gave a coefficient of determination (R^2) of about 0.96.
After checking the residuals, I realized that they follow an
autoregressive (AR) process of order 1, which violates the i.i.d.
assumption of the regression model. I then used gls() from the nlme
package to fit a linear regression model with AR(1) residuals. The
residuals of that model look fine (residual plot, ACF, PACF, Q-Q plot,
Ljung-Box test).
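
For reference, here is a minimal sketch of the workflow described
above. It uses simulated data rather than the original data set; the
variable names (t, x, y), the sample size and the AR coefficient of 0.7
are assumptions made purely for illustration.

```r
library(nlme)  # provides gls() and corAR1()

set.seed(1)
n <- 200
x <- seq_len(n) / 10
# Simulated AR(1) errors (coefficient 0.7 is an illustrative assumption)
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
dat <- data.frame(t = seq_len(n), x = x, y = 1 + 2 * x + e)

# Ordinary least squares fit, which assumes i.i.d. errors
fit_lm <- lm(y ~ x, data = dat)

# Generalized least squares fit with an AR(1) error structure
fit_gls <- gls(y ~ x, data = dat, correlation = corAR1(form = ~ t))

# Estimated AR(1) coefficient on its natural scale
phi_hat <- coef(fit_gls$modelStruct$corStruct, unconstrained = FALSE)

# Ljung-Box tests: raw OLS residuals versus normalized GLS residuals
lb_lm  <- Box.test(residuals(fit_lm), lag = 10, type = "Ljung-Box")
lb_gls <- Box.test(residuals(fit_gls, type = "normalized"),
                   lag = 10, type = "Ljung-Box")
print(phi_hat)
print(c(lm = lb_lm$p.value, gls = lb_gls$p.value))
```

Note that the diagnostics for the gls() fit use the normalized
residuals (type = "normalized"); the raw residuals of that fit are
still autocorrelated by construction.
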
The English Wikipedia page
http://en.wikipedia.org/wiki/Coefficient_of_determination states
(quoted on 2008-04-24): "For cases other than fitting by ordinary
least squares, the R^2 statistic can be calculated as above" and
later: "Values for R^2 can be calculated for any type of predictive
model". I therefore tried to calculate the standard R^2 for the model
with AR(1) residuals. However, I ended up with an R^2 larger than 1!
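
To make the calculation concrete, the sketch below computes R^2 in two
textbook ways on simulated data: as 1 - SSE/SST and as SSR/SST. The
post does not say which formula was used, so this is only one possible
explanation; the data and the helper names r2_resid and r2_expl are
hypothetical. The two forms coincide for an OLS fit with intercept, but
need not coincide for a gls() fit, and only the first form is bounded
above by 1.

```r
library(nlme)

set.seed(2)
n <- 200
x <- seq_len(n) / 10
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
dat <- data.frame(t = seq_len(n), x = x, y = 1 + 2 * x + e)

fit_lm  <- lm(y ~ x, data = dat)
fit_gls <- gls(y ~ x, data = dat, correlation = corAR1(form = ~ t))

# R^2 as one minus residual sum of squares over total sum of squares
r2_resid <- function(y, yhat) {
  1 - sum((y - yhat)^2) / sum((y - mean(y))^2)
}
# R^2 as explained sum of squares over total sum of squares
r2_expl <- function(y, yhat) {
  sum((yhat - mean(y))^2) / sum((y - mean(y))^2)
}

yhat_lm  <- as.numeric(fitted(fit_lm))
yhat_gls <- as.numeric(fitted(fit_gls))

r2 <- rbind(
  lm  = c(resid = r2_resid(dat$y, yhat_lm),  expl = r2_expl(dat$y, yhat_lm)),
  gls = c(resid = r2_resid(dat$y, yhat_gls), expl = r2_expl(dat$y, yhat_gls))
)
print(r2)
```

The decomposition SST = SSR + SSE holds only for least squares fits
with an intercept, which is why the "explained" form is not guaranteed
to stay at or below 1 for other estimators.
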
The German Wikipedia page
(http://de.wikipedia.org/wiki/Bestimmtheitsmaß), on the other hand,
states that for models fitted by maximum likelihood estimation (MLE)
the coefficient of determination does _not_ exist (quoted on
2008-04-24: "Bei bestimmten statistischen Modellen, z.B. bei
Maximum-Likelihood-Schätzungen, existiert das Bestimmtheitsmaß R^2
nicht" [in English: "For certain statistical models, e.g. maximum
likelihood estimation, the coefficient of determination R^2 does not
exist"]). Any comments on that?

The German Wikipedia page mentions McFadden's pseudo-coefficient of
determination, the English page Nagelkerke's. I know there are others,
too. Is there general agreement on which "coefficient of
determination" (or goodness-of-fit measure in general) to use for a
regression model with autocorrelated errors? Is there a
(non-graphical) way to compare the standard regression model with the
model with AR(1) residuals, to justify that the latter fits better?
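
One candidate for such a non-graphical comparison, sketched here on
simulated data and offered only as a possibility, is to fit both models
with gls() using the same estimation method and compare them by AIC/BIC
and a likelihood ratio test via anova(). The data and object names are
assumptions for illustration.

```r
library(nlme)

set.seed(3)
n <- 200
x <- seq_len(n) / 10
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
dat <- data.frame(t = seq_len(n), x = x, y = 1 + 2 * x + e)

# Same fixed effects, fitted by maximum likelihood in both cases,
# so that the log-likelihoods are directly comparable
fit_iid <- gls(y ~ x, data = dat, method = "ML")
fit_ar1 <- gls(y ~ x, data = dat, method = "ML",
               correlation = corAR1(form = ~ t))

# Table with AIC, BIC, log-likelihood and likelihood ratio test
cmp <- anova(fit_iid, fit_ar1)
print(cmp)
```

The i.i.d. model is nested in the AR(1) model (AR coefficient equal to
zero), which is what makes the likelihood ratio test applicable here.
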

Any comments are appreciated.

Best regards.

Marius