[R-sig-eco] Question about gls output

Drew Tyre atyre2 at unl.edu
Fri Jun 29 19:28:12 CEST 2012


Basil - you should do it for exactly the reason they state:
On page 138 of Zuur et al. (2009) they make the following statement
about a fitted model:

…there is a high correlation between the intercept and the slope for
arrival. This is because all arrival values are between 22 and 30 …
The intercept is the value of the response if all explanatory values
are 0 … which is obviously far outside the range of the sampled
arrival time variables. A small change in the slope can therefore have
a large change in the intercept, hence the high correlation. It would
be better to center arrival time around 0 and refit all the models.

To demonstrate, lets simulate some data using the parameters from the
fitted model:

set.seed(483709)
X = runif(100, 22, 30)
Y = 1.18 - 0.03 * X + rnorm(100, 0, 0.3)
head(cbind(X, Y))
##          X       Y
## [1,] 23.07 0.39927
## [2,] 25.73 0.02561
## [3,] 23.05 0.46457
## [4,] 26.87 0.47211
## [5,] 26.98 0.26696
## [6,] 28.76 0.50536
test.lm = lm(Y ~ X)
summary(test.lm, correlation = TRUE)
##
## Call:
## lm(formula = Y ~ X)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -0.7461 -0.1892 -0.0008  0.1733  0.6620
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   1.0661     0.3124    3.41  0.00094 ***
## X            -0.0247     0.0121   -2.04  0.04443 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.282 on 98 degrees of freedom
## Multiple R-squared: 0.0406,  Adjusted R-squared: 0.0308
## F-statistic: 4.15 on 1 and 98 DF,  p-value: 0.0444
##
## Correlation of Coefficients:
##   (Intercept)
## X -1.00
##
plot(Y ~ X)
abline(test.lm)

Notice that the correlation between the intercept and the slope is
extremely high - couldn't be higher, in fact! This does not affect the
quality of the estimate of the intercept. So now lets try the
suggested fix and see what happens.

CX = scale(X, center = TRUE, scale = FALSE)
test.Clm = lm(Y ~ CX)
summary(test.Clm, correlation = TRUE)
##
## Call:
## lm(formula = Y ~ CX)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -0.7461 -0.1892 -0.0008  0.1733  0.6620
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   0.4327     0.0282   15.36   <2e-16 ***
## CX           -0.0247     0.0121   -2.04    0.044 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.282 on 98 degrees of freedom
## Multiple R-squared: 0.0406,  Adjusted R-squared: 0.0308
## F-statistic: 4.15 on 1 and 98 DF,  p-value: 0.0444
##
## Correlation of Coefficients:
##    (Intercept)
## CX 0.00
##
So the estimate of the intercept is now much more accurate (compare
the t-values), while the estimate of the slope has gotten a bit worse.
Moreover, the correlation is now about as small as it could possibly
be. So if you want an accurate estimate of the intercept, you should
center your data.

On Thu, Jun 28, 2012 at 1:32 PM, Basil Iannone <bianno2 at uic.edu> wrote:
> Dear R users,
>
> First, I apologize. This question is not code related, but I figured I
> should ask it here given the large amount of questions pertaining to lme
> and gls models that I see posted.
>
> I am analyzing the effects of two variables (LDH and SI3) on total soil C.
> The model that I am using for this analysis is a gls model, which is below:
>
> m2<-gls(C~LDH + SI3, method = "ML", weights = vpDist)
>
> What I want to know is, can anyone guide me to a reference where I can read
> about, understand, and interpret the correlations given in the gls model
> output for correlations among parameter estimates? I have read through
> Pinheiro and Bates and Zuur (2009)'s books on mixed effects models (which
> also cover gls models), and searched the web, but I have found very little
> on this part of the model output.
>
> I ask, because I want to know if I should be concerned by a high
> correlation value between two parameters (e.g. between LDH and Intercept ~
> 0.92; see below). Zuur (2009) on page 138 gives an example of a lme model
> with a high correlation value between an explanatory value and the
> intercept and then suggests centering the explanatory value by subtracting
> the mean of the explanatory value from all individual values for this term.
> I, however, am unclear as to why one should do this, particularly if
> someone is interested in generating a reliable estimate for the intercept.
>
> Thanks again for any suggestions that anyone has for me.
>
> The output for my model is:
>
>> summary(m2)
> Generalized least squares fit by maximum likelihood
>  Model: C ~ LDH + SI3
>  Data: NULL
>       AIC      BIC    logLik
>  440.6114 453.4332 -215.3057
>
> Variance function:
>  Structure: Power of variance covariate
>  Formula: ~DistHyp
>  Parameter estimates:
>     power
> -0.7507717
>
> Coefficients:
>                Value Std.Error   t-value p-value
> (Intercept)  5.515769 2.3318502  2.365404  0.0201
> LDH         -1.671476 0.5551217 -3.011007  0.0034
> SI3          0.096973 0.0319332  3.036747  0.0031
>
>  Correlation:
>    (Intr) LDH
> LDH -0.924
> SI3 -0.161 -0.217
>
> Standardized residuals:
>        Min          Q1         Med          Q3         Max
> -2.68192339 -0.67510193 -0.05998944  0.56286828  2.55012832
>
> Residual standard error: 45.15951
> Degrees of freedom: 96 total; 93 residual
>
> --
> Basil Iannone
> University of Illinois at Chicago
> Department of Biological Sciences (MC 066)
> 845 W. Taylor St.
> Chicago, IL  60607-7060
> Email: bianno2 at uic.edu
> Phone: 312-355-3231
> Fax: 312-413-2435
> http://www2.uic.edu/~bianno2
>
>        [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

-- 
Drew Tyre

School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974

phone: +1 402 472 4054
fax: +1 402 472 2946
email: atyre2 at unl.edu
http://snr.unl.edu/tyre
http://aminpractice.blogspot.com
http://www.flickr.com/photos/atiretoo



More information about the R-sig-ecology mailing list