[R-sig-Geo] Correlation between covariates and intercept (spatstat)

Adrian Baddeley adrian.baddeley at curtin.edu.au
Sun Apr 17 12:36:31 CEST 2016


Virginia Morera Pujol <morera.virginia at gmail.com> writes:

> In trying a spatial model with spatstat I am running into a conceptual
> problem. It might be more of a general modelling doubt than a specific
> spatial problem, but I hope someone can help.

> I am running a ppm() model that includes two covariates (as pixel images),
> one is primary productivity at sea, and the other is distance to a point
> that is not included in the pattern window. That means there is no 0 value,
> the range of values goes from 400 to 1400 approx.  When I run the model and
> look at the var-covar matrix using 'vcov(model, what = "corr")' , there is
> a very strong correlation (around -0.85) between the intercept and this
> covariate. I am not sure that this is a problem, but [...]

This is about the correlation between *estimates* of the model coefficients - in this case, the correlation between the estimated intercept and the estimated coefficient of the distance covariate. Extremely high correlations could cause problems with the identifiability of the model, but that is probably not the case here. Moderately high correlations mean that the tests for individual parameters (given in the printout for the model) are not independent of one another. So if we want to select the 'significant' covariates, we shouldn't use the model printout to discard more than one variable at a time.
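To make this concrete, here is a minimal sketch. It does not use the original data: as an assumption for illustration it takes the 'bei' point pattern and the covariate images 'elev' and 'grad' from 'bei.extra', which are shipped with spatstat. It prints the correlation matrix of the coefficient estimates and then assesses each covariate by dropping one term at a time and comparing nested fits.

```r
library(spatstat)

# Stand-in data (an assumption, not the poster's data): the 'bei' point
# pattern with covariate images 'elev' and 'grad' held in 'bei.extra'.
fit <- ppm(bei ~ elev + grad, data = bei.extra)

# Correlations between the *estimated* coefficients
# (intercept, elev, grad), not between the covariates themselves.
cm <- vcov(fit, what = "corr")
print(cm)

# The printout gives one test per coefficient; these tests are not
# independent of each other when the estimates are correlated.
print(fit)

# To judge a covariate, drop one term at a time and compare nested fits.
fit_noelev <- update(fit, . ~ . - elev)
fit_nograd <- update(fit, . ~ . - grad)
print(anova(fit_noelev, fit, test = "LRT"))
print(anova(fit_nograd, fit, test = "LRT"))
```

After a term has been dropped, the remaining model should be refitted and re-examined before anything else is removed.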

>  I have tried a couple of things just in case:

> 1/ centering the covariate values around the mean just changes the sign of
> the correlation (from -0.85 to +0.85 approx).

> 2/ normalizing the covariate values, so the values go from 0 to 1 makes the
> correlation between this covariate and the intercept almost 1 (0.99) It
> also makes the effect of this covariate three orders of magnitude higher
> than the effect of the other covariate, which didn't happen before and was
> not expected from the data.

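Such transformations will change the correlation. Roughly speaking, that's because the intercept is the coefficient of a constant term, so when you add a constant to the distance covariate, you are adding a multiple of the intercept term onto the covariate. The shift is then absorbed into the intercept estimate, which changes how the two estimates vary together.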
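A small sketch of the shift, again on stand-in data rather than the original (the elevation image 'bei.extra$elev' is assumed here only because, like the distance covariate, it is strictly positive over the window). Centring leaves the slope unchanged and moves the shift into the intercept, while the correlation between the two estimates is free to change.

```r
library(spatstat)

# Stand-in covariate (an assumption): elevation, strictly positive
# over the window, playing the role of the distance covariate.
Z  <- bei.extra$elev
Zc <- Z - mean(Z)          # centred version of the same image

fit_raw <- ppm(bei ~ Z,  data = list(Z = Z))
fit_ctr <- ppm(bei ~ Zc, data = list(Zc = Zc))

# log intensity = b0 + b1 * Z = (b0 + b1 * mean(Z)) + b1 * (Z - mean(Z)),
# so the slope is unchanged and the shift is absorbed by the intercept.
b_raw <- coef(fit_raw)
b_ctr <- coef(fit_ctr)
print(rbind(raw = b_raw, centred = b_ctr))

# Intercept implied by the raw fit after centring; compare with b_ctr[1].
implied_intercept <- unname(b_raw[1] + b_raw[2] * mean(Z))
print(implied_intercept)

# The correlation between the two estimates is not invariant to the shift.
print(vcov(fit_raw, what = "corr"))
print(vcov(fit_ctr, what = "corr"))
```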

When you say the 'effect' of the covariate has increased, do you mean that the coefficient of the covariate has increased, or that the *effect term* (= coefficient x covariate value) has increased? The coefficient is expected to change when the covariate is rescaled, but I'd be surprised if the effect term changed - the models should be equivalent as regards their fitted intensity, etc.
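The distinction can be checked directly. The sketch below uses stand-in data (the elevation image 'bei.extra$elev' is an assumption, not the original covariate) and rescales the covariate to run from 0 to 1. The coefficient should be multiplied by the width of the covariate's range, while the fitted intensity should stay the same up to numerical error. For a covariate running from about 400 to 1400 the width is about 1000, so a coefficient roughly three orders of magnitude larger would be consistent with nothing more than the change of units.

```r
library(spatstat)

# Stand-in covariate (an assumption), rescaled to the interval [0, 1].
Z   <- bei.extra$elev
r   <- range(Z)
Z01 <- (Z - r[1]) / diff(r)

fit_raw <- ppm(bei ~ Z,   data = list(Z = Z))
fit_01  <- ppm(bei ~ Z01, data = list(Z01 = Z01))

# The coefficient is multiplied by the width of the covariate's range ...
ratio <- unname(coef(fit_01)[2] / coef(fit_raw)[2])
print(c(ratio = ratio, range_width = diff(r)))

# ... but the fitted intensity (and hence the effect term) is unchanged.
lam_raw <- as.matrix(predict(fit_raw))
lam_01  <- as.matrix(predict(fit_01))
max_rel_diff <- max(abs(lam_raw - lam_01) / lam_raw, na.rm = TRUE)
print(max_rel_diff)
```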

Adrian Baddeley


Prof Adrian Baddeley DSc FAA
Department of Mathematics and Statistics
Curtin University, Perth, Western Australia


