[R-sig-Geo] Best distance for a semi-variogram calculation

Tue Aug 17 18:18:58 CEST 2010

> Is there a methodology to find the best distance to be used as a limit
> for the calculation of a semi-variogram ?

Practitioner's rule of thumb: half the diagonal of the (square) domain.

> Here is an example that is problematic for me :
> I have a 300 km by 300 km grid regularly spaced points containing
> water current speed.
> I first tested for spatial trend making a linear model between current
> speed value and coordinates.

That is fine. You may need to remove a polynomial trend rather than a
linear one (that is, a linear model on higher-order terms of the
coordinates, such as x^2, x*y and y^2).
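
A minimal sketch of what I mean by a polynomial trend surface, in base R.
The grid, the column names (x, y, speed) and the synthetic trend are all
assumptions made up for illustration; substitute your own data frame.

```r
# Hypothetical data: regular grid over a 300 km x 300 km domain with a
# synthetic quadratic trend plus noise (stand-in for the current speed).
set.seed(1)
grid <- expand.grid(x = seq(0, 300, by = 10), y = seq(0, 300, by = 10))
grid$speed <- 0.5 + 0.002 * grid$x + 1e-5 * grid$y^2 +
  rnorm(nrow(grid), sd = 0.05)

# First-order (linear) trend surface
fit1 <- lm(speed ~ x + y, data = grid)

# Second-order trend surface: terms in x, y, x^2, x*y, y^2
fit2 <- lm(speed ~ poly(x, y, degree = 2), data = grid)

# Compare the nested models; a small p-value favours the quadratic surface
print(anova(fit1, fit2))

# Residuals to feed into the variogram calculation
grid$res <- residuals(fit2)
```

If the second-order surface does not improve the fit noticeably, staying
with the linear trend is the simpler choice.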

> As there were a high effect, I removed this trend making my variogram
> on the residuals of the model.
> I calculated the semi-variogram on my variable from 0 to 300 km.

Yours is a very easy example :)

  > sqrt(2)*300/2
  [1] 212.1320

That should be the maximum lag distance at which it makes sense to
calculate the experimental variogram.
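
For a domain that is not square, the same rule can be written as a small
helper. This is only a sketch: max_lag is a name I made up, and it
assumes the domain is the bounding box of the coordinates.

```r
# Half the diagonal of the bounding box of the sample coordinates
# (max_lag is a hypothetical helper, not part of any package)
max_lag <- function(x, y) {
  sqrt(diff(range(x))^2 + diff(range(y))^2) / 2
}

# Corners of the 300 km x 300 km domain
max_lag(c(0, 300), c(0, 300))
```

Note that gstat's variogram() is more conservative by default: its
cutoff is one third of the bounding-box diagonal.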

> The variance first increase, attain a local maximum then decrease, and
> re-increase acquiring a wave form, never really converging to a sill
> value ...

> 1) do you have ever met such a pattern ? And if yes what was the reason of it ?

If I understand the shape of the empirical semi-variogram correctly,
this may be a "hole effect", which you get for example in stratified
or periodic fields.
Are you calculating directional variograms or an isotropic one? Look at
the image() or contour plot of the residuals: do you see stratification
(= bands) or evident anisotropy?
Remember also the special case of the linear variogram model: it is
unbounded and never reaches a sill. This means the underlying random
function (RF) has no finite variance, so its covariance is not defined;
the RF is then not second-order stationary but only intrinsic (an
IRF-0), i.e. only its increments are stationary.
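
To make the banding check concrete, here is a rough base-R sketch for
data on a regular grid. The residual field is synthetic (bands periodic
along x with an assumed 100 km period), and gamma_grid is a helper name
I invented; on a regular grid the directional semivariance along the
axes reduces to differences between shifted rows or columns.

```r
# Hypothetical residual field: periodic bands along x plus noise
set.seed(42)
xs <- seq(0, 300, by = 10)
ys <- seq(0, 300, by = 10)
res <- outer(xs, ys, function(x, y) sin(2 * pi * x / 100)) +
  matrix(rnorm(length(xs) * length(ys), sd = 0.2), length(xs), length(ys))

# Map of the residuals: bands show up as stripes
image(xs, ys, res, xlab = "x (km)", ylab = "y (km)")

# Semivariance at a lag of k grid steps along x (rows) or y (columns)
gamma_grid <- function(m, k, along = c("x", "y")) {
  along <- match.arg(along)
  if (along == "y") m <- t(m)
  n <- nrow(m)
  d <- m[(1 + k):n, , drop = FALSE] - m[1:(n - k), , drop = FALSE]
  0.5 * mean(d^2)
}

lags <- 1:15   # 10 km to 150 km
gx <- sapply(lags, gamma_grid, m = res, along = "x")
gy <- sapply(lags, gamma_grid, m = res, along = "y")

plot(lags * 10, gx, type = "b", ylim = range(0, gx, gy),
     xlab = "lag (km)", ylab = "semivariance")
lines(lags * 10, gy, type = "b", lty = 2)
legend("topleft", legend = c("along x", "along y"), lty = 1:2)
```

In this synthetic case the variogram along x rises and falls with the
period of the bands (the hole effect), while the one along y stays flat
at the noise level. With scattered data, directional variograms from
gstat's variogram() with its alpha argument would do the same job.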

> 2) I agree that value of points located at 300 km from each other are
> certainly not linked but how to know if there is no sill in the
> semi-variogram ? Do I have to consider the distance where the first
> decrease occurred in the semi-variance plot as the limit of the range
> ?

Inferring the variogram model from an experimental one can be tricky.
As I mentioned, try directional variograms. You can also experiment
with the number of lags, for example with lag boundaries
seq(0,212,length=12) versus seq(0,212,length=8).
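
A sketch of how the two lag settings could be compared in base R. The
data are pure noise standing in for your residuals, and emp_variogram is
a hypothetical helper implementing the classical (Matheron) estimator,
not a function from any package.

```r
# Hypothetical residuals on a 300 km x 300 km grid
set.seed(7)
pts <- expand.grid(x = seq(0, 300, by = 15), y = seq(0, 300, by = 15))
pts$res <- rnorm(nrow(pts))

# Isotropic empirical semivariogram for the given lag boundaries;
# pairs farther apart than max(breaks) are dropped
emp_variogram <- function(x, y, z, breaks) {
  h   <- as.vector(dist(cbind(x, y)))  # pairwise distances
  dz  <- as.vector(dist(z))            # pairwise |z_i - z_j|, same order
  bin <- cut(h, breaks = breaks)
  data.frame(
    dist  = as.vector(tapply(h, bin, mean)),        # mean distance per lag
    gamma = as.vector(tapply(dz^2, bin, mean)) / 2, # semivariance per lag
    np    = as.vector(table(bin))                   # number of pairs
  )
}

v12 <- emp_variogram(pts$x, pts$y, pts$res, seq(0, 212, length.out = 12))
v8  <- emp_variogram(pts$x, pts$y, pts$res, seq(0, 212, length.out = 8))

plot(v12$dist, v12$gamma, type = "b", ylim = range(0, v12$gamma, v8$gamma),
     xlab = "lag (km)", ylab = "semivariance")
lines(v8$dist, v8$gamma, type = "b", lty = 2)
```

Fewer, wider lags give smoother estimates from more pairs each; more,
narrower lags show finer structure but are noisier. Features that
survive both settings are more likely to be real.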

> 3) With that kind of shape what kind of variogram model would you
> recommend to fit on it ?

"The spherical model is the geostatistician's best friend" :)
First observe the features of your field, then model those features
after appropriate preprocessing (trend filtering, accounting for
anisotropy and so on). Only after those steps should you consider a
"complicated" variogram model, such as the nested
linear+hole+nugget+spherical.
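
For reference, the spherical model with a nugget can be written in a
few lines of base R. The function name and the parameter values below
are purely illustrative, not estimates for your data.

```r
# Spherical variogram model with nugget (hypothetical helper):
# rises from the nugget at small lags to nugget + psill at h = range
spherical <- function(h, nugget, psill, range) {
  g <- ifelse(h < range,
              nugget + psill * (1.5 * h / range - 0.5 * (h / range)^3),
              nugget + psill)
  g[h == 0] <- 0   # the variogram is zero at lag zero by definition
  g
}

# Illustrative parameters only
curve(spherical(x, nugget = 0.1, psill = 0.9, range = 100),
      from = 0, to = 212, xlab = "lag (km)", ylab = "semivariance")
```

In gstat the equivalent model would be specified with vgm() using the
"Sph" type and fitted to the experimental variogram with
fit.variogram().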

Good luck,

Scion