[R-sig-Geo] Fitting nested variograms to empirical variograms
Ole F. Christensen
olefc at daimi.au.dk
Sat Oct 16 14:14:42 CEST 2004
Dear Edzer
Thanks for your comment. I don't think we strongly disagree on these
matters. I hope this response clarifies my current view point.
* I certainly don't want to debunk the empirical variogram. I find it
very useful as an exploratory tool. For example, the empirical variogram
might reveal pseudo-periodicity in the data and it might reveal
directional effects. For some projects there is also the question of
whether there is actually any spatial structure in the data, which a
variogram plot of residuals [or standardised residuals if you have a
GLM] would reveal. Also, plotting the empirical variogram might
reveal if something has gone wrong when fitting by m.l.e.
My recommendation : "Always plot the empirical variogram [of
standardised residuals]"
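
As an illustration of what such a plot is based on, here is a minimal
sketch of the classical method-of-moments estimator of the semivariogram,
written in plain Python. The function name, the binning by a fixed
bin_width and the max_dist cut-off are my own assumptions for the
example, not taken from any package; the input values would be the
residuals [or standardised residuals] from the fitted trend.

```python
import math
from collections import defaultdict

def empirical_variogram(coords, values, bin_width, max_dist):
    """Method-of-moments estimate of the semivariogram.

    coords : list of (x, y) tuples
    values : list of residuals, same length as coords
    Returns one (mean distance, semivariance, number of pairs) tuple
    per non-empty distance bin, ordered by distance.
    """
    # bin index -> [sum of distances, sum of squared differences, pair count]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d >= max_dist:
                continue  # ignore pairs beyond the cut-off distance
            acc = sums[int(d // bin_width)]
            acc[0] += d
            acc[1] += (values[i] - values[j]) ** 2
            acc[2] += 1
    return [(s[0] / s[2], s[1] / (2 * s[2]), s[2])
            for _, s in sorted(sums.items())]

# Hypothetical toy data: four points on a transect with a linear trend left in.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [0.0, 1.0, 2.0, 3.0]
for dist, gamma, npairs in empirical_variogram(coords, values, 1.0, 3.5):
    print(dist, gamma, npairs)
```

A semivariance that keeps growing with distance, as in this toy data,
is the kind of thing the plot makes visible at a glance [here it is an
unremoved trend].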
* I agree that the micro-scale variation component may be an important
component. Since the data do not contain any information about whether
a non-spatial component is part of the signal of interest or just random
noise, the user has to specify this himself. This is an issue no
matter what inference machinery you are using [m.l.e. or fitting to
variograms].
I can't see that we disagree about anything here [and if you see my paper
in the September 2004 issue of the Journal of Computational and Graphical
Statistics, there is a discussion about micro-scale issues for
likelihood inference in a spatial Poisson model].
* Nested variogram models. My objection to them is based on what I have
sometimes seen : a very elaborate fitting to empirical variograms, where
a lot of effort is going into fitting the variogram away from the
origin, and where the number of variogram models used in the nested
structure seems to be decided with only this fitting to the empirical
variogram in mind.
A nested model for the variogram really says that the phenomenon we are
modelling is Y(x) = Y_1(x) + Y_2(x) + Y_3(x) + Y_4(x) + ..., where the
different components have different spatial structure.
Rather than letting the empirical variogram decide the number of
components, shouldn't we start thinking about the data generating
mechanisms instead ?
When there is more than one spatial component Y_i(x), shouldn't we attempt
to interpret the different components ?
How about the implicit additivity assumption of the components when
using a nested model ? [The data generating mechanism may suggest
otherwise ... ].
A blind use of nested variogram models seems silly to me.
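
To make the additivity assumption concrete : if the components Y_i(x)
are mutually independent, the variogram of their sum is the sum of the
component variograms, and that is all a nested model is. A minimal
sketch in Python follows. The choice of component types [nugget,
exponential, spherical], the function names and the parameter values
are assumptions for illustration only.

```python
import math

def nugget(h, c0):
    """Pure nugget: zero at h == 0, constant c0 otherwise."""
    return 0.0 if h == 0 else c0

def exponential(h, sill, range_):
    """Exponential variogram with partial sill and range parameter."""
    return sill * (1.0 - math.exp(-h / range_))

def spherical(h, sill, range_):
    """Spherical variogram; reaches its partial sill exactly at h == range_."""
    if h >= range_:
        return sill
    r = h / range_
    return sill * (1.5 * r - 0.5 * r ** 3)

def nested_variogram(h, components):
    """Sum of component variograms.

    components : list of (function, parameter tuple) pairs.
    The sum is the variogram of Y_1 + Y_2 + ... only under the
    assumption that the components are mutually independent.
    """
    return sum(f(h, *params) for f, params in components)

# Hypothetical three-component structure: noise, short range, long range.
components = [
    (nugget, (0.1,)),
    (exponential, (0.5, 2.0)),
    (spherical, (0.4, 10.0)),
]
for h in (0.0, 0.5, 2.0, 10.0, 50.0):
    print(h, nested_variogram(h, components))
```

Each entry in the components list corresponds to one Y_i(x), so every
entry added is a claim about the phenomenon that ought to be
interpretable.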
* Fitting a nested variogram model. In case you want to use such a
model, then you may fit the parameters by maximum likelihood, which was
one point I tried to make in my previous mail. I see now that I may have
stressed that point a bit too hard.
I expect that, for some data sets, a procedure for finding the maximum of
the likelihood might have convergence problems because the parameters are
poorly identified. So good starting values are probably needed, but
from your previous e-mail I see that there seems to be a similar issue
for fitting to variograms. As you wrote in your previous e-mail, good
starting values can be found by fitting a nested model by eye. I also
have to admit that currently there seems to be no procedure available
in packages in R for fitting nested variogram models using maximum
likelihood [so we are lagging behind in that respect].
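
For what it is worth, the objective function itself is simple to write
down. The following is a sketch in plain Python, not an existing
package routine : it assumes zero-mean Gaussian residuals, a nugget
plus a sum of exponential covariance terms, and distinct sampling
locations. All names are hypothetical. A numerical optimiser would
minimise this function, started from values found by eye.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                l[i][j] = math.sqrt(s)
            else:
                l[i][j] = s / l[j][j]
    return l

def nested_cov(h, nugget, components):
    """Covariance at distance h: nugget at h == 0 plus exponential terms.

    components : list of (partial sill, range) pairs, one per nested structure.
    """
    c = sum(sill * math.exp(-h / rng) for sill, rng in components)
    return c + nugget if h == 0 else c

def neg_log_lik(coords, resid, nugget, components):
    """Negative Gaussian log-likelihood of zero-mean residuals."""
    n = len(coords)
    cov = [[nested_cov(math.dist(coords[i], coords[j]), nugget, components)
            for j in range(n)] for i in range(n)]
    l = cholesky(cov)
    # Forward substitution: solve L z = resid, so resid' C^{-1} resid = z'z.
    z = []
    for i in range(n):
        z.append((resid[i] - sum(l[i][k] * z[k] for k in range(i))) / l[i][i])
    log_det = 2.0 * sum(math.log(l[i][i]) for i in range(n))
    return 0.5 * (n * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))

# Hypothetical data and two candidate parameter sets, e.g. found by eye.
coords = [(0, 0), (1, 0), (0, 1), (3, 3), (5, 1)]
resid = [0.4, 0.1, 0.6, -0.8, -0.3]
print(neg_log_lik(coords, resid, 0.1, [(0.5, 2.0)]))
print(neg_log_lik(coords, resid, 0.1, [(0.3, 1.0), (0.2, 8.0)]))
```

With two or more components, quite different combinations of partial
sills and ranges can give nearly the same value of this function, which
is the identifiability problem I mention above.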
* Using the likelihood function : A certain type of books and papers
about geostatistics may have emphasised the likelihood function too
strongly.
Having been brought up as a statistician, using the likelihood function
for inference is the natural thing to me. But I have also been taught to
be careful about the model.
A model should catch the important structure of the data [here you need
input from subject matter people]. Considering and investigating the
structure of a model in many aspects is where we should spend our time.
I applaud the final sentence in your e-mail ``Geostatistics
is about modelling what's out there."
* Last comment : Your suggested comparison (ML without nested vs.
nested models, traditionally fit) is missing the point entirely, since
such a comparison would be a comparison of two different models, rather
than two procedures for inference.
Best regards
Ole
Edzer J. Pebesma wrote:
>
>
> Ole F. Christensen wrote:
>
>>
>> I have no experience fitting nested variogram models myself, but my
>> general opinion is that nested variograms aren't really useful, since
>> what matters the most is to make a good fit of the empirical variogram
>> near the origin. And if one really wants to make a very careful fit of
>> a variogram-model to the data, then the likelihood function should be
>> used rather than fitting to the empirical variogram.
>
>
> This reasoning has been put forward in the 1999 book by Michael Stein,
> which contains, besides this one, a few very provocative statements,
> such as "forget about sample variograms, only look at likelihood
> profiles". Although I like the book, the problem I have with it is that
> it contains hardly any analysis of real data. The argument therefore is
> based on theory; mathematicians do that, and they may prove right.
>
> However, nested variograms have been very useful in the past,
> especially for describing spatial variability in larger data sets.
> There are theoretical arguments for using them, think e.g. of the
> nugget effect: it consists of measurement error (a "true" nugget
> effect) and spatially correlated microvariation: a nested variogram
> model with a range so small that it's usually not detected by the
> data; see Cressie (1993) for more on this. Given it's not in the data,
> ML or REML will never pick it up, it's only something you can (and
> should) impose when you know for instance the true measurement error
> from other sources than the observed data.
>
> I would like to see papers where both approaches (ML without nested vs.
> nested models, traditionally fit) were compared with large data sets;
> I find it hard to embrace theoretical ideas without having seen them
> work in practice.
>
> Geostatistics is about modelling what's out there.
> --
> Edzer
>
--
Ole F. Christensen
BiRC - Bioinformatics Research Center
University of Aarhus