[R-sig-Geo] kriging question
Edzer Pebesma
edzer.pebesma at uni-muenster.de
Tue Aug 26 22:39:55 CEST 2008
Dave,
Transformation to a continuous distribution when the data follow a
discrete distribution is always messy, and the back-transform may be even worse.
While you're at the library, try to pick up Diggle & Ribeiro's
Model-based geostatistics; they describe a model-based approach that
extends glm models. It seems the most appropriate way for your kind of
data. I'm not sure whether the accompanying software (packages
geoR/geoRglm) supports zero-inflated Poissons. In case it does, it
remains to be seen whether prediction will actually improve substantially.
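
To make the back-transform concern concrete, here is a minimal Python sketch of the simplest textbook case, a log transform with a Gaussian prediction on the log scale. It is not tied to gstat or geoR/geoRglm, and the values of mu and sigma2 are made up for illustration. Zero-inflated counts will not follow this model; the point is only that the prediction variance enters the back-transform, so exp() of the log-scale prediction gives the median rather than the mean on the original scale.

```python
import math
import random

def naive_backtransform(mu):
    """exp() of the log-scale prediction: the median on the original scale."""
    return math.exp(mu)

def lognormal_mean(mu, sigma2):
    """Mean on the original scale when log(Z) ~ N(mu, sigma2)."""
    return math.exp(mu + sigma2 / 2.0)

random.seed(1)
mu, sigma2 = 1.0, 0.8  # hypothetical log-scale prediction and variance

# Simulate from the assumed log-scale distribution and back-transform each draw.
draws = [math.exp(random.gauss(mu, math.sqrt(sigma2))) for _ in range(100_000)]
empirical_mean = sum(draws) / len(draws)

print("naive exp(mu):        ", round(naive_backtransform(mu), 3))
print("exp(mu + sigma2 / 2): ", round(lognormal_mean(mu, sigma2), 3))
print("simulated mean:       ", round(empirical_mean, 3))
```

The simulated mean should sit close to exp(mu + sigma2 / 2) and well above exp(mu). With ordinary kriging the correction also involves the Lagrange multiplier, which is part of why the back-transform gets messy.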
--
Edzer
Dave Depew wrote:
> Thanks Edzer,
>
> I've requested Cressie's book from our library (just waiting on it).
> My main concern was the many 0 counts. I also was not enthusiastic
> about odd transformations which then require appropriate
> back-transforms (I imagine the back transform of the kriging variance
> gets messy)
>
> I've tried several linear and non-linear combinations... none of them
> improves on the predictions generated by using OK with the
> untransformed data. I am confident that the resultant grid outputs do
> capture the spatial structure quite well. I've also tried a 10 fold
> cross validation of the kriging model - this seems to give reasonable
> estimates for mean error, mean squared prediction error and mean
> square normalized error. I had interpreted this as meaning that the
> variogram model chosen was doing a reasonable job.
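
For reference, the three cross-validation summaries mentioned above can be sketched in a few lines of Python. The function name cv_summary and the numbers are hypothetical, not output from the actual survey; the definitions follow the usual convention that the mean error should be near 0, the MSPE small, and the mean square normalized error near 1 if the kriging variances are realistic.

```python
import math

def cv_summary(observed, predicted, kriging_var):
    """Summarise cross-validation residuals (observed minus predicted)."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    # z-scores: residuals scaled by the kriging standard error
    zscores = [r / math.sqrt(v) for r, v in zip(residuals, kriging_var)]
    n = len(residuals)
    return {
        "mean_error": sum(residuals) / n,          # ideally close to 0
        "mspe": sum(r * r for r in residuals) / n,  # ideally small
        "msne": sum(z * z for z in zscores) / n,    # ideally close to 1
    }

# Made-up % cover values at five held-out locations
observed = [0.0, 0.0, 12.0, 40.0, 5.0]
predicted = [2.0, 1.0, 10.0, 35.0, 9.0]
kriging_var = [4.0, 4.0, 9.0, 16.0, 16.0]

summary = cv_summary(observed, predicted, kriging_var)
print(summary)
```

These are averages over all locations, so with many zeros it may be worth computing them separately for the zero and non-zero observations as well.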
>
> Edzer Pebesma wrote:
>> Hi Dave,
>>
>> Dave Depew wrote:
>>> Hi all,
>>> A question for the more experienced geostats users....
>>>
>>> I have a data set containing 2-3 variables relating to submerged
>>> plant characteristics inferred from acoustic survey.
>>> The distribution of the % cover variable is bounded (0-100) and
>>> highly right-skewed (many 0's). The transect spacing is quite even,
>>> and I can't seem to notice much difference between a run of ordinary
>>> kriging and a variant of RK using a zero-inflated glm of the %cover
>>> residuals.
>>> None of the other co-variates show much correlation with the data
>>> (i.e. bottom depth, x and y). Is this a possible reason why OK and
>>> RK seem to give more or less the same predictions?
>> Well, yes, if there's not much of a trend, then RK will essentially
>> simplify to OK.
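
A short derivation of why this happens, under assumptions that are not stated in the thread: the fitted trend is effectively a constant \hat\beta_0 (negligible covariate effects), and the residuals are interpolated by ordinary kriging with weights \lambda_i that sum to one. The notation is introduced here for illustration only.

```latex
% Assumed notation: z_i = observations, \hat\beta_0 = fitted constant trend,
% \lambda_i = ordinary kriging weights at s_0, with \sum_i \lambda_i = 1.
\begin{aligned}
\hat z_{\mathrm{RK}}(s_0)
  &= \hat\beta_0 + \sum_{i=1}^{n} \lambda_i \,\bigl(z_i - \hat\beta_0\bigr) \\
  &= \hat\beta_0 \Bigl(1 - \sum_{i=1}^{n} \lambda_i\Bigr) + \sum_{i=1}^{n} \lambda_i z_i \\
  &= \sum_{i=1}^{n} \lambda_i z_i \;=\; \hat z_{\mathrm{OK}}(s_0).
\end{aligned}
```

Subtracting a constant does not change the variogram, so the weights are the same in both cases. If the residuals are instead kriged with simple kriging (mean zero), the two predictions agree only approximately.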
>>>
>>> My second question relates to transformation of the target
>>> variable...in this case zero inflated distributions are difficult to
>>> transform. Is it really a requirement of kriging that the data be
>>> transformed? or just that it will generally perform better with a
>>> target variable with a distribution close to normal?
>>>
>> I believe the argument is along the following lines: kriging is the
>> BLUP in any case, but in case the data are normally distributed
>> (around the trend), the BLUP (or more exactly the BLP, simple
>> kriging) coincides with the conditional expectation, making it the
>> best possible predictor. In other cases, meaning when data are not
>> normally distributed, it is still the best linear predictor, but it
>> may very well be that there are other, better, non-linear predictors
>> that give a result much closer to the best predictor under those
>> circumstances.
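
The argument can be written out for simple kriging as follows. The symbols are assumptions introduced here, not taken from the thread: \mu is the known mean, C the covariance matrix of the observations z, and c_0 the vector of covariances between the observations and Z(s_0).

```latex
% Best linear predictor (simple kriging), valid for any distribution
% with known mean \mu and known covariances:
\hat Z_{\mathrm{SK}}(s_0) = \mu + c_0^{\top} C^{-1} (z - \mu \mathbf{1})

% If (Z(s_0), Z(s_1), \dots, Z(s_n)) is multivariate Gaussian, the
% conditional expectation has exactly the same form:
\mathrm{E}\bigl[ Z(s_0) \mid z \bigr] = \mu + c_0^{\top} C^{-1} (z - \mu \mathbf{1})
```

The conditional expectation minimises the mean squared prediction error among all predictors, linear or not, which is why the kriging predictor cannot be beaten in the Gaussian case. For non-Gaussian data the conditional expectation is in general not linear in z, and the two expressions no longer coincide.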
>>
>> If there is a transformation for that data that makes them
>> multivariate Gaussian, then transforming and kriging on that scale is
>> the way to go. A catch that has gotten very little attention is that
>> transformation typically looks at marginal distributions, and not at
>> multivariate distributions, the latter being pretty hard to check
>> with only one realisation of the random field.
>>
>> Cressie's book is a good source to read up on this; I lost my copy
>> when I moved jobs in the spring.
>> --
>> Edzer
>>
>
--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster,
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de/