[R-sig-Geo] kriging question

Tue Aug 26 22:39:55 CEST 2008

Dave,

Transforming data that follow a discrete distribution to a continuous 
one is always messy, and the back-transform may be messier still.

While you're at the library, try to pick up Diggle & Ribeiro's 
Model-based Geostatistics; they describe a model-based approach that 
extends generalized linear models (GLMs). It seems the most appropriate 
approach for your kind of data. I'm not sure whether the accompanying 
software (packages geoR/geoRglm) supports zero-inflated Poisson models. 
Even if it does, it remains to be seen whether prediction will actually 
improve substantially.
--
Edzer

Dave Depew wrote:
> Thanks Edzer,
>
> I've requested Cressie's book from our library (just waiting on it).
> My main concern was the many 0 counts. I also was not enthusiastic 
> about odd transformations which then require appropriate 
> back-transforms (I imagine the back transform of the kriging variance 
> gets messy)
>
> I've tried several linear and non-linear combinations... none of them 
> improve on the predictions generated by using OK with the 
> untransformed data. I am confident that the resultant grid outputs do 
> capture the spatial structure quite well. I've also tried a 10-fold 
> cross validation of the kriging model - this seems to give reasonable 
> estimates for mean error, mean squared prediction error and mean 
> square normalized error. I had interpreted this as meaning that the 
> variogram model chosen was doing a reasonable job.
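
The three cross-validation summaries mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not output from gstat or any other package; the function name `cv_diagnostics` and the numbers are made up for the example. It assumes you already have, for each held-out point, the observed value, the kriging prediction and the kriging variance.

```python
def cv_diagnostics(observed, predicted, kriging_var):
    """Summarise cross-validation residuals (hypothetical helper)."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    # Mean error: ideally close to 0 (no systematic bias).
    mean_error = sum(residuals) / n
    # Mean squared prediction error: smaller is better.
    mspe = sum(r * r for r in residuals) / n
    # Mean squared normalized error: squared residual divided by the
    # kriging variance; ideally close to 1 if the variances are realistic.
    msne = sum(r * r / v for r, v in zip(residuals, kriging_var)) / n
    return mean_error, mspe, msne


# Invented values for illustration only (not the survey data).
obs = [0.0, 5.0, 12.0, 0.0, 30.0]
pred = [1.0, 4.0, 10.0, 2.0, 27.0]
var = [4.0, 4.0, 9.0, 4.0, 16.0]
me, mspe, msne = cv_diagnostics(obs, pred, var)
print(me, mspe, msne)
```

A mean squared normalized error well below 1 would suggest the kriging variances are too large on average, and a value well above 1 that they are too small.
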
>
> Edzer Pebesma wrote:
>> Hi Dave,
>>
>> Dave Depew wrote:
>>> Hi all,
>>> A question for the more experienced geostats users....
>>>
>>> I have a data set containing 2-3 variables relating to submerged 
>>> plant characteristics inferred from acoustic survey.
>>> The distribution of the % cover variable is bounded (0-100) and 
>>> highly right skewed (many 0's). The transect spacing is quite even, 
>>> and I can't seem to notice much difference between a run of ordinary 
>>> kriging and a variant of RK using a zero-inflated GLM of the % cover 
>>> residuals.
>>> None of the other co-variates show much correlation with the data 
>>> (i.e. bottom depth, x and y). Is this a possible reason why OK and 
>>> RK seem to give more or less the same predictions?
>> Well, yes, if there's not much of a trend, then RK will essentially 
>> simplify to OK.
>>>
>>> My second question relates to transformation of the target 
>>> variable... in this case, zero-inflated distributions are difficult 
>>> to transform. Is it really a requirement of kriging that the data be 
>>> transformed, or just that it will generally perform better with a 
>>> target variable with a distribution close to normal?
>>>
>> I believe the argument is along the following lines: kriging is the 
>> BLUP in any case, but in case the data are normally distributed 
>> (around the trend), the BLUP (or more exactly the BLP, simple 
>> kriging) coincides with the conditional expectation, making it the 
>> best possible predictor. In other cases, meaning when data are not 
>> normally distributed, it is still the best linear predictor, but it 
>> may very well be that there are other, better, non-linear predictors 
>> that give a result much closer to the best predictor under those 
>> circumstances.
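
To make the linear-predictor argument concrete, here is a small sketch of simple kriging (the BLP, with known mean) from two observations on a line. It is an illustration only: the exponential covariance, its sill and range, and the function names are assumptions chosen for the example. The same expression, mu + c0' C^-1 (z - mu), is the conditional mean when the field is multivariate Gaussian, which is why the two coincide in that case.

```python
from math import exp


def exp_cov(h, sill=1.0, range_par=10.0):
    """Exponential covariance (assumed model, no nugget)."""
    return sill * exp(-abs(h) / range_par)


def simple_kriging_2pt(x, z, x0, mu, sill=1.0, range_par=10.0):
    """Simple kriging at x0 from two observations z at locations x."""
    c11 = exp_cov(0.0, sill, range_par)          # variance at a data point
    c12 = exp_cov(x[0] - x[1], sill, range_par)  # data-to-data covariance
    c10 = exp_cov(x[0] - x0, sill, range_par)    # data-to-target covariances
    c20 = exp_cov(x[1] - x0, sill, range_par)
    # Solve the 2x2 system C w = c0 by Cramer's rule.
    det = c11 * c11 - c12 * c12
    w1 = (c11 * c10 - c12 * c20) / det
    w2 = (c11 * c20 - c12 * c10) / det
    # Linear predictor around the known mean, and its kriging variance.
    pred = mu + w1 * (z[0] - mu) + w2 * (z[1] - mu)
    var = c11 - (w1 * c10 + w2 * c20)
    return pred, var, (w1, w2)


x = [0.0, 10.0]
z = [2.0, 6.0]
at_data = simple_kriging_2pt(x, z, x0=0.0, mu=3.0)
midpoint = simple_kriging_2pt(x, z, x0=5.0, mu=3.0)
far_away = simple_kriging_2pt(x, z, x0=1000.0, mu=3.0)
print(at_data, midpoint, far_away)
```

At a data location the predictor reproduces the observation with zero variance, and far from the data it falls back to the mean with variance equal to the sill. None of this depends on normality; normality only determines whether a non-linear predictor could do better.
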
>>
>> If there is a transformation for that data that makes them 
>> multivariate Gaussian, then transforming and kriging on that scale is 
>> the way to go. A catch that has gotten very little attention is that 
>> transformation typically looks at marginal distributions, and not at 
>> multivariate distributions, the latter being pretty hard to check 
>> with only one realisation of the random field.
>>
>> Cressie's book is a good source for reading up on this; I lost my 
>> copy when I moved jobs in the spring.
>> -- 
>> Edzer
>>
>

-- 
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster,
Weseler Straße 253, 48151 Münster, Germany.  Phone: +49 251
8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de/