[R-sig-Geo] kriging

Thu Jul 3 20:02:36 CEST 2008

Hi All,
I had a look at the paper suggested by Edzer. Some of my data is of 
similar nature (% cover rather than direct counts; but many zeroes in 
the data set). It is difficult to assess the presence of trends with 
this variable due to 1) a limited number of covariates (i.e x,y, and 
depth) but also due to the nature of variation of the data (the 
distribution is quite patchy due to variations in substrate, exposure 
degree, etc). For example;

panel.hist <- function(x, ...)
 {
     usr <- par("usr"); on.exit(par(usr))
     par(usr = c(usr[1:2], 0, 1.5) )
     h <- hist(x, plot = FALSE)
     breaks <- h$breaks; nB <- length(breaks)
     y <- h$counts; y <- y/max(y)
     rect(breaks[-nB], 0, breaks[-1], y, col="cyan", ...)
 }
 pairs(fulmar, diag.panel=panel.hist)

shows that the variables of interest are highly left skewed for both 
coast and fulmar variables. I guess I'm confused as to why glm usage in 
this case was chosen as the trend predictor? Was it simply due to the 
limited covariates and the skewed distributions? If one looks as the 
residuals of the glm, they show some characteristics of non-homogeneity. 
Eg; plot(glm98)

Does the glm procedure use GLS to estimate the model as opposed to OLS? 
Please excuse my beginner level ignorance, but I was under the 
assumption that when there was a trend present, GLS should be used. I 
guess part of my confusion also stems from difficulty visualizing the 
"linear" model specified in the fulmar demo (i.e. fulmar ~ coast + 
depth)....there appears in the scatterplots to be only weak 
relationships between each of the variables, a case not unlike the data 
I am working with. I've tried variations of ordinary, and universal 
kriging and also "regression kriging" as discussed below...they all seem 
to give similar predictions of distribution and actual data values. I 
had avoided using the kriging variance due to my perhaps incorrect 
assumption that the variance computed on trend residuals was not optimal.

I guess it really depends on the way in which one models the data and 
any underlying trends...and that there does not appear to be a "gold 
standard" for data such as the fulmar data set?

Edzer Pebesma wrote:
> Hengl, T. wrote:
>> I agree with Paulo - gstat can work with any linear model including the transforms of the original predictors e.g.:
>>
>> Z ~ X + X^2 + Y + Y^2    etc.
>>
>> The problem is that gstat implements the so-called Kriging-with-external-trend algorithm to make predictions (see section 2.1 of my lecture notes), which is mathematically more elegant, but then it accepts only a family of linear models (and not GLMs, regreesion-trees etc.). I have been promoting the concept of regression-kriging (deterministic and stochastic predictions seperated), but we still did not implement it in any package so far.
>>   
> And I can see why, as there are quite a few problems still to solve 
> (afaik) ahead of you. When you cut the problem in two, do the 
> regression estimation and residual prediction in two separate 
> processes (often under different assumptions, e.g. wrt spatial 
> correlation) you ignore the correlation between the two. Finding a 
> prediction variance by naively adding the variances of the two 
> components e.g. does not yield zero variance at observation locations, 
> because a non-zero correlation is ignored. At other locations, this 
> correlation is also non-zero. Furthermore, if you cut the problem in 
> two for e.g. binomial or Poisson distributed cases, in this approach 
> you likely end up with negative predictions or predictions above one 
> for the binomial case.
>
> Does the paper you refer to (by yourself) give solutions to these two 
> problems?
>> You can at any time separate the predictions (e.g. krige only the residuals), but then gstat will not give you the regression-kriging variance, and you can not run geostatistical simulations.
>>   
> No, of course not, for the reasons mentioned above. The gstat approach 
> is: if you want to make a mess, please take responsibility for it by 
> yourself (and don't blame me--through the package). There is a paper I 
> did it with count data, though, which is
>
> E.J. Pebesma, R.N.M. Duin, P.A. Burrough, 2005. Mapping Sea Bird 
> Densities over the North Sea: Spatially Aggregated Estimates and 
> Temporal Changes. Environmetrics 16 
> <http://www3.interscience.wiley.com/cgi-bin/jissue/110577560>, (6), p 
> 573-587 <http://dx.doi.org/10.1002/env.723>.
>
> and (part of) the analysis is found in
>
> library(gstat)
> demo(fulmar)
>
> I'm also confused by this term "regression kriging". Would you claim 
> that the universal kriging/kriging with (one or more) external drifts 
> implemented by gstat is not regression kriging? Are you actually 
> working on a package that does do regression kriging as you define it?
> --
> Edzer
>
>> see also:
>> https://stat.ethz.ch/pipermail/r-sig-geo/2008-February/003174.html
>>
>>
>> All the best,
>>
>> Tom Hengl
>> http://spatial-analyst.net
>>
>> Hengl, T., 2007. A Practical Guide to Geostatistical Mapping of
>> Environmental Variables. EUR 22904 EN Scientific and Technical Research
>> series, Office for Official Publications of the European Communities,
>> Luxemburg, 143 pp.
>> http://bookshop.europa.eu/uri?target=EUB:NOTICE:LBNA22904:EN:HTML
>>
>>
>> -----Original Message-----
>> From: r-sig-geo-bounces at stat.math.ethz.ch on behalf of Dave Depew
>> Sent: Mon 6/16/2008 10:54 PM
>> To: Paulo Justiniano Ribeiro Jr
>> Cc: r-sig-geo at stat.math.ethz.ch
>> Subject: Re: [R-sig-Geo] kriging
>>  
>> Ok,
>> What about higher order polynomials? I have fitted one using a gam to 
>> the data which which helps to normalize the residuals, and reduce the 
>> variance of the residuals.
>> Is it simply a matter of plugging in the function into the gstat command 
>> line? Or is it simpler to krig the residuals and then add the trend back 
>> to the interpolated residual grid?
>>
>>
>> Paulo Justiniano Ribeiro Jr wrote:
>>   
>>> Dave,
>>>
>>> what is necessary for UK is a relation expressed by a linear model, not
>>> necessaraly a linear relation between the variables.
>>> e.g. you could have a second degree polinomial and still work within the
>>> scope of universal kriging.
>>>
>>>
>>> On Mon, 16 Jun 2008, Dave Depew wrote:
>>>
>>>   
>>>     
>>>> Hi all,
>>>> I have a data set that I would like to krige to interpolate between
>>>> transects. There is a non-linear trend between two of the variables...my
>>>> impression from reading the gstat help file is that there must be a
>>>> linear relationship between the data to use universal kriging?
>>>> Second, would a method of non-linear regression followed by modelling of
>>>> the residuals with a semivariogram be an appropriate solution?
>>>>
>>>> Thanks,
>>>>
>>>> Dave
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>     
>>>>       
>>> Paulo Justiniano Ribeiro Jr
>>> LEG (Laboratorio de Estatistica e Geoinformacao)
>>> Universidade Federal do Parana
>>> Caixa Postal 19.081
>>> CEP 81.531-990
>>> Curitiba, PR  -  Brasil
>>> Tel: (+55) 41 3361 3573
>>> Fax: (+55) 41 3361 3141
>>> e-mail: paulojus AT  ufpr  br
>>> http://www.leg.ufpr.br/~paulojus
>>>
>>>
>>>
>>>
>>>     
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>>
>>
>> 	[[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>   
>
> -- 
> Edzer Pebesma
> Institute for Geoinformatics (IfGI)
> University of Münster
> http://ifgi.uni-muenster.de/
>