[R-sig-Geo] Kriging with uncertain data

Tue Oct 11 12:34:58 CEST 2016

Hi,

Thank you for your suggestions. I gave the gstat with weights argument a try, but I’m not sure about the results / my implementation.

I have worked out an example with the meuse dataset so I can share it with you:

library(gstat)
library(sp)

data(meuse)
coordinates(meuse) <- ~x+y

data(meuse.grid)
gridded(meuse.grid) = ~x+y

For a starting, let’s fit a standard universal kriging model:

# model 1: UK

v <- variogram(log(zinc)~sqrt(dist)+x+y, meuse)
var1 <- fit.variogram(v, vgm(1, "Sph", 700, 1))

m1 <- gstat(formula=log(zinc)~sqrt(dist), data=meuse, id='log_zinc',
            model=var1)
predm1 <- predict(m1, meuse.grid)

summary(predm1 at data)

Let’s now simulate some measurement variances. I will assume the standard error to be uniformly distributed, with values between 0.2 and 0.4 times the measured values: 

meuse$var <- (meuse$zinc*runif(155,0.1,0.2))^2

plot(meuse$var~meuse$zinc)
plot(log(meuse$var)~log(meuse$zinc))

Let’s now fit a weighted UK model, using measurement precisions (= 1/variance) as weights:

# model 2: weighted UK

m2 <- update(m1, weights=1/log(meuse$var))
predm2 <- predict(m2, meuse.grid)

If we compare the results of both models, we shall see that the variability of the predictions has shrinked a bit in model 2, while the kriging variances have increased:

summary(predm1 at data)
summary(predm2 at data)

The spatial distribution of the variances has also changed, reflecting the spatial distribution of the measurement variances.

All good so far: the uncertainty of the measurements has influenced the kriging model, and now we obtain slightly less certain krigged values. But this does not answer my question, since we have not propagated the measurement uncertainty to the estimated field. You can see this by comparing the magnitudes of model 2 kriging variances to the measurement variances:

summary(log(meuse$var))
summary(predm2 at data$log_zinc.var)

Ákos’ suggestion, i.e. to krige the measured values and their variances separately, is a good suggestion. We can still use the weights argument to krige the observations, and we shall get separated measurement and kriging variances, which is nice. If we suspect that there is a relation between the measured values and their errors (as it is the case in my simulated example), we could even use the trigged surface of the measurement as a coverable; I suppose. 

What do you think about that approached? Does it make sense, is it statistically correct regarding the kriging assumptions?

Also, I am a bit worried about the semivariogram model fitted, since there is no way that I have found to incorporate the error variances of the measurements. There is an Err argument to the vim function in gstat, but as I understand it a single error variance for the whole spatial field is expected, i.e. it does not work with spatially varying errors.

Cheers,

Santiago

> El 7 oct 2016, a las 10:37, Edzer Pebesma <edzer.pebesma at uni-muenster.de> escribió:
> 
> If the only problem is to krige these data, the solution is pretty
> trivial; add a location specific value to the nugget; this is what
> Delhomme in 1978 coined as regression kriging [1] (kriging of regressed
> rather than observed values, using estimates + estimation errors).
> 
> An implementation is found in gstat, look up argument "weights" in
> ?gstat; you can use this argument in gstat::krige
> 
> Trickier is to infer the variogram of the underlying, unobserved
> stationary variable from your estimates + estimation errors, in
> particular when these estimation errors are rather large and/or vary
> strongly. Anyone knows a good ref to a paper that tackles that issue?
> 
> 
> [1] Delhomme, J. P. "Kriging in the hydrosciences." Advances in water
> resources 1.5 (1978): 251-266.
> 
> On 07/10/16 10:27, Santiago Beguería wrote:
>> Dear Ákos,
>> 
>> I was referring to the former: I have data with two values at each location: measured value and uncertainty of the measurement. So, each observation is in fact a statistical variate, which we can assume is Gaussian distributed. Hence, my two values are the expected (mean) and the variance of the distribution.
>> 
>> Cheers,
>> 
>> Stg
>> 
>> 
>>> El 7 oct 2016, a las 8:39, Bede-Fazekas Ákos <bfalevlist at gmail.com> escribió:
>>> 
>>> Dear Santiago,
>>> 
>>> you mean you have two values at each location (observed value and uncertainty)? Or you have an observed value that is the sum of the real value and the observation error (uncertainty). If the last, then I think using the gstat::krige() function is straightforward, since the result of the function contains the variance of the prediction ("Attributes columns contain prediction and
>>> prediction variance"; https://cran.r-project.org/web/packages/gstat/gstat.pdf).
>>> 
>>> HTH,
>>> Ákos Bede-Fazekas
>>> Hungarian Academy of Sciences
>>> 
>>> 
>>> 
>>> 2016.10.06. 11:52 keltezéssel, Santiago Beguería írta:
>>>> Dear R-sig-geo list members,
>>>> 
>>>> I am curious about what are sensible approaches to spatial interpolation, most especially by using kriging, in the context of uncertain data.
>>>> 
>>>> Suppose one has a dataset of values observed at different locations, and each value consists on the expected value and its variance. Variance here represents the uncertainty related to the observation, and shows spatial variation due to external factors, for instance the geological setting affecting the quality of the measurement.
>>>> 
>>>> How would you proceed to model the spatial distribution of this variable, including propagation of the (spatially varying)?
>>>> 
>>>> I suppose one approach could be by simulation, but at there other ways of propagating the uncertainty that do not involve potentially expensive (in computation time) simulation approaches?
>>>> 
>>>> Cheers,
>>>> 
>>>> Santiago Beguería
>>>> CSIC
>>>> Spain
>>>> 
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>> 
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> 
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> 
> 
> -- 
> Edzer Pebesma
> Institute for Geoinformatics  (ifgi),  University of Münster
> Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
> Journal of Statistical Software:   http://www.jstatsoft.org/
> Computers & Geosciences:   http://elsevier.com/locate/cageo/
> 
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

	[[alternative HTML version deleted]]