[R-sig-Geo] Large Prediction Variances with gstat

Edzer Pebesma edzer.pebesma at uni-muenster.de
Fri Mar 1 02:22:21 CET 2013


predict.lm reports the standard error of the predicted mean (se.fit),
whereas gstat reports the prediction variance for a single observation,
which also includes the residual variance and is identical to

 > summary(pred.variance) + lm.zn.pred$residual.scale^2
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.224297 0.231328 0.233698 0.236858 0.240198 0.301258
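
As a minimal sketch of that identity in base R only: the variance for a
single new observation is se.fit^2 plus residual.scale^2, and it is the
same variance that predict.lm uses for interval="prediction". The
simulated data and the names d, new, var.mean, var.obs and var.from.pi
are assumptions made for illustration; they are not part of the meuse
example.

```r
# Simulated data (an assumption for illustration, not the meuse data)
set.seed(1)
n <- 100
d <- data.frame(x = runif(n), y = runif(n))
d$z <- 1 + 2 * d$x - d$y + rnorm(n, sd = 0.5)
new <- data.frame(x = runif(5), y = runif(5))

fit <- lm(z ~ x + y, data = d)
p <- predict(fit, new, se.fit = TRUE)

# Variance of the predicted mean, and of a single new observation
var.mean <- p$se.fit^2
var.obs <- var.mean + p$residual.scale^2

# Recover the same variance from the 95% prediction interval half-width
pi95 <- predict(fit, new, interval = "prediction", level = 0.95)
half.width <- (pi95[, "upr"] - pi95[, "lwr"]) / 2
var.from.pi <- (half.width / qt(0.975, df = p$df))^2

print(cbind(var.mean, var.obs, var.from.pi))
```

The last two columns should agree, which is the single-observation
variance that gstat reports for point predictions.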

If you ask gstat to make predictions for blocks, using linear 
regression, it predicts the mean instead of individual observations, 
which is equivalent to kriging with a pure nugget effect:

 > gstat.lm.zn.pred <- krige(log(zinc)~x+y+elev, dat.mod,
 +                           newdata=dat.pred, block=1)
[ordinary or weighted least squares prediction]
 > summary(gstat.lm.zn.pred@data) #gstat results
    var1.pred        var1.var
  Min.   :4.061   Min.   :0.006409
  1st Qu.:4.974   1st Qu.:0.013438
  Median :5.335   Median :0.015809
  Mean   :5.303   Mean   :0.018970
  3rd Qu.:5.618   3rd Qu.:0.022309
  Max.   :6.262   Max.   :0.083373
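
The two comparisons can be put side by side in one script. This is a
sketch that assumes the sp and gstat packages are installed; the names
point.pred, block.pred, var.mean and var.obs are introduced here for
illustration and do not appear in the original example.

```r
library(sp)     # meuse data and coordinates()
library(gstat)  # krige()

data(meuse)
meuse.ns <- meuse            # non-spatial copy, keeps the x and y columns
coordinates(meuse) <- ~x+y   # promote to a spatial object

dat.mod <- meuse[1:100, ]
dat.pred <- meuse[101:155, ]

# Base R linear model on the same split
lm.zn <- lm(log(zinc) ~ x + y + elev, data = meuse.ns[1:100, ])
lm.zn.pred <- predict(lm.zn, meuse.ns[101:155, ], se.fit = TRUE)
var.mean <- unname(lm.zn.pred$se.fit^2)               # variance of the mean
var.obs <- var.mean + lm.zn.pred$residual.scale^2     # single observation

# gstat without a variogram model: point support and block support
point.pred <- krige(log(zinc) ~ x + y + elev, dat.mod, newdata = dat.pred)
block.pred <- krige(log(zinc) ~ x + y + elev, dat.mod, newdata = dat.pred,
                    block = 1)

# Point variances should match var.obs, block variances var.mean
print(all.equal(point.pred$var1.var, var.obs, tolerance = 1e-6))
print(all.equal(block.pred$var1.var, var.mean, tolerance = 1e-6))
```

The difference between the point and block variances is the residual
variance of the linear model, which is the same for every location.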


On 02/28/2013 09:20 PM, Jesse Berman wrote:
> Hi All,
>
> First time post, so please excuse any omissions/confusion.  I am fitting
> a series of prediction models using gstat and discovered that the prediction
> variances of OLS models for spatially dependent data were larger than those
> of kriging models.  This runs counter to the expectation that
> treating spatially dependent data as IID will result in artificially
> shrunken prediction variances.  To better understand how gstat
> treats OLS predictions, I reproduced an OLS prediction with the base package
> ('predict' and 'predict.lm') and found that while I got identical betas, I
> got substantially higher variances with gstat.
>
> Can anyone shed some light as to why gstat might be giving these larger
> prediction variances when performing an OLS model?  (see reproducible
> example below)
>
> Regards and thanks for the help,
> Jesse
>
> library(gstat)
> data(meuse)
> coordinates(meuse) = ~x+y
> meuse.ns<-as.data.frame(meuse) #non-spatial Meuse data
>
> #Data sets for modeling and prediction; spatial and non-spatial
> dat.mod<-meuse[1:100,]
> dat.pred<-meuse[101:155,]
> dat.ns.mod<-meuse.ns[1:100,]
> dat.ns.pred<-meuse.ns[101:155,]
>
> #Linear Model Prediction with base package
> lm.zn<-lm(log(zinc)~x+y+elev, data=dat.mod)
> lm.zn.pred<-predict(lm.zn, dat.pred, se.fit=TRUE)
> pred.variance<-(lm.zn.pred$se.fit)^2
>
> #Linear Model Prediction with gstat
> gstat.lm.zn.pred<-krige(log(zinc)~x+y+elev, dat.mod, newdata=dat.pred)
>
> #Compare Results
> summary(gstat.lm.zn.pred@data) #gstat results
> summary(lm.zn.pred$fit)
> summary(pred.variance) #base model prediction variance
>

-- 
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics      e.pebesma at wwu.de


