[R-sig-Geo] Large Prediction Variances with gstat
Edzer Pebesma
edzer.pebesma at uni-muenster.de
Fri Mar 1 02:22:21 CET 2013
predict.lm gives prediction errors for the mean, where gstat gives
prediction errors for single observations, identical to
> summary(pred.variance) + lm.zn.pred$residual.scale^2
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.224297 0.231328 0.233698 0.236858 0.240198 0.301258
If you ask gstat to make predictions for blocks, using linear
regression, it predicts the mean instead of individual observations,
which is equivalent to kriging with a pure nugget effect:
> gstat.lm.zn.pred<-krige(log(zinc)~x+y+elev, dat.mod,
newdata=dat.pred, block=1)
[ordinary or weighted least squares prediction]
> summary(gstat.lm.zn.pred at data) #gstat results
var1.pred var1.var
Min. :4.061 Min. :0.006409
1st Qu.:4.974 1st Qu.:0.013438
Median :5.335 Median :0.015809
Mean :5.303 Mean :0.018970
3rd Qu.:5.618 3rd Qu.:0.022309
Max. :6.262 Max. :0.083373
On 02/28/2013 09:20 PM, Jesse Berman wrote:
> Hi All,
>
> First time post, so please excuse any omissions/confusion. I am performing
> a series of prediction models using gstat and discovered that prediction
> variance of spatially dependent data with OLS models was larger than those
> of kriging models. This is counter-intuitive to the assumption that
> treating spatially dependent data as IID will result in artificially
> shrunken prediction variances. Anyhow, to better understand how gstat
> treats OLS predictions, I reproduced an OLS prediction with the base package
> ('predict' and 'predict.lm') and found that while I got identical beta's, I
> got substantially higher variances with gstat.
>
> Can anyone shed some light as to why gstat might be giving these larger
> prediction variances when performing an OLS model? (see reproducible
> example below)
>
> Regards and thanks for the help,
> Jesse
>
> library(gstat)
> data(meuse)
> coordinates(meuse) = ~x+y
> meuse.ns<-as.data.frame(meuse) #non-spatial Meuse data
>
> #Data sets for modeling and prediction; spatial and non-spatial
> dat.mod<-meuse[1:100,]
> dat.pred<-meuse[101:155,]
> dat.ns.mod<-meuse.ns[1:100,]
> dat.ns.pred<-meuse.ns[101:155,]
>
> #Linear Model Prediction with base package
> lm.zn<-lm(log(zinc)~x+y+elev, data=dat.mod)
> lm.zn.pred<-predict(lm.zn, dat.pred, se.fit=TRUE)
> pred.variance<-(lm.zn.pred$se.fit)^2
>
> #Linear Model Prediction with gstat
> gstat.lm.zn.pred<-krige(log(zinc)~x+y+elev, dat.mod, newdata=dat.pred)
>
> #Compare Results
> summary(gstat.lm.zn.pred at data) #gstat results
> summary(lm.zn.pred$fit)
> summary(pred.variance) #base model prediction variance
>
>
>
>
> --
> View this message in context: http://r-sig-geo.2731867.n2.nabble.com/Large-Prediction-Variances-with-gstat-tp7582882.html
> Sent from the R-sig-geo mailing list archive at Nabble.com.
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics e.pebesma at wwu.de
More information about the R-sig-Geo
mailing list