[R-sig-Geo] gstat cross validation for accuracy ordinary kriging vs IDW

Sun May 25 10:17:51 CEST 2014

On Sun, 25 May 2014, rubenfcasal wrote:

> Hi Moshood,
>
>     I think your approach is wrong.
>
>     Cross-validation is a way to diagnose if your fitted
> model/procedure is correct (you cross-validate the model, not the
> predictions). You can use cross-validation to estimate the prediction
> error. The prediction error is the expected error when you use the
> model/method to predict the response on a new observation, one that was
> not used in estimating/fitting the model. It has not to be confused with
> the observed error (nor with other estimate that you can compute).
>
>     The idea behind CV is to repeatedly delete some of the data (one
> observation in leave-one-out cross-validation, the standard procedure)
> and use the remaining data to predict the deleted observations. Then the
> prediction error can be estimated from the differences between
> predictions and true values.
>
>     You can use krige.cv() for CV a kriging model. I think you will
> have to write the appropriate code (calling idw() repeatedly)  for
> inverse distance weighting.

I believe you can use gstat.cv() for an IDW model:

library(gstat)
library(sp)
data(meuse)
coordinates(meuse) <- ~x+y
IDW1n_fit <- gstat(id="IDW1n_fit", formula = log(zinc) ~ 1, data = meuse,
   nmax=12, set=list(idp=1))
pe1 <- gstat.cv(IDW1n_fit, debug.level=0, random=FALSE)
str(pe1) #to see what inside
sqrt(mean(pe1$residual^2)) # RMS prediction error

Then you see differences if you change for example nmax= or idp=.

Roger

>
> Best regards,
>     Ruben Fernandez-Casal
>
>
> El 21/05/2014 21:15, Moshood Agba Bakare escribi?:
>> Hi all,
>>
>> I have been having a couple of challenge with my analysis. I have
>> irregularly space spatial yield monitor data over four years. Pooling this
>> data together is not feasible because of misalignment. That is, the
>> coordinates of data point varies from one year to the other.
>> I  created a common regular interpolation grid for each year with the same
>> grid size of 10 x 10 m. I am able to get interpolated value for each point
>> using ordinary kriging and inverse distance weighting method (IDW). Please
>> how I cross validate this two interpolation methods to know which one give
>> me the best estimate.
>>
>> The problem I notice is that there is no observe value in each
>> interpolation point to assess the prediction accuracy of these methods.
>> Please what do I do? see my script below. I correlated the interpolated
>> values from the two methods. They are highly correlated (r=0.98). How do I
>> know which  method gave good prediction?
>>
>> grid <- expand.grid(easting=seq(from = 299678, to = 301299, by=10),
>>                      northing=seq(from = 5737278, to = 5738129, by=10))
>>
>> ## convert the grid to SpatialPixel class to indicate gridded spatial data
>> coordinates(grid)<-~easting+northing
>> proj4string(grid)<-CRS("+proj=utm +zone=12 +ellps=WGS84 +datum=WGS84
>> +units=m +no_defs +towgs84=0,0,0")
>>
>> gridded(grid)<- TRUE
>>
>>
>> #### Ordinary kriging to create kriging prediction
>> orkrig <- krige(yield ~ 1, canmod.sp, newdata = grid, model=exp.mod,nmax=20)
>>
>>
>>
>> ## Inverse Distance Weighting (IDW) Interpolation method maxdist=16.5
>> idw1 = idw(yield~1, canmod.sp, newdata=grid,nmax=20,idp=1)
>>
>> Thanks while looking forward to reading from you.
>>
>> 	[[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
>
> 	[[alternative HTML version deleted]]
>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no