[R-sig-Geo] regression kriging, newdata parameter

Tomislav Hengl hengl at spatial-analyst.net
Thu Feb 24 13:53:59 CET 2011


So you have values of 'sm' (target) and 'res' (covariate) at all 
sampling locations. Now you would like to predict at some other 
locations? You obviously have to have values of 'res' at those locations 
(This is why you get an error in gstat)

"krige" function needs either a SpatialGrifDataFrame or a 
SpatialPointsDataFrame, so the prediction locations do not necessarily 
have to be on a regular grid. The issue is that covariates need to be 
available at prediction locations.

Technically speaking, you could also estimate values of predictors using 
some kind of resampling method (e.g. splines), but then you would use 
estimated values (covariate) to estimate target variable, which does 
feel a bit awkward (but is doable).

PS: gstat implements the Kriging with External Drift method 
(http://dx.doi.org/10.1111/j.1467-9671.2006.01015.x), but this should be 
equivalent to RK (regression and kriging separately, then sum the maps).

HTH

T. Hengl
http://www.wewur.wur.nl/popups/vcard.aspx?id=HENGL001

Op 23-2-2011 20:51, giuseppe calamita schreef:
> Hi,
> I want to perform spatial interpolation of soil moisture (here named 'sm')
> using regression kriging (RK) following the example given by Tom Hengl
> http://spatial-analyst.net/wiki/index.php?title=Spatial_prediction_of_soil_moisture)
>   and compare results with OK.
>
> The covariate is the soil electrical resistivity (her named 'res') and is stored
> in the same data.frame of the target variable.
> sm and res data were collected on a regular grid 200x60 m with a 5m sampling
> step and the coordinates are given in ED50, UTM zone33. CRS.
> Searching on the sig-Geo archive i discovered  that other peoples had more or
> less the same problem as mine, but there are some differences.
> All the examples i read used the meuse.grid data set which  already stores
> covariates (dist, soils, ffreq...) and coordinates at an  higher spatial
> resolution than the target variable sample. This is not  my case. because res is
> sampled at the points of sm.
> I had to define the grid by myself.
>
> #read data
>> casa032009<- read.csv("C:/data/morfeo/morfeo_data/casa_032009.csv")
>> str(casa032009)
> 'data.frame':   533 obs. of  9 variables:
>   $ objectid: int  489 490 491 492 493 494 495 496 1 2 ...
>   $ row     : int  185 180 175 170 165 160 155 150 195 190 ...
>   $ col     : Factor w/ 13 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 2 2 ...
>   $ key     : Factor w/ 533 levels "A0","A10","A100",..: 21 20 19 18 17 16 15 14
> 64 63 ...
>   $ ka      : num  23.4 19 14.4 14.5 16.6 16.8 14.2 14.8 16.7 21.5 ...
>   $ sm      : num  37.3 32 26.7  26.7 29 29.3 26.5 27.1 29.2 35.2 ...
>   $ x       : num  288097 288102 288107 288112 288117 ...
>   $ y       : num  4754051 4754051 4754052 4754052 4754053 ...
>   $ res     : num  55.1 56.3 63 62.9 57.2 ...
> #explicit spatial object
>> coordinates(casa032009)<- c("x", "y")
> # divide the data set in calibration and  validation groups
>    >  dim(casa032009 at data)[1]/3
>     >valid.id<- sample(1:dim(casa032009 at data)[1],
> 180)
>
>     >casa032009.val<- casa032009[valid.id,]  #valiadtion data set
>     >str(casa032009.val)
>      >casa032009.cal<-  casa032009[!is.element(1:dim(casa032009 at data)[1],
> valid.id),]   #calibration data set, almost 350 points
>     >str(casa032009.cal)
> #regression model
>    >lm.casa<-  lm(sm ~ res, data= as.data.frame(casa032009.cal))
>     >summary(lm.casa)
> #variogram analysis
>> v.rk<- variogram(residuals(lm.casa)~1, casa032009.cal)
>    >  plot(v.rk, pl=T)
>     >vm.rk<- vgm(psill=16, "Exp", range=9, nugget=2) #eye-ball fit
>     >plot(v.rk, pl=T, model=vm.rk)
>     >vmf.rk<- fit.variogram(v.rk, vm.rk) #statistical fit
>     >vmf.rk
>    >plot(v.rk, pl=T, model=vmf.rk)
>> bbox(casa032009)
>> (min.x<- bbox(casa032009)["x","min"]-2)
>> (min.y<- bbox(casa032009)["y","min"]-1.75)
>> (max.x<- bbox(casa032009)["x","max"]+2)
>> (max.y<- bbox(casa032009)["y","max"]+1.75)
>> (cells.x<- (max.x - min.x)/4)
>> (cells.y<- (max.y - min.y)/4)
>> casa.grid<- (GridTopology(c(min.x+2.5, min.y+2.5), c(5, 5), c(cells.x,
>> cells.y)))
>> casa.grid<- SpatialGrid(casa.grid)
>> str(casa.grid)
>
> sm.rk<- krige(sm ~ res, location=casa032009.cal, newdata=casa.grid,
> model=vmf.rk)
>
> The problem is : I understand the 'newdata' parameter of the krige function
> needs a SpatialGrifDataFrame containig res data at each grid node.  I don't know
> how create a SpatialGridDataFrames storing res data on points that are different
> from the ones that were sampled.
>
> Thanks for any help.
>
> Giuseppe
>
>
>
> 	[[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo



More information about the R-sig-Geo mailing list