[R-sig-Geo] Spatial random forest prediction: Error when predicting at unseen locations at finer spatial scale
Nikolaos Tziokas
n|ko@@tz|ok@@ @end|ng |rom gm@||@com
Wed Feb 22 21:36:21 CET 2023
I am using the package spatialRF in R for a spatial random forest
regression (SRFR) task. I have one response variable and 4 predictors and I
am performing SRFR at a coarse spatial scale. My goal is to take the model
parameters and apply them to a finer spatial resolution in order to predict
the response variable at the finer spatial scale.
When I run
p <- stats::predict(object = model.spatial, # name of the spatialRF model
                    data = s, # data.frame containing the predictors at the fine spatial scale (without NaN values)
                    type = "response")$predictions
I am getting this error: Error in predict.ranger.forest(forest, data,
predict.all, num.trees, type,: Error: One or more independent variables not
found in data.
I have checked the column names of s and my original data.frame (the one I
used to build the model at the coarse scale) and they are the same. How can
I use the model I created at the coarse scale to predict the response
variable at a finer spatial scale?
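One way to narrow this down is to compare the new data against the variable names stored in the fitted model, rather than against the original data.frame, because rf_spatial() adds generated spatial predictors to the training data. The following is a minimal sketch in base R. It assumes the fitted ranger forest exposes its predictor names as model.spatial$forest$independent.variable.names; the helper missing_predictors() and the toy column names are hypothetical and only illustrate the check.

```r
# Hypothetical helper: predictors the model expects but newdata does not have.
missing_predictors <- function(model_vars, newdata) {
  setdiff(model_vars, colnames(newdata))
}

# Toy stand-ins (illustrative names only, not taken from the real data).
model_vars <- c("pop", "lc", "spatial_predictor_0_1")
newdata <- data.frame(pop = c(1.2, 3.4), lc = c(1, 2))

# Lists every expected predictor that is absent from newdata.
missing_predictors(model_vars, newdata)

# With the real objects this would be (assumption about the object layout):
# missing_predictors(model.spatial$forest$independent.variable.names, s)
```

If the check returns names that were never columns of the original data.frame, the missing variables were created during model fitting and not read from the CSV.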
Here is the code:
library(spatialRF)
library(stats)
wd = "path/"
block.data = read.csv(paste0(wd, "block.data.csv")) # coarse resolution
#names of the response variable and the predictors
dependent.variable.name <- "ntl"
predictor.variable.names <- colnames(block.data)[4:7]
#coordinates of the cases
xy <- block.data[, c("x", "y")]
block.data$x <- NULL
block.data$y <- NULL
#distance matrix (computed from the coordinates, since x and y were removed from block.data)
distance.matrix <- as.matrix(dist(xy))
min(distance.matrix)
max(distance.matrix)
#distance thresholds (same units as distance.matrix)
distance.thresholds <- c(0, 20, 50, 100, 200, 500)
#random seed for reproducibility
random.seed <- 456
#creating and registering the cluster
local.cluster <- parallel::makeCluster(
  parallel::detectCores() - 1,
  type = "PSOCK")
doParallel::registerDoParallel(cl = local.cluster)
# fitting a non-spatial Random Forest
model.non.spatial <- spatialRF::rf(
  data = block.data,
  dependent.variable.name = dependent.variable.name,
  predictor.variable.names = predictor.variable.names,
  distance.matrix = distance.matrix,
  distance.thresholds = distance.thresholds,
  xy = xy,
  seed = random.seed,
  verbose = FALSE)
# Fitting a spatial model with rf_spatial()
model.spatial <- spatialRF::rf_spatial(
  model = model.non.spatial,
  method = "mem.moran.sequential",
  verbose = FALSE,
  seed = random.seed)
#stopping the cluster
parallel::stopCluster(cl = local.cluster)
# prediction at a finer spatial scale
s = read.csv(paste0(wd, "s.csv")) # df containing the predictors at fine scale
p <- stats::predict(object = model.spatial,
                    data = s,
                    type = "response")$predictions
I tried solutions like:
levels(s$lc) <- levels(block.data$lc)
in case I had missing land cover types in the lc column between the spatial
scales, but it didn't work.
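A note on that attempt: assigning to levels() relabels the existing levels by position and does not match values, and it only applies when the column is a factor (read.csv() does not create factors by default in R >= 4.0). The following sketch aligns the categories by value instead. The land cover labels are toy values, not taken from the real data.

```r
# Toy land cover categories (illustrative only).
coarse_lc <- factor(c("forest", "urban", "water"))
fine_lc <- c("urban", "forest", "urban")  # character, as read.csv() would return

# Re-encode the fine-scale values against the coarse-scale levels, matching by value.
fine_aligned <- factor(fine_lc, levels = levels(coarse_lc))
levels(fine_aligned)
as.character(fine_aligned)

# Values that do not occur in the coarse-scale levels would become NA.
anyNA(fine_aligned)

# By contrast, assigning levels relabels by position and can silently swap labels.
fine_relabelled <- factor(fine_lc)               # levels: "forest", "urban"
levels(fine_relabelled) <- c("urban", "forest")
as.character(fine_relabelled)
```

In the last step the values come back as "forest", "urban", "forest", which no longer match the input. This is why matching by value with factor(..., levels = ...) is usually the safer choice.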
From here
<https://drive.google.com/drive/folders/1KhnQEajpSKh59XuWkxTZcc_2YxPcxYW7?usp=sharing>
you can download the two data.frames.