[R-sig-Geo] Spatial random forest prediction: Error when predicting at unseen locations at finer spatial scale

Marcelino de la Cruz Rot m@rce||no@de|@cruz @end|ng |rom urjc@e@
Thu Feb 23 08:39:38 CET 2023


Hi Nikolaos:

As you can read on the GitHub page of the package
(https://blasbenito.github.io/spatialRF/),

*"there are many things this package cannot do*:

  * Predict model results over raster data.
  * Predict a model result over another region with a different spatial
    structure.
  * ..."

More specifically, the error you report is probably caused by the many
spatial predictors derived from the distance matrix: they are included
in the fitted model, but you are not passing them to "predict".
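
A minimal sketch of how such a mismatch could be checked, using a toy
ranger model rather than the actual spatialRF model. The data and the
column name "sp_pred_1" are made up for illustration, and it is only an
assumption that the object returned by rf_spatial() keeps the usual
ranger "forest" slot, so treat this as a starting point:

```r
library(ranger)

set.seed(456)

# Toy stand-in for the coarse-scale training data. `sp_pred_1` is a
# hypothetical column playing the role of a spatial predictor that was
# added to the training data when the spatial model was fitted.
train <- data.frame(
  ntl = rnorm(50),
  a = rnorm(50),
  b = rnorm(50),
  sp_pred_1 = rnorm(50)
)
fit <- ranger(ntl ~ ., data = train, num.trees = 50)

# Fine-scale data that contains only the original predictors
s <- data.frame(a = rnorm(10), b = rnorm(10))

# Variables the forest expects but that are absent from the new data
missing.vars <- setdiff(fit$forest$independent.variable.names, colnames(s))
print(missing.vars)
```

If the same comparison on the real model returns any names, those are
the columns that predict() cannot find in the new data.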


Cheers,
Marcelino


On 22/02/2023 at 21:36, Nikolaos Tziokas wrote:
> I am using the package spatialRF in R for a spatial random forest
> regression (SRFR) task. I have one response variable and 4 predictors and I
> am performing SRFR at a coarse spatial scale. My goal is to take the model
> parameters and apply them to a finer spatial resolution in order to predict
> the response variable at the finer spatial scale.
>
> When I run
>
> # object: name of the spatialRF model
> # data: data.frame containing the predictors at the fine spatial scale
> #       (without NaN values)
> p <- stats::predict(object = model.spatial,
>                      data = s,
>                      type = "response")$predictions
>
> I am getting this error:
>
> Error in predict.ranger.forest(forest, data, predict.all, num.trees, type,:
>   Error: One or more independent variables not found in data.
>
> I have checked the column names of s and my original data.frame (the one I
> used to build the model at the coarse scale) and they are the same. How can
> I use the model I created at the coarse scale to predict the response
> variable at a finer spatial scale?
>
> Here is the code:
>
> library(spatialRF)
> library(stats)
>
> wd = "path/"
>
> block.data = read.csv(paste0(wd, "block.data.csv")) # coarse resolution
>
> #names of the response variable and the predictors
> dependent.variable.name <- "ntl"
> predictor.variable.names <- colnames(block.data)[4:7]
>
> #coordinates of the cases
> xy <- block.data[, c("x", "y")]
>
> block.data$x <- NULL
> block.data$y <- NULL
>
> #distance matrix between the case coordinates
> distance.matrix <- as.matrix(dist(xy))
> min(distance.matrix)
> max(distance.matrix)
>
> #distance thresholds (same units as distance_matrix)
> distance.thresholds <- c(0, 20, 50, 100, 200, 500)
>
> #random seed for reproducibility
> random.seed <- 456
>
> #creating and registering the cluster
>      local.cluster <- parallel::makeCluster(
>        parallel::detectCores() - 1,
>        type = "PSOCK")
>      doParallel::registerDoParallel(cl = local.cluster)
>
> # fitting a non-spatial Random Forest
> model.non.spatial <- spatialRF::rf(
> data = block.data,
> dependent.variable.name = dependent.variable.name,
> predictor.variable.names = predictor.variable.names,
> distance.matrix = distance.matrix,
> distance.thresholds = distance.thresholds,
> xy = xy,
> seed = random.seed,
> verbose = FALSE)
>
> # Fitting a spatial model with rf_spatial()
> model.spatial <- spatialRF::rf_spatial(
>    model = model.non.spatial,
>    method = "mem.moran.sequential",
>    verbose = FALSE,
>    seed = random.seed)
>
> #stopping the cluster
> parallel::stopCluster(cl = local.cluster)
>
> # prediction at a finer spatial scale
> s = read.csv(paste0(wd, "s.csv")) # data.frame containing the predictors at the fine scale
>
> p <- stats::predict(object = model.spatial,
>                      data = s,
>                      type = "response")$predictions
>
> I tried solutions like:
>
> levels(s$lc) <- levels(block.data$lc)
>
> in case I had missing land cover types in the lc column between the spatial
> scales, but it didn't work.
>
> From here
> <https://drive.google.com/drive/folders/1KhnQEajpSKh59XuWkxTZcc_2YxPcxYW7?usp=sharing>
> you can download the two data.frames.


-- 
Marcelino de la Cruz Rot
Depto. de Biología y Geología
Física y Química Inorgánica
Universidad Rey Juan Carlos
Móstoles España