[R-sig-Geo] Parallel predict now in spatial.tools

Jonathan Greenberg jgrn at illinois.edu
Wed Mar 19 04:14:23 CET 2014


R-sig-geo'ers:

I finally got around to building a parallel predict function,
"predict_rasterEngine", which is included in version 1.3.7 (or later)
of spatial.tools (check http://r-forge.r-project.org/R/?group_id=1492
for the status of the build).  It should, in theory, be a direct
swap-in for the standard generic predict() function.  Currently, it
will work with any predict.* method that has the following features:
1) The data is passed to the predict method as a data frame via a
newdata parameter, and
2) The data is returned from the predict method as a vector/matrix.
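
A quick way to see whether a given predict.* method meets these two
requirements is to call it directly on a small data frame and look at
what comes back.  The sketch below is only an illustration: it uses
base R's lm() and the built-in mtcars data as stand-ins, and
check_predict_contract is a made-up helper name, not something in
spatial.tools.

```r
# Hypothetical helper (NOT part of spatial.tools): checks that a model's
# predict method accepts newdata= as a data frame and returns a
# vector/matrix with one element/row per input row.
check_predict_contract <- function(model, newdata, ...) {
  stopifnot(is.data.frame(newdata))
  out <- predict(model, newdata = newdata, ...)
  # Matrices are counted by rows; vectors and factors by length
  n_out <- if (is.matrix(out)) nrow(out) else length(out)
  ok_type <- is.matrix(out) || is.atomic(out) || is.factor(out)
  ok_type && n_out == nrow(newdata)
}

# Stand-in model and data (assumptions for illustration only)
fit <- lm(mpg ~ wt + hp, data = mtcars)
small <- mtcars[1:5, c("wt", "hp")]

check_predict_contract(fit, small)                           # vector output
check_predict_contract(fit, small, interval = "confidence")  # matrix output
```

A predict method that returns a list or a data frame instead would
fail this check and would likely need a wrapper before it could be
used.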

When using predict_rasterEngine, the object= parameter is your model,
and the newdata= parameter is the raster/brick/stack to apply the
model to on a pixel-by-pixel basis (note that the names of the layers
must match the names of the predictor variables, in most cases).
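
The name matching can be checked up front.  A minimal sketch, assuming
the model was fit through a formula interface and keeps its terms;
missing_predictors is a hypothetical helper, not a spatial.tools
function.  A plain character vector stands in for the layer names
here; with a real raster/brick/stack you would pass names(newdata).

```r
# Hypothetical helper (NOT part of spatial.tools): returns the predictor
# variables in the model formula that have no matching layer name.
missing_predictors <- function(model, layer_names) {
  # Variables on the right-hand side of the formula (response dropped)
  predictors <- all.vars(delete.response(terms(model)))
  setdiff(predictors, layer_names)
}

# Stand-in model (assumption for illustration only)
fit <- lm(mpg ~ wt + hp, data = mtcars)

missing_predictors(fit, c("wt", "hp"))     # character(0): all names match
missing_predictors(fit, c("wt", "band2"))  # "hp": no layer named hp
```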

I was hoping to get some stress-testing on this, since it is a fairly
oft-requested function.  If a predict.* function you'd like to use
doesn't work, let me know which function it is (with some test data)
and I'll see if I can tweak it to work.

Right now, I have confirmed this works with randomForest.  Here's an example:

######################

packages_required <- c("spatial.tools","doParallel","randomForest")
lapply(packages_required, require, character.only=TRUE)

# Load up a 3-band image:
tahoe_highrez <- setMinMax(
    brick(system.file("external/tahoe_highrez.tif", package="spatial.tools")))
tahoe_highrez
plotRGB(tahoe_highrez)

# Load up some training points:
tahoe_highrez_training_points <- readOGR(
    dsn=system.file("external", package="spatial.tools"),
    layer="tahoe_highrez_training_points")

# Extract data to train the randomForest model:
tahoe_highrez_training_extract <- extract(
    tahoe_highrez,
    tahoe_highrez_training_points,
    df=TRUE)

# Fuse it back with the SPECIES info:
tahoe_highrez_training_extract$SPECIES <- tahoe_highrez_training_points$SPECIES

# Note the names of the bands:
names(tahoe_highrez_training_extract) # the extracted data
names(tahoe_highrez) # the brick

# Generate a randomForest model:
tahoe_rf <- randomForest(
    SPECIES ~ tahoe_highrez.1 + tahoe_highrez.2 + tahoe_highrez.3,
    data=tahoe_highrez_training_extract)

tahoe_rf

# This will run the predict in parallel:
sfQuickInit()
prediction_rf_class <- predict_rasterEngine(
    object=tahoe_rf, newdata=tahoe_highrez, type="response")
prediction_rf_prob <- predict_rasterEngine(
    object=tahoe_rf, newdata=tahoe_highrez, type="prob")
sfQuickStop()

###############

--j




-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
259 Computing Applications Building, MC-150
605 East Springfield Avenue
Champaign, IL  61820-6371
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007


