<HTML><HEAD>
<META content="text/html; charset=utf-8" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 8.00.6001.23562"></HEAD>
<BODY style="MARGIN: 4px 4px 1px; FONT: 10pt Tahoma">
<DIV>Jonathan,</DIV>
<DIV>Thank you for putting this together and for the example. I'm doing two things differently with randomForest, and I suspect the function isn't handling one of them. </DIV>
<DIV> </DIV>
<DIV>First, based on recommendations from Andy Liaw (and ?randomForest), I don't use the formula interface but use x=&lt;many columns&gt;, y=&lt;a column&gt; in the call. Does predict_rasterEngine handle the absence of a formula in the object?</DIV>
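<DIV>To make the first point concrete, here is a minimal sketch of the two call styles. The iris data frame is only a stand-in for my training table (an assumption for illustration, not my real data), and it uses plain predict() rather than predict_rasterEngine. As far as I can tell, the x/y fit is a plain "randomForest" object with no stored formula terms, which may be what the function is not handling:</DIV>
<PRE>
```r
library(randomForest)

set.seed(42)  # reproducible forests

# Stand-in training data: four predictor columns and one factor response
predictors <- iris[, 1:4]
response <- iris$Species

# Formula interface, as in the example posted to the list
rf_formula <- randomForest(Species ~ ., data = iris, ntree = 50)

# x/y interface, as I call it: no formula is involved
rf_xy <- randomForest(x = predictors, y = response, ntree = 50)

# The two objects differ in class and in whether formula terms are stored
class(rf_formula)
class(rf_xy)
is.null(rf_xy$terms)

# Both predict from a data frame whose column names match the predictors
pred_xy <- predict(rf_xy, newdata = predictors, type = "response")
prob_xy <- predict(rf_xy, newdata = predictors, type = "prob")
```
</PRE>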
<DIV> </DIV>
<DIV>Second, I have many large rasters I want to run the predict on, so making a brick would be difficult. I use a rasterStack instead. Does your example work with a rasterStack? </DIV>
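<DIV>For the second point, this is roughly how I build the stack, sketched with the raster package. The two tiny layers, the temporary file names, and the layer names are made up for illustration. The stack is built from file names, so each layer stays in its own file on disk:</DIV>
<PRE>
```r
library(raster)

# Two small made-up layers written to temporary files (native raster format)
layer_files <- file.path(tempdir(), c("band_a.grd", "band_b.grd"))
for (f in layer_files) {
  r <- raster(nrows = 10, ncols = 10)
  r <- setValues(r, runif(ncell(r)))
  writeRaster(r, filename = f, overwrite = TRUE)
}

# Build the stack from file names; each layer stays in its own file
predictor_stack <- stack(layer_files)

# Layer names must match the predictor names used to fit the model
names(predictor_stack) <- c("band_a", "band_b")
predictor_stack
```
</PRE>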
<DIV> </DIV>
<DIV>I can dive deeper if any of this isn't clear or if these two tweaks work just fine for you. I was trying to swap predict_rasterEngine in for another parallel version of predict to compare speed. The other version works fine, but predict_rasterEngine failed. </DIV>
<DIV> </DIV>
<DIV>Thanks in advance. </DIV>
<DIV>Tim Howard</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV><BR>&gt;&gt;&gt;&gt;&gt;&gt;<BR>Date: Tue, 18 Mar 2014 22:14:23 -0500<BR>From: Jonathan Greenberg &lt;jgrn@illinois.edu&gt;<BR>To: "r-sig-geo@r-project.org" &lt;R-sig-Geo@r-project.org&gt;<BR>Subject: [R-sig-Geo] Parallel predict now in spatial.tools<BR>Message-ID:<BR>&lt;CABG0rfseg+p0h4HdYOK+_Za=OLMeTKHAT+TQn7g_FkEdYiunFQ@mail.gmail.com&gt;<BR>Content-Type: text/plain; charset=ISO-8859-1<BR><BR>R-sig-geo'ers:<BR><BR>I finally got around to building a parallel predict statement that<BR>I've included in version 1.3.7 (or later) of spatial.tools (check<BR><A href="http://r-forge.r-project.org/R/?group_id=1492">http://r-forge.r-project.org/R/?group_id=1492</A> for the status of the<BR>build), "predict_rasterEngine". It should, in theory, be a direct<BR>swap-in for the standard generic predict() statement. Currently, it<BR>will work on any predict.* statement that has the following features:<BR>1) The data is passed to the predict as a data frame via a newdata<BR>parameter, and<BR>2) The data is returned from the predict statement as a vector/matrix.<BR><BR>When using predict_rasterEngine, the object= parameter is your model,<BR>and the newdata= parameter is the raster/brick/stack to apply the<BR>model to on a pixel-by-pixel basis (note that the names of the layers<BR>must match the names of the predictor variables, in most cases).<BR><BR>I was hoping to get some stress-testing on this, since it is a fairly<BR>oft-requested function. If a predict.* function you'd like to use<BR>doesn't work, let me know which function it is (with some test data)<BR>and I'll see if I can tweak it to work.<BR><BR>Right now, I have confirmed this works with randomForest. 
Here's an example:<BR><BR>######################<BR><BR>packages_required <- c("spatial.tools","doParallel","randomForest")<BR>lapply(packages_required, require, character.only=TRUE)<BR><BR># Load up a 3-band image:<BR>tahoe_highrez <- setMinMax(<BR>brick(system.file("external/tahoe_highrez.tif", package="spatial.tools")))<BR>tahoe_highrez<BR>plotRGB(tahoe_highrez)<BR><BR># Load up some training points:<BR>tahoe_highrez_training_points <- readOGR(<BR>dsn=system.file("external", package="spatial.tools"),<BR>layer="tahoe_highrez_training_points")<BR><BR># Extract data to train the randomForest model:<BR>tahoe_highrez_training_extract <- extract(<BR>tahoe_highrez,<BR>tahoe_highrez_training_points,<BR>df=TRUE)<BR><BR># Fuse it back with the SPECIES info:<BR>tahoe_highrez_training_extract$SPECIES <- tahoe_highrez_training_points$SPECIES<BR><BR># Note the names of the bands:<BR>names(tahoe_highrez_training_extract) # the extracted data<BR>names(tahoe_highrez) # the brick<BR><BR># Generate a randomForest model:<BR>tahoe_rf <- randomForest(SPECIES~tahoe_highrez.1+tahoe_highrez.2+tahoe_highrez.3,<BR>data=tahoe_highrez_training_extract)<BR><BR>tahoe_rf<BR><BR># This will run the predict in parallel:<BR>sfQuickInit()<BR>prediction_rf_class <-<BR>predict_rasterEngine(object=tahoe_rf,newdata=tahoe_highrez,type="response")<BR>prediction_rf_prob <-<BR>predict_rasterEngine(object=tahoe_rf,newdata=tahoe_highrez,type="prob")<BR>sfQuickStop()<BR><BR>###############<BR><BR>--j</DIV></BODY></HTML>