[R-sig-Geo] randomForests for mapping vegetation

Wed Jan 11 11:28:19 CET 2006

On Tue, 10 Jan 2006, Miewald, Tom wrote:

> 
> Hello all,
> 
> I am new to this list and wondering whether anyone has any experience
> (or ideas) for how to implement vegetation mapping using the
> randomForests package from R.  The model produced from randomForests
> would be used to map vegetation from Landsat (30 x 30 meter pixels) for
> relatively large areas (> 10 million hectares, so a lot of pixels).  
> There are ~15 explanatory data sets (imagery, DEMs, precip, etc.).  My
> main question concerns how to use the output from randomForests to
> predict vegetation over such an area.  I have seen some literature out
> there using GRASS.  I would rather not go down that road because I
> already have enough software packages.  Are there any possibilities for
> using ArcGIS connectivity to enable the prediction of vegetation?  Any
> input would be appreciated.  Thanks!

Have a look at:

http://www.ci.tuwien.ac.at/Conferences/DSC-2003/Drafts/FurlanelloEtAl.pdf

which is fairly close to your description, although not such a large 
number of pixels, and does use R/GRASS integration.

My guess would be that you should tile the region for prediction into 
subregions, and patch them back together in ArcGIS. You could do it by 
writing out Arc ASCII grids using the write.asciigrid() function in 
the maptools package. If this is going to be heavier-weight production, 
with the whole process repeated many times, then using the Rcom interface 
from VBA in ArcGIS might also be possible. We are also looking at writing 
geotiffs from rgdal, so you should be able to find a suitable route from 
the subregional predictions within R back to rasters in ArcGIS. 
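
To make the tiling idea concrete, here is a minimal sketch in base R. It does 
not use the maptools function mentioned above; instead it writes the Arc ASCII 
grid header and cell values by hand, so that it runs without extra packages. 
The helpers make_tiles() and write_ascii_tile(), the toy prediction matrix and 
the georeferencing values are all made up for illustration.

```r
# Hypothetical helper: split an nrow x ncol grid into rectangular tiles,
# returning one row per tile with its first/last row and column.
make_tiles <- function(nrow, ncol, tile_rows, tile_cols) {
  tiles <- expand.grid(row_start = seq(1, nrow, by = tile_rows),
                       col_start = seq(1, ncol, by = tile_cols))
  tiles$row_end <- pmin(tiles$row_start + tile_rows - 1, nrow)
  tiles$col_end <- pmin(tiles$col_start + tile_cols - 1, ncol)
  tiles
}

# Hypothetical helper: write one tile as an Arc ASCII grid by hand.
# m has its first row at the northern edge; xll/yll are the tile's
# lower-left corner coordinates.
write_ascii_tile <- function(m, file, xll, yll, cellsize, nodata = -9999) {
  header <- c(
    paste("ncols", ncol(m)),
    paste("nrows", nrow(m)),
    paste("xllcorner", format(xll, scientific = FALSE)),
    paste("yllcorner", format(yll, scientific = FALSE)),
    paste("cellsize", format(cellsize, scientific = FALSE)),
    paste("NODATA_value", nodata)
  )
  m[is.na(m)] <- nodata
  writeLines(header, file)
  write.table(m, file, append = TRUE, row.names = FALSE, col.names = FALSE)
}

# Toy stand-in for predicted vegetation classes: a 6 x 8 grid of 30 m cells.
set.seed(1)
pred <- matrix(sample(1:5, 6 * 8, replace = TRUE), nrow = 6, ncol = 8)
xll <- 500000; yll <- 4000000; cellsize <- 30  # made-up georeferencing

tiles <- make_tiles(nrow(pred), ncol(pred), tile_rows = 4, tile_cols = 4)
files <- character(nrow(tiles))
for (i in seq_len(nrow(tiles))) {
  tl <- tiles[i, ]
  block <- pred[tl$row_start:tl$row_end, tl$col_start:tl$col_end, drop = FALSE]
  files[i] <- file.path(tempdir(), sprintf("tile_%02d.asc", i))
  write_ascii_tile(
    block, files[i],
    xll = xll + (tl$col_start - 1) * cellsize,    # shift east by skipped columns
    yll = yll + (nrow(pred) - tl$row_end) * cellsize,  # rows below the tile
    cellsize = cellsize
  )
}
```

Each tile carries its own lower-left corner, so the files should line up when 
they are imported and mosaicked on the ArcGIS side. In real use the blocks 
would hold the per-subregion predictions rather than a toy matrix.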

Examples using VBA are shown here:

http://perso.univ-lr.fr/csaintje/Recherche/RArcgis/index.html

and a nice interface using Python from ArcGIS then Rcom:

http://www.nicholas.duke.edu/geospatial/software/

Most of the hard work will be in getting things to work once; from there 
it'll get easier. Consider save()ing the fitted RF model, so that to make 
predictions you only need to load() it and then predict() on newdata (grids 
of RHS variables) for the current subregion. This also parallelises nicely, 
so you could pass subregions and the fitted model off to slaves to do 
the predictions, but getting that working will take substantial time. By the 
way, OSX and Linux memory management will be better than Windows, so on 
Windows, go for smaller subregions.
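
As a rough sketch of the save()/load()/predict() workflow, assuming the 
randomForest package is installed: the iris data stand in for the real 
training plots and explanatory layers, and the file name and the three 
"subregions" are invented for illustration.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(1)
# Toy stand-in for the training data: iris replaces the vegetation plots,
# and its four measurements replace the ~15 explanatory layers.
rf <- randomForest(Species ~ ., data = iris, ntree = 100)

# Fit once, then store the fitted model on disk.
model_file <- file.path(tempdir(), "rf_model.RData")  # hypothetical file name
save(rf, file = model_file)
rm(rf)

# Later, possibly in another R session: restore the model.
load(model_file)  # recreates the object `rf`

# Hypothetical subregions: each is a data frame of predictor values,
# one row per pixel, with the same column names as the training data.
pixels <- iris[, 1:4]
subregions <- split(pixels, rep(1:3, each = 50))

# Predict one subregion at a time; each call is independent of the others.
predictions <- lapply(subregions, function(newdata) {
  predict(rf, newdata = newdata)
})
```

Because each subregion is predicted independently, the lapply() step is the 
part that could be handed to separate R processes, each of which load()s the 
same model file. Keeping subregions small keeps the newdata for any one call 
within memory.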

Roger

> Tom
> 
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no