[R-sig-Geo] randomForests for mapping vegetation
Roger Bivand
Roger.Bivand at nhh.no
Wed Jan 11 11:28:19 CET 2006
On Tue, 10 Jan 2006, Miewald, Tom wrote:
>
> Hello all,
>
> I am new to this list and wondering whether anyone has any experience
> (or ideas) for how to implement vegetation mapping using the
> randomForests package from R. The model produced from randomForests
> would be used to map vegetation from Landsat (30 x 30 meter pixels) for
> relatively large areas (> 10 million hectares, so a lot of pixels).
> There are ~15 explanatory data sets (imagery, dems,precip, etc). My
> main question concerns how to use the output from randomForests to
> predict vegetation over such an area. I have seen some literature out
> there using GRASS. I would rather not go down that road because I
> already have enough software packages. Is there any possibilities for
> using ArcGIS connectivity to enable the prediction of vegetation? Any
> input would be appreciated. Thanks!
Have a look at:
http://www.ci.tuwien.ac.at/Conferences/DSC-2003/Drafts/FurlanelloEtAl.pdf
which is fairly close to your description, although not such a large
number of pixels, and does use R/GRASS integration.
My guess would be that you should tile the region for prediction into
subregions, and patch them back together in ArcGIS. You could do it by
writing out Arc ASCII grids using the write.asciigrid() function in
the maptools package. If the whole process is going to have to be repeated
many times, as in heavier production use, then using the Rcom interface
from VBA in ArcGIS might also be possible. We are also looking at writing
GeoTIFFs from rgdal, so you should be able to find a suitable route from
the subregional predictions within R back to rasters in ArcGIS.
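
As a rough illustration of the tiling idea, here is a minimal base-R sketch
that splits a grid into bands of rows and writes each band as an Arc ASCII
grid with the correct lower-left corner. It deliberately does not call
write.asciigrid(); the helpers row_bands() and write_ascii_tile(), the grid
dimensions, and the coordinates are all made up for the example, and the
matrix pred stands in for real predictions.

```r
# Hypothetical helper: split nrows raster rows into bands of at most band_rows.
row_bands <- function(nrows, band_rows) {
  starts <- seq(1, nrows, by = band_rows)
  data.frame(first = starts, last = pmin(starts + band_rows - 1, nrows))
}

# Hypothetical helper: write a numeric matrix (row 1 = northernmost row)
# as an Arc ASCII grid: six header lines, then the cell values row by row.
write_ascii_tile <- function(m, file, xll, yll, cellsize, nodata = -9999) {
  m[is.na(m)] <- nodata
  header <- c(
    paste("ncols", ncol(m)),
    paste("nrows", nrow(m)),
    paste("xllcorner", format(xll, scientific = FALSE)),
    paste("yllcorner", format(yll, scientific = FALSE)),
    paste("cellsize", format(cellsize, scientific = FALSE)),
    paste("NODATA_value", format(nodata, scientific = FALSE))
  )
  writeLines(header, file)
  write.table(m, file, append = TRUE, row.names = FALSE, col.names = FALSE)
}

# Made-up grid: 10 rows x 4 columns of 30 m cells (assumed values).
full_nrows <- 10
ncols <- 4
cellsize <- 30
xll <- 500000
yll <- 4000000
pred <- matrix(seq_len(full_nrows * ncols), nrow = full_nrows, byrow = TRUE)

tiles <- row_bands(full_nrows, 4)
out_files <- character(nrow(tiles))
for (i in seq_len(nrow(tiles))) {
  rows <- tiles$first[i]:tiles$last[i]
  # The tile's lower-left y is the full grid's yll plus the height of
  # all rows that lie below (south of) this tile.
  tile_yll <- yll + (full_nrows - tiles$last[i]) * cellsize
  out_files[i] <- file.path(tempdir(), sprintf("tile_%02d.asc", i))
  write_ascii_tile(pred[rows, , drop = FALSE], out_files[i],
                   xll, tile_yll, cellsize)
}
```

Each tile carries its own georeferencing in the header, so the files can be
imported separately and mosaicked afterwards.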
Examples using VBA are shown here:
http://perso.univ-lr.fr/csaintje/Recherche/RArcgis/index.html
and a nice interface that calls Python from ArcGIS and then Rcom is here:
http://www.nicholas.duke.edu/geospatial/software/
Most of the hard work will be in getting things to work once; from there
it'll get easier. Consider save()ing the fitted RF model, so that to make
predictions you only need to load() it and then predict() on newdata (grids
of RHS variables) for the current subregion. This also parallelises nicely,
so you could pass the subregions and the fitted model to worker processes
to do the predictions, but getting that working will take substantial time.
By the way, OSX and Linux memory management will be better than Windows, so
on Windows, go for smaller subregions.
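
A minimal sketch of the save()/load()/predict() pattern, assuming the
randomForest package is installed. The training data, the variable names
(band1, band2, elev, veg), the file name, and the helper predict_tile() are
all invented for the example; in practice newdata would be the grid of
explanatory variables for one subregion.

```r
library(randomForest)

set.seed(1)
# Stand-in training data: three predictors and a vegetation class.
train <- data.frame(band1 = runif(200),
                    band2 = runif(200),
                    elev  = runif(200, 0, 2000))
train$veg <- factor(ifelse(train$elev > 1000, "conifer", "shrub"))

# Fit once, then save the fitted model to disk.
rf <- randomForest(veg ~ band1 + band2 + elev, data = train, ntree = 100)
model_file <- file.path(tempdir(), "rf_veg.RData")
save(rf, file = model_file)

# Hypothetical helper: load the saved model and predict for one subregion.
# It needs only the file path and the data, so it can run in a fresh
# session (with randomForest loaded) for each tile.
predict_tile <- function(model_file, newdata) {
  env <- new.env()
  load(model_file, envir = env)
  predict(env$rf, newdata = newdata)
}

# Stand-in for the explanatory variables of one subregion, one row per pixel.
tile <- data.frame(band1 = runif(50),
                   band2 = runif(50),
                   elev  = runif(50, 0, 2000))
tile_pred <- predict_tile(model_file, tile)
```

The column names in newdata have to match those used when fitting; the
predicted classes come back as a factor with one element per pixel.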
Roger
> Tom
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no