[R-sig-Geo] Extract polygon values from large raster dataset

Matthew Landis landis at isciences.com
Tue Jun 19 03:54:16 CEST 2012


Spencer,  it's been awhile since I tried to use rasterize(), but when I 
did, I found it very slow compared to alternative approaches, e.g. using 
ArcGIS.  In the end, I discovered gdal_rasterize ( 
http://www.gdal.org/gdal_rasterize.html), which seems to be faster, 
although I've never compared them head to head.  That's what I now use 
on a regular basis.  Maybe it will work for you?

M

On 06/18/2012 07:07 PM, Spencer Scheidt wrote:
> Hello,
>
> I am attempting to extract land cover values from the NLCD 2006 dataset
> (http://www.mrlc.gov/nlcd2006.php) using a grid that I have converted into a
> SpatialPolygons* object.  However, the NLCD raster is approximately 16 GB in
> size (30 m resolution), and the grid that I would like to work with contains
> 5520 polygons (0.5 degree resolution).
>
> I have written a script based on previous postings that have asked a similar
> question and Applied Spatial Data Analysis in R by Bivand et. al, and used
> this approach successfully for smaller SpatialPolygons* objects with the
> same NLCD dataset.
>
> What I am wondering is if there is a quicker way to extract my polygon
> values, either by breaking the raster and polygons into smaller chunks, or
> using an entirely different approach.  I realize that the extreme sizes of
> my 2 objects may be the ultimate constraint, and if necessary I can reduce
> the resolution of my grid.
>
> Below is my working code, which uses 2 different approaches, via rasterize
> (estimated to take 10+ days to run and cross tabulate) and extract
> (estimated to take over 30 days to run):
>
> #load appropriate libraries
> library(maps)
> library(rgdal)
> library(raster)
> library(sp)
> library(maptools)
>
> #create bounding box
> us.box<-matrix(c(-124.5,-67,25,49), nrow = 2, ncol = 2, byrow = TRUE)
>
> #create grid
> getClass("GridTopology")
>
> cs<-c(0.5, 0.5)
> cc<-us.box[,1] + (cs/2)
> cd<-ceiling(diff(t(us.box))/cs)
> us.grid<-GridTopology(cellcentre.offset = cc, cellsize = cs, cells.dim = cd)
> us.grid
>
> #create spatial polygons
> #projection is albers equal area conic
> prj.string<- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96
> +units=m"
> us.sp<- as.SpatialPolygons.GridTopology(us.grid, proj4string =
> CRS("+proj=longlat +ellps=WGS84"))
> us.SP<-spTransform(us.sp, CRS(prj.string))
>
> summary(us.SP)
>
> #check to make sure ID values are appropriate - should be listed by row from
> top left
> IDvaluesGridTopology(us.grid)
> slot(us.SP, "polygons")
>
> #load land cover data raster (16 GB)
> env.data = raster('nlcd2006_landcover_4-20-11_se5.img')
> projection(env.data) = CRS("+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23
> +lon_0=-96")
>
> #create raster of polygons
> poly.raster<- rasterize(us.SP, env.data, progress = 'window')
> df<- crosstab(poly.raster, env.data, progress = 'window')
>
> #maybe another way?
> ex.test<- extract(env.data, us.SP, df=TRUE, progress = 'window')
>
> Any help or advice on working with large datasets is much appreciated.
> Thank you for your time.
>
> Spencer
>
>
>
> --
> View this message in context: http://r-sig-geo.2731867.n2.nabble.com/Extract-polygon-values-from-large-raster-dataset-tp7580250.html
> Sent from the R-sig-geo mailing list archive at Nabble.com.
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo


-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~
Matthew Landis, Ph.D.
Research Scientist
ISciences, LLC
61 Main St. Suite 200
Burlington VT 05401
802.864.2999
www.isciences.com
~~~~~~~~~~~~~~~~~~~~~~~~~~



More information about the R-sig-Geo mailing list