[R-sig-Geo] Extract polygon values from large raster dataset

Spencer Scheidt scheidt14 at gmail.com
Tue Jun 19 01:07:30 CEST 2012


Hello,

I am attempting to extract land cover values from the NLCD 2006 dataset
(http://www.mrlc.gov/nlcd2006.php) using a grid that I have converted into a
SpatialPolygons* object.  However, the NLCD raster is approximately 16 GB in
size (30 m resolution), and the grid that I would like to work with contains
5520 polygons (0.5 degree resolution).

I have written a script based on previous postings that have asked a similar
question and Applied Spatial Data Analysis in R by Bivand et. al, and used
this approach successfully for smaller SpatialPolygons* objects with the
same NLCD dataset.  

What I am wondering is if there is a quicker way to extract my polygon
values, either by breaking the raster and polygons into smaller chunks, or
using an entirely different approach.  I realize that the extreme sizes of
my 2 objects may be the ultimate constraint, and if necessary I can reduce
the resolution of my grid.

Below is my working code, which uses 2 different approaches, via rasterize
(estimated to take 10+ days to run and cross tabulate) and extract
(estimated to take over 30 days to run):

#load appropriate libraries
library(maps)
library(rgdal)
library(raster)
library(sp)
library(maptools)

#create bounding box
us.box<-matrix(c(-124.5,-67,25,49), nrow = 2, ncol = 2, byrow = TRUE)

#create grid
getClass("GridTopology")

cs<-c(0.5, 0.5)
cc<-us.box[,1] + (cs/2)
cd<-ceiling(diff(t(us.box))/cs)
us.grid<-GridTopology(cellcentre.offset = cc, cellsize = cs, cells.dim = cd)
us.grid

#create spatial polygons
#projection is albers equal area conic
prj.string <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96
+units=m"
us.sp <- as.SpatialPolygons.GridTopology(us.grid, proj4string =
CRS("+proj=longlat +ellps=WGS84"))
us.SP<-spTransform(us.sp, CRS(prj.string))

summary(us.SP)

#check to make sure ID values are appropriate - should be listed by row from
top left
IDvaluesGridTopology(us.grid)
slot(us.SP, "polygons")

#load land cover data raster (16 GB)
env.data = raster('nlcd2006_landcover_4-20-11_se5.img')
projection(env.data) = CRS("+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23
+lon_0=-96") 

#create raster of polygons
poly.raster <- rasterize(us.SP, env.data, progress = 'window')
df <- crosstab(poly.raster, env.data, progress = 'window')

#maybe another way?
ex.test <- extract(env.data, us.SP, df=TRUE, progress = 'window')

Any help or advice on working with large datasets is much appreciated. 
Thank you for your time.

Spencer



--
View this message in context: http://r-sig-geo.2731867.n2.nabble.com/Extract-polygon-values-from-large-raster-dataset-tp7580250.html
Sent from the R-sig-geo mailing list archive at Nabble.com.



More information about the R-sig-Geo mailing list