[R-sig-Geo] Extract vs Zonal: Efficient method to extract and average values from large raster datasets using polygons

Francisco Zambrano frzambra at gmail.com
Mon Oct 31 14:21:30 CET 2016


Hi all,

I need to know the quickest way to extract values from a large raster
dataset (> 300 layers) using polygons (a SpatialPolygonsDataFrame).

I also need to know whether it is worth adding some explicit
parallelization, or whether it is enough to start a cluster with
beginCluster(n) at the beginning, on the assumption that 'extract' makes
use of that cluster.

Using "extract" function will be something like this:

> library(raster)
> library(maptools)
> rasters <- stack(files_rasters)
> polygons <- readShapePoly(file_shape)

> beginCluster(8)
> dataOut <- extract(rasters, polygons, fun = mean)
> endCluster()
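
In case 'extract' does not actually make use of the cluster started with
beginCluster() (I am not sure it does for polygons), a rough sketch of
parallelizing explicitly over chunks of polygons could be something like the
following (the number of cores and the chunking are just an example, and
mclapply forks, so this is for Linux/macOS only):

library(parallel)

n_cores <- 8
n_polys <- length(polygons)

## split the polygon indices into one chunk per core
chunks <- split(seq_len(n_polys), cut(seq_len(n_polys), n_cores, labels = FALSE))

## run extract() on each chunk in a forked worker, then bind the pieces back
dataOut_list <- mclapply(chunks, function(i) {
  extract(rasters, polygons[i, ], fun = mean, na.rm = TRUE)
}, mc.cores = n_cores)

dataOut <- do.call(rbind, dataOut_list)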

If the "zonal" function is used instead, the code would be:

> library(raster)
> library(maptools)
> rasters <- stack(files_rasters)
> polygons <- readShapePoly(file_shape)
> polygonsRaster <- rasterize(polygons, subset(rasters,1))

> beginCluster(8)
> dataOut <- zonal(rasters, polygonsRaster, 'mean')
> endCluster()
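
One difference to keep in mind when comparing the two outputs: 'extract'
returns one row per polygon in the order of 'polygons', while 'zonal'
returns one row per zone ID. Assuming rasterize() was called without a
'field' argument (so the zone values are the feature indices 1..n), matching
the zonal result back to the polygon attribute table could be sketched as:

zstats <- as.data.frame(dataOut)
## polygons too small to cover any cell centre end up as NA rows here
idx <- match(seq_len(length(polygons)), zstats$zone)
polygons@data <- cbind(polygons@data, zstats[idx, -1, drop = FALSE])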

Which of these would give the quickest result?
Is there another, quicker way to do it?
If not, is it worth trying improvements such as parallelization for those
methods?

I've reviewed some discussions, but I don't think there is a conclusive
answer yet. Right now I'm testing different approaches using a smaller
subset of the data, but I still don't have a conclusion.
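
The kind of timing test I have in mind, on just a few layers, is roughly
the following (the number of layers is arbitrary, and the rasterize() step
is included in the zonal timing since it is part of that workflow):

n_test <- 10
r_sub  <- subset(rasters, 1:n_test)

## time the polygon-based extraction
t_extract <- system.time(
  out_extract <- extract(r_sub, polygons, fun = mean, na.rm = TRUE)
)

## time rasterizing the polygons plus the zonal statistics
t_zonal <- system.time({
  zones     <- rasterize(polygons, subset(r_sub, 1))
  out_zonal <- zonal(r_sub, zones, 'mean')
})

rbind(extract = t_extract, zonal = t_zonal)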

Best to all,

Francisco Zambrano
Ph.D. Candidate from University of Concepcion, Chile.
Visiting researcher at ITC, University of Twente, Netherlands.


frzambra.github.io
Agricultural Drought Webmapping <https://frzambra.shinyapps.io/shinyapp/>
