[R-sig-Geo] More efficient raster extraction

Denys Dukhovnov denn2duk at yahoo.com
Mon Sep 14 19:39:36 CEST 2015

I am trying to extract values from a 300-meter raster over Census block polygons for each US state. Because the 300 m cells are much larger than typical urban blocks, I had to set the "small" option (as shown below) so that no polygon is left blank, without any corresponding raster value. The blocks shapefile for the District of Columbia, the smallest case with about 6,500 blocks, takes 15 minutes to complete, but New York, with almost 350,000 block polygons, did not finish even after 4 days. On a separate occasion I also tried parallel processing (foreach), but this appears to slow the extract function down rather than speed it up. Clipping the main raster to the extent of each state does not help much either. The vector data are huge in this case, so I would expect any gain in efficiency to come from reducing their burden on the process.
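To put the two timings above on the same scale, here is a rough back-of-the-envelope calculation in R. It assumes that run time grows linearly with the number of polygons, which is only an assumption. The variable names are made up for illustration, and the block counts are the approximate figures quoted above.

```r
# Approximate figures quoted in the message above
dc_blocks  <- 6500      # Census blocks in the District of Columbia
dc_minutes <- 15        # observed run time for DC, in minutes
ny_blocks  <- 350000    # Census blocks in New York

# ASSUMPTION: run time scales linearly with the number of polygons
minutes_per_block <- dc_minutes / dc_blocks
ny_hours_linear   <- ny_blocks * minutes_per_block / 60

# Observed lower bound: the New York run had not finished after 4 days
ny_hours_observed_min <- 4 * 24

cat(sprintf("Linear estimate for New York: %.1f hours\n", ny_hours_linear))
cat(sprintf("Observed: more than %d hours\n", ny_hours_observed_min))
```

Under the linear assumption New York would need roughly 13 to 14 hours, so a run that is still going after 96 hours suggests the cost per polygon is not constant from state to state. The cause of that difference is not established here.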

Please share any advice on how I can make the extract function faster and more efficient. Thank you very much!

Faster method, without parallel processing:

BlockIndex <- extract(Raster_300m, Blocks, df=TRUE, small=TRUE)

Indefinitely slow, with parallel processing (8 registered cores):

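# One task per block; assuming Blocks is a SpatialPolygonsDataFrame, Blocks[i, ] selects row i (Blocks[i] would select a column)
BlockIndex <- foreach(i = seq_along(Blocks$GEOID10), .combine = rbind, .packages = "raster") %dopar% {
        extract(Raster_300m, Blocks[i, ], df = TRUE, small = TRUE)  # one data frame per block, stacked by rbind
}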
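A variant of the loop above that might reduce per-task overhead is to hand each worker a block of polygons instead of a single one. The following is only a sketch under assumptions: `make_chunks` and `extract_chunked` are hypothetical helper names, a parallel backend is assumed to be registered already, and the polygons are assumed to be a SpatialPolygonsDataFrame. Whether this is actually faster than the non-parallel call has not been measured here.

```r
# Hypothetical helper: split indices 1..n into at most n_chunks contiguous groups
make_chunks <- function(n, n_chunks) {
  groups <- sort(rep_len(seq_len(n_chunks), n))
  split(seq_len(n), groups)
}

# Hypothetical wrapper: one extract() call per chunk rather than per polygon.
# ASSUMPTIONS: a foreach backend (e.g. doParallel) is registered, and
# polys is a SpatialPolygonsDataFrame.
extract_chunked <- function(r, polys, n_chunks = 8) {
  `%dopar%` <- foreach::`%dopar%`
  chunks <- make_chunks(length(polys), n_chunks)
  foreach::foreach(idx = chunks, .combine = rbind,
                   .packages = "raster") %dopar% {
    out <- raster::extract(r, polys[idx, ], df = TRUE, small = TRUE)
    # extract() numbers polygons 1..length(idx) within the chunk;
    # map those IDs back to row numbers in the full polygon layer
    out$ID <- idx[out$ID]
    out
  }
}
```

With the objects from this message, the call would be `BlockIndex <- extract_chunked(Raster_300m, Blocks, n_chunks = 8)`. The number of chunks is a tuning choice: fewer, larger chunks mean fewer tasks to dispatch, while more chunks balance the load better when polygons differ in size.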


Denys Dukhovnov

