[R-sig-Geo] How to speed up "extract" function in raster package?

Michael Sumner mdsumner at gmail.com
Mon Jan 18 21:59:05 CET 2016


On Mon, 18 Jan 2016 at 22:03 Kamil Konowalik <konno_kazuma at mailplus.pl>
wrote:

> Dear list members,
> I'm trying to extract values for ca. 1000 points from 19 raster files. It
> is a very simple task - as an output I need a table where each point has
> additional 19 columns with values derived from those raster files
> (specifically I'm creating a SWD input file for Maxent). I used extract
> function but the whole task is running very slow - so far the whole process
> took 10 days and it is not finished yet. My computer is rather slow
> (Windows 7 32 bit, intel core duo 2.2 GHz, 3 GB RAM) but currently I need
> to use this machine. I was wondering whether there is a way to speed up the
> whole process by using a different command or maybe there is any trick that
> can speed it up?
>
> Here's the code I'm using:
>
> files <-
> list.files("C:/GIS/worldclim/biolcimatic_variables_ASCII",pattern='asc',full.names=TRUE)
> Grids <- raster::stack(files)
> background <- read.csv("C:/GIS/species_background/bg.csv",header=TRUE)
> LonLatData2 <- background[,c(2,3)]
> var_at_background <- raster::extract(Grids,LonLatData2) #I'm here since 10
> days
> outfile2 <- as.data.frame(cbind("species",LonLatData2,var_at_background))
> colnames(outfile2) <-
> c("species","longitude","latitude",colnames(var_at_background))
> write.csv(outfile2, file="variables_background.csv", append = FALSE, sep =
> ",", eol = "\n", na = "NA", dec = ".", col.names = TRUE, row.names = FALSE)
>
>


Probably the best thing to do is get your data out of those .asc files and
into something more sensible, like raster's native .grd format. Please let
us know the dimensions of your raster; the print-out of


Grids


would suffice. Otherwise, try this:


files <- list.files("C:/GIS/worldclim/biolcimatic_variables_ASCII",
                    pattern='asc', full.names=TRUE)
Grids0 <- raster::stack(files)
## best if you can put it on a different physical disk
Grids <- writeRaster(Grids0, "native.grd")
## Grids <- writeRaster(Grids0, "D:/some/where/native.grd")


## then, proceed as you were


background <- read.csv("C:/GIS/species_background/bg.csv",header=TRUE)
LonLatData2 <- background[,c(2,3)]
var_at_background <- raster::extract(Grids,LonLatData2) # I'm here since 10 days



I included the option of writing to a different physical disk; you should
really do that if you can. If you read and write on the same disk, one
process has to wait for the other. Also, if your data just fit in
memory, that would be fastest of all.
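

A minimal sketch of the in-memory route, for illustration only. It assumes
that raster's own canProcessInMemory() check is a fair guide on your machine,
and it uses a small synthetic three-layer stack and made-up points as
stand-ins for the real 19 layers and the background file.

```r
library(raster)

## Synthetic stand-in for the real stack (an assumption for illustration)
r <- raster(nrows = 50, ncols = 100, xmn = -180, xmx = 180, ymn = -90, ymx = 90)
Grids <- stack(lapply(1:3, function(i) setValues(r, runif(ncell(r)))))

## Read the values into RAM only if they are not there already
## and raster estimates that they will fit
if (!inMemory(Grids) && canProcessInMemory(Grids, n = nlayers(Grids))) {
  Grids <- readAll(Grids)
}

## Points as a two-column matrix: longitude, latitude (hypothetical values)
pts <- cbind(lon = c(-120, 0, 45), lat = c(-30, 10, 60))

## One row per point, one column per layer
vals <- extract(Grids, pts)
dim(vals)
```

With a file-backed stack such as the one returned by writeRaster(), the
readAll() call is the step that moves the cell values into memory.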


ASC is possibly the worst format to use for data like these: it's text,
it's bloated, it has insufficient metadata, it can't be tiled or compressed
internally, and really there's no excuse for it these days.
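

One way to see the bloat for yourself is to write the same grid in both
formats and compare file sizes. This is only a sketch on a synthetic raster
(not the WorldClim data); the file names are made up and everything is
written to a temporary directory.

```r
library(raster)

## Synthetic single-layer raster (an assumption for illustration)
r <- raster(nrows = 200, ncols = 400, xmn = -180, xmx = 180, ymn = -90, ymx = 90)
r <- setValues(r, runif(ncell(r)))

asc_file <- file.path(tempdir(), "demo.asc")
grd_file <- file.path(tempdir(), "demo.grd")

## Same values, once as ESRI ASCII text and once in raster's native format
writeRaster(r, asc_file, format = "ascii", overwrite = TRUE)
writeRaster(r, grd_file, format = "raster", overwrite = TRUE)

## The native format keeps the header in .grd and the binary values in .gri
gri_file <- sub("\\.grd$", ".gri", grd_file)
sizes <- c(ascii = file.size(asc_file), native = file.size(gri_file))
sizes
```

The size on disk is only part of the cost: the text file also has to be
parsed into numbers every time it is read.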


Neither of these applies to your situation, but if you have to do a lot of
this kind of stuff, note that extract() on a single-layer brick can be much
faster than on a RasterLayer (I don't know why yet). Also, extract() is
not suited to internally tiled rasters (common with GeoTIFF), since it scans
line by line, which is inefficient when the file is tiled.
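

If you want to check the layer-versus-brick difference on your own files,
a timing comparison along these lines may help. It is a sketch with a small
synthetic, file-backed raster and random points, so the timings it prints
say nothing about real data; substitute your own file and coordinates.

```r
library(raster)

## Synthetic raster written to a temporary file (an assumption for illustration)
r <- raster(nrows = 200, ncols = 400, xmn = -180, xmx = 180, ymn = -90, ymx = 90)
r <- setValues(r, runif(ncell(r)))
f <- file.path(tempdir(), "layer.grd")

lyr <- writeRaster(r, f, overwrite = TRUE)  ## file-backed RasterLayer
brk <- brick(f)                             ## same file opened as a one-layer brick

## Random query points: longitude, latitude
set.seed(1)
pts <- cbind(runif(1000, -180, 180), runif(1000, -90, 90))

t_layer <- system.time(v_layer <- extract(lyr, pts))
t_brick <- system.time(v_brick <- extract(brk, pts))

## The extracted values should agree; only the timings may differ
all.equal(as.vector(v_layer), as.vector(v_brick))
rbind(layer = t_layer, brick = t_brick)
```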


Cheers, Mike.





> I started to use R relatively recently so excuse me if there is something
> that I missed here but I was searching for an answer without any success.
> Best regards,
> Kamil
>
> Wrocław University of Environmental and Life Sciences, Poland
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

-- 
Dr. Michael Sumner
Software and Database Engineer
Australian Antarctic Division
203 Channel Highway
Kingston Tasmania 7050 Australia
