[R-sig-Geo] How to speed up "extract" function in raster package?
Kamil Konowalik
konno_kazuma at mailplus.pl
Wed Jan 20 15:42:22 CET 2016
Hi,
thanks Mike and Steve. My rasters cover the whole of Europe, and each of the 19 ASCII files is about 200 MB (ncols 8710, nrows 4754). However, when I went to abort those very long calculations this morning and try a different approach, I noticed that they had apparently finished during the night, after 13 days... So for now I will proceed with my data, and when I have to run this procedure again (and I will have to) I will definitely use your suggestions.
Best wishes,
Kamil
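For the grid dimensions quoted above (ncols 8710, nrows 4754, 19 layers), a rough back-of-the-envelope check suggests why the whole stack would not fit into 3 GB of RAM while a single layer might. This is only a sketch: it assumes every cell is held as an 8-byte double once loaded into R, and it ignores any working copies R makes along the way.

```r
# Grid dimensions as reported in the message
ncols    <- 8710
nrows    <- 4754
n_layers <- 19

# Assumption: each cell is stored as an 8-byte double once in memory
bytes_per_cell <- 8

cells     <- ncols * nrows                      # cells per layer
layer_mib <- cells * bytes_per_cell / 1024^2    # one layer, in MiB
stack_gib <- layer_mib * n_layers / 1024        # all layers, in GiB

cat(sprintf("cells per layer: %d\n", cells))
cat(sprintf("one layer:  about %.0f MiB\n", layer_mib))
cat(sprintf("all layers: about %.1f GiB\n", stack_gib))
```

Under these assumptions one layer needs roughly 316 MiB and all 19 layers close to 6 GiB, so on this machine only a layer-at-a-time approach could plausibly run in memory.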
On 18 January 2016 at 21:59, Michael Sumner <mdsumner at gmail.com> wrote:
On Mon, 18 Jan 2016 at 22:03 Kamil Konowalik <konno_kazuma at mailplus.pl> wrote:
Dear list members,
I'm trying to extract values for ca. 1000 points from 19 raster files. It is a very simple task: as output I need a table where each point has an additional 19 columns with values derived from those raster files (specifically, I'm creating an SWD input file for Maxent). I used the extract function, but the whole task is running very slowly; so far the process has taken 10 days and it is not finished yet. My computer is rather slow (Windows 7 32-bit, Intel Core Duo 2.2 GHz, 3 GB RAM), but currently I need to use this machine. I was wondering whether there is a way to speed up the whole process by using a different command, or whether there is some trick that can speed it up.
Here's the code I'm using:
files <- list.files("C:/GIS/worldclim/biolcimatic_variables_ASCII",pattern='asc',full.names=TRUE)
Grids <- raster::stack(files)
background <- read.csv("C:/GIS/species_background/bg.csv",header=TRUE)
LonLatData2 <- background[,c(2,3)]
var_at_background <- raster::extract(Grids,LonLatData2) # this step has been running for 10 days
outfile2 <- as.data.frame(cbind("species",LonLatData2,var_at_background))
colnames(outfile2) <- c("species","longitude","latitude",colnames(var_at_background))
# write.csv() sets sep, dec and col.names itself and ignores append (with a warning)
write.csv(outfile2, file="variables_background.csv", na = "NA", row.names = FALSE)
Probably the best thing to do is get your data out of those .asc files and into something more sensible, like raster's native .grd format. Please let us know the dimensions of your raster; the print-out of Grids would suffice. Otherwise, try this:

files <- list.files("C:/GIS/worldclim/biolcimatic_variables_ASCII",pattern='asc',full.names=TRUE)
Grids0 <- raster::stack(files)
Grids <- writeRaster(Grids0, "native.grd")
## best if you can put it on a different physical disk
## Grids <- writeRaster(Grids0, "D:/some/where/native.grd")
## then, proceed as you were
background <- read.csv("C:/GIS/species_background/bg.csv",header=TRUE)
LonLatData2 <- background[,c(2,3)]
var_at_background <- raster::extract(Grids,LonLatData2) # this is the slow step

I include an option to write out to a different physical disk; you should really do that if you can. If you read and write to the same disk, one process has to wait for the other. Also, if your data can just fit in memory, that would be the fastest of all.

ASC is possibly the worst format to use for data like these: it's text, it's bloated, has insufficient metadata, can't be tiled or compressed internally, and really there's no excuse these days.

Neither of these applies to your situation, but if you have to do a lot of this kind of stuff, note that extract on a single-layer Brick can be much faster than on a RasterLayer (I don't know why yet), and extract() is also not suited to internally tiled rasters (common to GeoTIFF), since it scans line by line, which is inefficient when the thing is tiled.

Cheers, Mike.

I started to use R relatively recently, so excuse me if there is something that I missed here, but I was searching for an answer without any success.
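The two suggestions in the reply (convert once to the native format, and work from memory where the data fit) could be combined roughly as follows. This is a hedged sketch, not the method actually used in the thread: the tiny synthetic grids, the `demo_dir` folder, the `bio01`-style file names and the three example points are all hypothetical stand-ins, and reading one layer at a time with `readAll()` is an assumption about how to stay within limited RAM.

```r
library(raster)

# Hypothetical stand-in data: three small .asc grids in a temporary folder
# (the real case has 19 files of 8710 x 4754 cells).
demo_dir <- file.path(tempdir(), "asc_demo")
dir.create(demo_dir, showWarnings = FALSE)
set.seed(1)
for (i in 1:3) {
  r <- raster(nrows = 20, ncols = 30, xmn = -10, xmx = 20, ymn = 35, ymx = 55)
  values(r) <- runif(ncell(r))
  writeRaster(r, file.path(demo_dir, sprintf("bio%02d.asc", i)),
              format = "ascii", overwrite = TRUE)
}

files <- list.files(demo_dir, pattern = "\\.asc$", full.names = TRUE)

# Hypothetical background points (longitude, latitude) inside the grid extent
pts <- data.frame(longitude = c(0, 5.5, 12), latitude = c(40, 48.2, 52))

# One-off conversion from ASCII to raster's native .grd format
Grids <- writeRaster(stack(files), file.path(demo_dir, "native.grd"),
                     overwrite = TRUE)

# Read one layer at a time into memory, then extract from the in-memory copy
vals <- sapply(seq_len(nlayers(Grids)), function(i) {
  lyr <- readAll(raster(Grids, layer = i))
  extract(lyr, pts)
})
colnames(vals) <- names(Grids)

# Assemble an SWD-style table: species label, coordinates, one column per layer
outfile <- data.frame(species = "species", pts, vals)
print(outfile)
```

With the real data, the conversion only needs to run once; later sessions can open the .grd file directly with `brick()` and skip the ASCII parsing entirely.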
Best regards,
Kamil
Wrocław University of Environmental and Life Sciences, Poland
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
--
Dr. Michael Sumner
Software and Database Engineer
Australian Antarctic Division
203 Channel Highway
Kingston Tasmania 7050 Australia