[R-sig-Geo] issue in using projectRaster in raster library

Jonathan Greenberg jgrn at illinois.edu
Wed Apr 9 00:42:39 CEST 2014


Ping:

Re: raster:::projectRaster I'm not sure how much you can speed that up
(Robert?) -- I think it is native R code which is generally a lot
slower than compiled code like the GDAL utilities.  It has to use
"chunking", but for reasonable chunk sizes this is not likely to cause
any noticeable overhead.

Re: gdalUtils vs. cygwin -- you got basically the same execution time
-- 21.15 vs. 20.7 seconds is not worth worrying about (~ a 2% increase
using gdal from R)-- gdalUtils has the (small) overhead from R, but it
is actually using your installed gdalwarp in cygwin anyway, it is just
an R wrapper for it if you are more comfortable using R functions vs.
command line.  The first time you launch gdalwarp in an R session it
takes a few seconds to find your local installation of GDAL -- the
second time it runs, there should be almost no overhead.

Re: parallelization -- if you have GDAL 1.10.1 or later (perhaps a
slightly earlier version, not sure when it was implemented), you can
use -multi -wo NUM_THREADS=ALL_CPUS to parallelize the operation -- in
gdalUtils, this would be:

gdalwarp(...,multi=TRUE,wo="NUM_THREADS=ALL_CPUS")

# See: http://lists.osgeo.org/pipermail/gdal-dev/2012-June/033084.html

If you are trying to warp a LOT of images, you can also use e.g.
foreach() to warp each image using a different thread.  You'd have to
experiment with which one (loop through each image one at a time,
parallel processing each individual image vs. parallel processing
multiple images at the same time) is more efficient.

--j




On Tue, Apr 8, 2014 at 4:50 PM, ping yang <pingyang.whu at gmail.com> wrote:
> Hi Jonathan,
>
> I am following up with the previous thread, Now I successfully installed the
> gdalUtils from the source and ran successfully the task (both projection and
> slope calculation) using the following script:
>
> rasterOptions(chunksize = 3e+07, maxmemory = 1e+08) #suggestions according
> to Jonanthan, it made the raster::projectRaster work but for much longer
> time such as 40 minutes!!!
>
> #*----Function for detect the UTM Zone-----#
> long2UTM <- function(long) {
>   (floor((long + 180)/6) %% 60) + 1
> }
>
>   filename <- "grdn41w083_13"
>   longitude <- paste("-",substr(filename,9,10),sep="") #this could be change
> according to the directory string
>   lon <- as.numeric(longitude)
>   utm <-paste("+proj=utm +zone=",long2UTM(lon)," +datum=NAD83 +ellps=GRS80
> +towgs84=0,0,0 +units=m +no_defs",sep="")
>   outputfile <-
> paste("projected/",substr(filename,4,10),"_raster.tif",sep='')
>   gdalwarp(srcfile=filename, dstfile=outputfile,
> t_srs=utm,output_Raster=TRUE,overwrite=TRUE,multi=TRUE,
> tr=c(10,10),r="bilinear", verbose=TRUE)
>   slp <- gdaldem(mode="slope",input_dem=outputfile,
> output="n41w083_rgdalslp.tif",output_Raster=TRUE,verbose=TRUE)
>
> Now it took me 21.15 seconds to finish the projection(while using the
> gdalWarp in cygwin took about 20.7 seconds)
>
> I am wondering if there is way to speed up this process, How to parallel
> this process(parallelized warping)?
>
> Big Thanks,
>
> Ping
>
>
>
> On Mon, Apr 7, 2014 at 11:37 PM, Jonathan Greenberg <jgrn at illinois.edu>
> wrote:
>>
>> For gdalUtils, you don't need to upgrade to 3.2 -- the latest CRAN
>> version only requires R 2.14 or later.  It should find your cygwin
>> version, but let me know if it doesn't.
>>
>> The key with the rasterOptions you used is that, previously, you were
>> forcing the ENTIRE RASTER to load into memory with the max memory and
>> chunk size you used -- this is really not a good idea for 99.9% of
>> raster calculations.  I would try two things: 1) reduce the max memory
>> back to its default settings (1e+08 cells on my machine -- about 750mb
>> of RAM usage if each cell is 64 bits), and 2) adjust the chunk size to
>> maybe 250mb (3e+07 cells)  If you leave your settings like that,
>> particularly on Windows, you'll end up with out-of-memory crashes
>> fairly often (since I'm guessing you don't have 8TB of RAM on a
>> Windows box).
>>
>> --j
>>
>>
>>
>> On Mon, Apr 7, 2014 at 4:47 PM, ping yang <pingyang.whu at gmail.com> wrote:
>> > Hi Alex and Jonathan,
>> >
>> > The reason I used the rasterOptions is because when previously I did the
>> > reproject for 30 meter DEM, it worked after I raised the
>> > chunksize(otherwise
>> > it ran out of memory), so I added here just to avoid that issue(probably
>> > it
>> > not work  for this case)
>> >
>> > I am going to install gdalUtils package, however, I need to upgrade my R
>> > from 3.0.1 to 3.2 (I am afraid to do it since some other work is
>> > running).
>> >
>> > However, I have an update for this, i used the gdalwarp in my
>> > Cygwin(actually installed by the QGIS) and run it successfully:
>> > gdalwarp -t_srs '+proj=utm +zone=17 +datum=NAD83 +units=m +no_defs
>> > +ellps=GRS80 +towgs84=0,0,0' -r bilinear -multi -tr 10 10 -of GTiff
>> > grdn41w083_13/hdr.adf n41w083.tif
>> >
>> > it took 11 second to finish.
>> >
>> > I am still looking for the slop calculation based on the projected tif
>> > file,
>> > I will appreciated if anyone could give me some suggestions if have this
>> > experience.
>> >
>> > Thanks a lot.
>> >
>> > Ping
>> >
>> >
>> > On Mon, Apr 7, 2014 at 3:42 PM, Jonathan Greenberg <jgrn at illinois.edu>
>> > wrote:
>> >>
>> >> Ping:
>> >>
>> >> Re: maxmemory -- you really shouldn't adjust that, definitely not to
>> >> what you set it to -- you've set it to (by my quick calculations) to
>> >> be around 7500 GB of needed memory (this is assuming each cell takes
>> >> 64 bits of space) -- this is probably why you are crashing.  Chunksize
>> >> also is usually set correctly, raster processes generally have a
>> >> quickly diminishing returns as you increase the chunk size.
>> >>
>> >> You might want to try gdalwarp in my gdalUtils package -- it will be
>> >> faster to reproject, and you can even enable parallelized warping.
>> >>
>> >> install.packages("gdalUtils",
>> >> repos="http://R-Forge.R-project.org",type="source") # To get the
>> >> latest, otherwise just use CRAN
>> >>
>> >> --j
>> >>
>> >> On Mon, Apr 7, 2014 at 3:30 PM, Alex Zvoleff
>> >> <azvoleff at conservation.org>
>> >> wrote:
>> >> > On Mon, Apr 7, 2014 at 4:05 PM, ping yang <pingyang.whu at gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Dear r-sig-geo,
>> >> >>
>> >> >> I plan to reproject from lonlat to UTM for a bunch of 10 meter NED
>> >> >> files
>> >> >> (large) using a R script(I did it successfully for the 30 meter
>> >> >> DEM), I
>> >> >> used the following code to run:
>> >> >>
>> >> >> rasterOptions(chunksize = 1e+10, maxmemory = 1e+12) #need to allow
>> >> >> bigger
>> >> >> chunksize for the processing!!!
>> >> >>
>> >> >
>> >> > Why do you find that you "need to allow bigger chunksize"? I rarely
>> >> > need
>> >> > to
>> >> > adjust the chunksize, even for large images.
>> >> >
>> >> >
>> >> >>  #*----Function for detect the UTM Zone-----#
>> >> >> long2UTM <- function(long) {
>> >> >>   (floor((long + 180)/6) %% 60) + 1
>> >> >> }
>> >> >>
>> >> >>   require(raster)
>> >> >>   require(rgdal)
>> >> >>   filename <- "grdn41w083_13"
>> >> >>   longitude <- paste("-",substr(filename,9,10),sep="")
>> >> >>   lon <- as.numeric(longitude)
>> >> >>   utm <-paste("+proj=utm +zone=",long2UTM(lon)," +datum=NAD83
>> >> >> +ellps=GRS80
>> >> >> +towgs84=0,0,0 +units=m +no_defs",sep="")
>> >> >>   r  <- raster(filename)
>> >> >>   outputfile <-
>> >> >> paste("projected/",substr(filename,4,10),".tif",sep='')
>> >> >>   print(paste(outputfile))
>> >> >>   pr <- projectRaster(r, crs=utm, res=10.0, method='bilinear')
>> >> >>
>> >> >> However, the projectRaster takes forever to finish and it almost
>> >> >> crashed my
>> >> >> computer.
>> >> >>
>> >> >> The file is kind of big  (file size is 455MB),The file information
>> >> >> is
>> >> >> class       : RasterLayer
>> >> >> dimensions  : 10812, 10812, 116899344  (nrow, ncol, ncell)
>> >> >> resolution  : 9.259259e-05, 9.259259e-05  (x, y)
>> >> >> extent      : -83.00056, -81.99944, 39.99944, 41.00056  (xmin, xmax,
>> >> >> ymin,
>> >> >> ymax)
>> >> >> coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0
>> >> >> +no_defs
>> >> >> data source : c:\temporary\10M\grdn41w083_13
>> >> >> names       : grdn41w083_13
>> >> >> values      : 212.254, 460.0799  (min, max)
>> >> >>
>> >> >> My R information:
>> >> >> sessionInfo()
>> >> >> R version 3.0.1 (2013-05-16)
>> >> >> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> >> >> attached base packages:
>> >> >> [1] stats     graphics  grDevices utils     datasets  methods   base
>> >> >> other attached packages:
>> >> >> [1] rgdal_0.8-10  raster_2.1-49 sp_1.0-11
>> >> >> loaded via a namespace (and not attached):
>> >> >> [1] grid_3.0.1      lattice_0.20-27 tools_3.0.1
>> >> >>
>> >> >> However, this process only took 10 seconds using the existing tool
>> >> >> in
>> >> >> ArcGIS toolbox. Has someone experienced this before? Any comments
>> >> >> will
>> >> >> be
>> >> >> welcome.
>> >> >>
>> >> >
>> >> > Are you using "Project" or "Define Projection" in ArcGIS? "Define
>> >> > projection" just overwrites the existing projection rather than
>> >> > reprojecting the data, which is much more calculation intensive.
>> >> >
>> >> >
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> Ping
>> >> >>
>> >> >>         [[alternative HTML version deleted]]
>> >> >>
>> >> >> _______________________________________________
>> >> >> R-sig-Geo mailing list
>> >> >> R-sig-Geo at r-project.org
>> >> >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> >> >>
>> >> >
>> >> >
>> >> > --
>> >> > Alex Zvoleff
>> >> > Postdoctoral Associate
>> >> > Tropical Ecology Assessment and Monitoring (TEAM) Network
>> >> > Conservation International
>> >> > 2011 Crystal Dr. Suite 500, Arlington, Virginia 22202, USA
>> >> > Tel: +1-703-341-2749, Fax: +1-703-979-0953, Skype: azvoleff
>> >> > http://www.teamnetwork.org | http://www.conservation.org
>> >> >
>> >> >
>> >> > On Mon, Apr 7, 2014 at 4:05 PM, ping yang <pingyang.whu at gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Dear r-sig-geo,
>> >> >>
>> >> >> I plan to reproject from lonlat to UTM for a bunch of 10 meter NED
>> >> >> files
>> >> >> (large) using a R script(I did it successfully for the 30 meter
>> >> >> DEM), I
>> >> >> used the following code to run:
>> >> >>
>> >> >> rasterOptions(chunksize = 1e+10, maxmemory = 1e+12) #need to allow
>> >> >> bigger
>> >> >> chunksize for the processing!!!
>> >> >>  #*----Function for detect the UTM Zone-----#
>> >> >> long2UTM <- function(long) {
>> >> >>   (floor((long + 180)/6) %% 60) + 1
>> >> >> }
>> >> >>
>> >> >>   require(raster)
>> >> >>   require(rgdal)
>> >> >>   filename <- "grdn41w083_13"
>> >> >>   longitude <- paste("-",substr(filename,9,10),sep="")
>> >> >>   lon <- as.numeric(longitude)
>> >> >>   utm <-paste("+proj=utm +zone=",long2UTM(lon)," +datum=NAD83
>> >> >> +ellps=GRS80
>> >> >> +towgs84=0,0,0 +units=m +no_defs",sep="")
>> >> >>   r  <- raster(filename)
>> >> >>   outputfile <-
>> >> >> paste("projected/",substr(filename,4,10),".tif",sep='')
>> >> >>   print(paste(outputfile))
>> >> >>   pr <- projectRaster(r, crs=utm, res=10.0, method='bilinear')
>> >> >>
>> >> >> However, the projectRaster takes forever to finish and it almost
>> >> >> crashed my
>> >> >> computer.
>> >> >>
>> >> >> The file is kind of big  (file size is 455MB),The file information
>> >> >> is
>> >> >> class       : RasterLayer
>> >> >> dimensions  : 10812, 10812, 116899344  (nrow, ncol, ncell)
>> >> >> resolution  : 9.259259e-05, 9.259259e-05  (x, y)
>> >> >> extent      : -83.00056, -81.99944, 39.99944, 41.00056  (xmin, xmax,
>> >> >> ymin,
>> >> >> ymax)
>> >> >> coord. ref. : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0
>> >> >> +no_defs
>> >> >> data source : c:\temporary\10M\grdn41w083_13
>> >> >> names       : grdn41w083_13
>> >> >> values      : 212.254, 460.0799  (min, max)
>> >> >>
>> >> >> My R information:
>> >> >> sessionInfo()
>> >> >> R version 3.0.1 (2013-05-16)
>> >> >> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> >> >> attached base packages:
>> >> >> [1] stats     graphics  grDevices utils     datasets  methods   base
>> >> >> other attached packages:
>> >> >> [1] rgdal_0.8-10  raster_2.1-49 sp_1.0-11
>> >> >> loaded via a namespace (and not attached):
>> >> >> [1] grid_3.0.1      lattice_0.20-27 tools_3.0.1
>> >> >>
>> >> >> However, this process only took 10 seconds using the existing tool
>> >> >> in
>> >> >> ArcGIS toolbox. Has someone experienced this before? Any comments
>> >> >> will
>> >> >> be
>> >> >> welcome.
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> Ping
>> >> >>
>> >> >>         [[alternative HTML version deleted]]
>> >> >>
>> >> >> _______________________________________________
>> >> >> R-sig-Geo mailing list
>> >> >> R-sig-Geo at r-project.org
>> >> >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Alex Zvoleff
>> >> > Postdoctoral Associate
>> >> > Tropical Ecology Assessment and Monitoring (TEAM) Network
>> >> > Conservation International
>> >> > 2011 Crystal Dr. Suite 500, Arlington, Virginia 22202, USA
>> >> > Tel: +1-703-341-2749, Fax: +1-703-979-0953, Skype: azvoleff
>> >> > http://www.teamnetwork.org | http://www.conservation.org
>> >> >
>> >> >         [[alternative HTML version deleted]]
>> >> >
>> >> > _______________________________________________
>> >> > R-sig-Geo mailing list
>> >> > R-sig-Geo at r-project.org
>> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> >>
>> >>
>> >>
>> >> --
>> >> Jonathan A. Greenberg, PhD
>> >> Assistant Professor
>> >> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
>> >> Department of Geography and Geographic Information Science
>> >> University of Illinois at Urbana-Champaign
>> >> 259 Computing Applications Building, MC-150
>> >> 605 East Springfield Avenue
>> >> Champaign, IL  61820-6371
>> >> Phone: 217-300-1924
>> >> http://www.geog.illinois.edu/~jgrn/
>> >> AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007
>> >
>> >
>>
>>
>>
>> --
>> Jonathan A. Greenberg, PhD
>> Assistant Professor
>> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
>> Department of Geography and Geographic Information Science
>> University of Illinois at Urbana-Champaign
>> 259 Computing Applications Building, MC-150
>> 605 East Springfield Avenue
>> Champaign, IL  61820-6371
>> Phone: 217-300-1924
>> http://www.geog.illinois.edu/~jgrn/
>> AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007
>
>



-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
259 Computing Applications Building, MC-150
605 East Springfield Avenue
Champaign, IL  61820-6371
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007



More information about the R-sig-Geo mailing list