[R-sig-Geo] crop or mask large raster on import

Roger Bivand Roger.Bivand at nhh.no
Wed Nov 25 10:12:56 CET 2020


On Tue, 24 Nov 2020, Dexter Locke wrote:

> Dear list,
>
> Is there a way to reduce the spatial extent of an impractically-large
> raster (~50 GB) on import? Can a crop or mask be applied as the large raster
> is read into R to create a more manageable subset?

rgdal::GDAL.open() simply opens the data source without reading it, 
returning an object that holds a handle to the open file; close it with 
rgdal::GDAL.close(). rgdal::getRasterData() has a complete set of 
arguments for the kinds of things you mention, see: 
https://rgdal.r-forge.r-project.org/reference/GDALRasterBand-class.html
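
A minimal sketch of that low-level route (the file name and the window 
size are invented here):

ds <- rgdal::GDAL.open("big_categorical.tif")   # open, nothing read yet
vals <- rgdal::getRasterData(ds, band = 1,
  offset = c(0, 0),               # rows, columns in from the top-left corner
  region.dim = c(2000, 2000))     # rows, columns to read
rgdal::GDAL.close(ds)             # vals is an in-memory array of the window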

The same arguments are used by rgdal::readGDAL(): offset, region.dim, 
output.dim and band. offset= and region.dim= permit rectangular cropping, 
and output.dim= can decimate/resample/aggregate the input cells to coarser 
output cells. These are the same mechanisms that the GDAL utility programs 
use, and raster uses them internally too.

The example in the rgdal::readGDAL() help page includes:

SP27GTIF <- readGDAL(system.file("pictures/SP27GTIF.TIF",
  package = "rgdal")[1], output.dim=c(100,100))

doing resampling on-the-fly. Lower down,

fn <- system.file("pictures/erdas_spnad83.tif", package = "rgdal")[1]
erdas_spnad83 <- readGDAL(fn, offset=c(50, 100), region.dim=c(400, 400),
  output.dim=c(100,100))

uses all of offset=, region.dim= and output.dim=. The example files are 
tiny, but the same GDAL code applies to much larger files. Note that the 
axis ordering of the argument vectors is often counter-intuitive (rows 
before columns, counted from the top-left corner) and needs some patience.
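
If the crop window is known in map coordinates rather than cell indices, 
the offsets can be derived from the grid metadata returned by 
rgdal::GDALinfo(). A rough sketch, assuming a north-up raster without 
rotation; the window coordinates are invented, not matched to the example 
file:

info <- GDALinfo(fn)
xmin <- 681000; xmax <- 683000        # hypothetical window, map units
ymin <- 4825000; ymax <- 4827000
top <- info[["ll.y"]] + info[["rows"]] * info[["res.y"]]   # y of the top edge
off <- c(floor((top - ymax) / info[["res.y"]]),            # rows down from the top
  floor((xmin - info[["ll.x"]]) / info[["res.x"]]))        # columns in from the left
rd <- c(ceiling((ymax - ymin) / info[["res.y"]]),          # rows to read
  ceiling((xmax - xmin) / info[["res.x"]]))                # columns to read
win <- readGDAL(fn, offset = off, region.dim = rd)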

The GDAL utilities can be run file-to-file using the gdalUtils package, 
in addition to being run from the command line.
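
For example, a file-to-file crop might look roughly like this (the paths 
and the projwin corners are invented, and a working local GDAL 
installation is assumed):

library(gdalUtils)
gdal_translate("big_categorical.tif", "crop.tif",
  projwin = c(681000, 4827000, 683000, 4825000))  # ulx, uly, lrx, lry in map units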

Otherwise, see the stars package and its proxy=TRUE read functionality, 
which leaves the data on disk until it is actually needed, possibly in 
subsetted and resampled form.
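
A sketch of that route, with invented file names for the raster and a 
polygon layer (assumed to share the raster's CRS):

library(stars)
library(sf)
r <- read_stars("big_categorical.tif", proxy = TRUE)  # nothing read yet
zones <- st_read("zones.gpkg")                        # hypothetical polygons
win <- st_crop(r, st_bbox(zones))                     # still a proxy
win <- st_as_stars(win)       # only now are the cropped cells read into memory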

Hope this helps,

Roger

>
> I've looked at the raster and rgdal help files, but have not found this
> functionality. This would be akin to sf::st_read()'s new "query" argument,
> which takes SQL statements for filtering records while reading data
> into R.
>
> The ultimate use case is to summarize a large categorical raster over many
> thousands of polygons, and I thought reading in small subsets at a time
> could be looped over or parallelized to summarize smaller pieces of the
> data. Other approaches to zonal functions (like equivalents of ArcMap's
> Tabulate Area) could also work.
>
> Thanks for your consideration, Dexter
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en


