[R-sig-Geo] rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Thorsten Behrens thor@ten@m@behren@ @end|ng |rom gm@||@com
Tue Apr 28 13:50:01 CEST 2020


Mike,

I'll look into terra as well. Overlooked it... Thanks!

My current code uses the raster package to load the datasets, hence I 
would be happy if I do not have to change my code  :-)

I am trying to implement some multi-scale mapping in R/Rcpp (*Behrens, 
T.*, Schmidt, K., MacMillan, R.A., Viscarra Rossel, R. (2018). 
Multi-scale Digital Soil Mapping with deep learning. Scientific Reports 
8,15244.). Reading and writing it piecewise does not really work for 
this (especially for the coarse scales) and would complicate things a lot.

Cheers,

Thorsten


Am 28.04.2020 um 12:12 schrieb Michael Sumner:
> I reckon that's unlikely, it's also unlikely to scale or be fast 
> enough ever  given what rgdal is - raster will happily crop sections 
> of it out but I guess your workflow is to write the geotiff out in one 
> go? You want rgdal because you want raster?  Have you tried terra? (I 
> forgot to mention that one, and I haven't tried on that size GeoTIFF 
> with it. It has its own internal connection to GDAL).
>
> Possibly you can instantiate a GeoTIFF with rgdal of the right size, 
> then write to it piecewise - but if the convenience of raster is what 
> you're after that's obviously no good. I think I'm missing some 
> workflow constraint or motivation you have, but I'm interested (it's 
> an awkward space).
>
> Cheers, Mike
>
> On Tue, Apr 28, 2020 at 7:14 PM Thorsten Behrens 
> <thorsten.m.behrens using gmail.com <mailto:thorsten.m.behrens using gmail.com>> 
> wrote:
>
>     Michael,
>
>     Thanks for the hint, it seems to work! Real-world tests will
>     follow in
>     the next few days...
>
>     So it definitely seems to be a problem of rgdal. It would be great
>     if it
>     could still be solved.
>
>     Best,
>
>     Thorsten
>
>
>
>     Am 27.04.2020 um 15:58 schrieb Michael Sumner:
>     > Try stars it worked for me on a test
>     >
>     > On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
>     > <thorsten.m.behrens using gmail.com
>     <mailto:thorsten.m.behrens using gmail.com>
>     <mailto:thorsten.m.behrens using gmail.com
>     <mailto:thorsten.m.behrens using gmail.com>>>
>     > wrote:
>     >
>     >     Roger,
>     >
>     >     thanks a lot for your reply!
>     >
>     >     I have 256GB RAM installed (mentioned it somewhere). And there,
>     >     all is
>     >     fine when I run:
>     >
>     >     rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>     extent(c(0, 72000,
>     >
>     >     values(rDemTest) = 1
>     >
>     >     When limiting the memory to about 8GB with
>     >     ulimit::memory_limit(8000),
>     >     the max size which can be allocated seems to be around 10000 x
>     >     10000px.
>     >     In this case all tests run fine. Unfortunately it seems to be
>     >     related to
>     >     the size of the grid (48000 x 72000) and therefore the problem
>     >     can't be
>     >     reproduced on machines with 8GB RAM. For some processing
>     steps I need
>     >     grids of that size in the memory, which is why I have 256 GB
>     >     installed.
>     >
>     >     Normally, I use the raster package and not
>     rgdal::readGDAL(). This
>     >     was
>     >     just a desperate attempt to find the source of the problem.
>     >
>     >     This is what I use in my code:
>     >
>     >     rDem = raster(sFileNameTiff)
>     >     mDem = raster::as.matrix(rDem)
>     >
>     >     But maybe this is the same...
>     >
>     >     Any further suggestions are much appreciated!
>     >
>     >     Thanks again!
>     >
>     >     Best,
>     >
>     >     Thorsten
>     >
>     >
>     >
>     >
>     >     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>     >     > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>     >     >
>     >     >> Dear all,
>     >     >>
>     >     >> my problem is that I want to read a big geotiff raster
>     dataset
>     >     into R
>     >     >> and convert it to a matrix, which does not work.
>     >     >> The file is big but there is sufficient memory. I need
>     all the
>     >     data
>     >     >> in the memory at the same time.
>     >     >>
>     >     >> The error occurs under R 3.6.3 as well as 4.0.0 using
>     Ubuntu 20.04
>     >     >> LTS with the latest version of the packages (see session info
>     >     below)
>     >     >> and 256GB RAM installed.
>     >     >>
>     >     >> When loading the raster dataset using rgdal (via readGDAL or
>     >     >> raster::readAll) I get the follwoing error in R 4.0.0:
>     >     >>
>     >     >> ```
>     >     >> Error in rgdal::getRasterData(con, offset = offs,
>     region.dim =
>     >     reg,
>     >     >> band = object using data@band) :
>     >     >>   long vectors not supported yet: memory.c:3782
>     >     >> ```
>     >     >
>     >     > On a 16GB Fedora platform:
>     >     >
>     >     >> library(raster) # 3.1-5
>     >     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>     extent(c(0,
>     >     72000,
>     >     > 0,
>     >     > + 48000))) # all fine
>     >     >> rDemTest
>     >     > class      : RasterLayer
>     >     > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
>     >     > resolution : 1, 1  (x, y)
>     >     > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
>     >     > crs        : NA
>     >     >
>     >     >> values(rDemTest) = 1
>     >     > Error: cannot allocate vector of size 25.7 Gb
>     >     >
>     >     > So you are deceiving yourself into thinking that all is
>     fine at
>     >     this
>     >     > point. Please try to instantiate an example that can be
>     >     reproduced on
>     >     > a machine with 8GB RAM.
>     >     >
>     >     > Further note that rgdal::readGDAL() is not how you handle very
>     >     large
>     >     > objects in files, and never has been. raster can handle blocks
>     >     of data
>     >     > from bands in file; stars and gdalcubes can use proxy=TRUE
>     for the
>     >     > same purpose. Why did you choose rgdal::readGDAL() when
>     this is not
>     >     > its purpose?
>     >     >
>     >     > You did not say how much RAM is on your platform.
>     >     >
>     >     > Roger
>     >     >
>     >     >>
>     >     >> In R 3.6.3 is is "... memory.c:3717"
>     >     >>
>     >     >> However, I can load the same file with the tiff package and a
>     >     file of
>     >     >> the same size in the native raster package format (*.grd)
>     with the
>     >     >> raster package but again not with the rgdal package.
>     >     >>
>     >     >> gdalinfo (gdalUtils) does not complain (see below).
>     Hence, Even
>     >     >> Rouault assumes the problem is related to rgdal and not gdal
>     >     >> (https://github.com/OSGeo/gdal/issues/2442).
>     >     >>
>     >     >> Below you find reproducible code, which generates a
>     raster file,
>     >     >> saves the two formats (.tiff and .grd) and tries to read
>     them with
>     >     >> the different packages.
>     >     >>
>     >     >> Is this a known limitation? Any help is greatly appreciated!
>     >     >>
>     >     >> Thanks a lot in advance!
>     >     >>
>     >     >> Best wishes and stay healthy,
>     >     >> Thorsten
>     >     >>
>     >     >>
>     >     >>
>     >     >> ### Steps to reproduce the problem.
>     >     >>
>     >     >> R code:
>     >     >>
>     >     >> ```
>     >     >> library(rgdal) # 1.4-8
>     >     >> library(raster) # 3.1-5
>     >     >> library(tiff) # 0.1-5
>     >     >>
>     >     >> ## generate and manipulate a big raster dataset
>     >     >> # - generate
>     >     >> rDemTest = raster(nrow = 48000, ncol = 72000, ext =
>     extent(c(0,
>     >     >> 72000, 0, 48000))) # all fine
>     >     >>
>     >     >> # - manipulate
>     >     >> values(rDemTest) = 1 # all fine
>     >     >>
>     >     >> # - convert
>     >     >> mDemTest = raster::as.matrix(rDemTest) # all fine
>     >     >> str(mDemTest)
>     >     >>
>     >     >> ## save a big dataset
>     >     >>
>     >     >> # - as raster/gdal
>     >     >> sFileNameTiff = "BigData.tif"
>     >     >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite =
>     TRUE,
>     >     >> NAflag = -9999) # all fine
>     >     >>
>     >     >> # - as raster native
>     >     >> sFileNameNative = "BigData.grd"
>     >     >> writeRaster(rDemTest, sFileNameNative, "raster",
>     overwrite = TRUE,
>     >     >> NAflag = -9999) # all fine
>     >     >>
>     >     >>
>     >     >> ## load the big raster datasets with different packages
>     and options
>     >     >> # - load the tiff data with the gdal package via the
>     raster package
>     >     >> rDem = raster(sFileNameTiff) # all fine
>     >     >> extent(rDem) # all fine
>     >     >> mDem = raster::as.matrix(rDem) # error
>     >     >> rDem = readAll(rDem) # error
>     >     >>
>     >     >> # - load the native raster data with the raster package
>     >     >> rDem = raster(sFileNameNative) # all fine
>     >     >> extent(rDem) # all fine
>     >     >> mDem = raster::as.matrix(rDem) # all fine
>     >     >> str(mDem)
>     >     >>
>     >     >> # - load the tiff data with the tiff package
>     >     >> mDem = readTIFF(sFileNameTiff) # all fine
>     >     >> str(mDem)
>     >     >>
>     >     >> # - load the tiff data with the gdal package
>     >     >> sfDem = readGDAL(sFileNameTiff) # error
>     >     >>
>     >     >> # - load the native raster data with the gdal package
>     >     >> sfDem = readGDAL(sFileNameNative) # error
>     >     >>
>     >     >> ```
>     >     >>
>     >     >>
>     >     >> ### Startup messages when rgdal is attached (requested by
>     Roger
>     >     Bivand)
>     >     >>>  library(rgdal)
>     >     >> rgdal: version: 1.4-8, (SVN revision 845)
>     >     >>  Geospatial Data Abstraction Library extensions to R
>     >     successfully loaded
>     >     >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>     >     >>  Path to GDAL shared files:
>     >     >>  GDAL binary built with GEOS: TRUE
>     >     >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>     >     [PJ_VERSION:
>     >     >> 631]
>     >     >>  Path to PROJ.4 shared files: (autodetected)
>     >     >>  Linking to sp version: 1.4-1
>     >     >>
>     >     >>
>     >     >> ### Session info
>     >     >>>  sessionInfo()
>     >     >> R version 4.0.0 (2020-04-24)
>     >     >> Platform: x86_64-pc-linux-gnu (64-bit)
>     >     >> Running under: Ubuntu 20.04 LTS
>     >     >>
>     >     >> Matrix products: default
>     >     >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>     >     >> LAPACK:
>     /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>     >     >>
>     >     >> locale:
>     >     >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
>     LC_TIME=de_DE.UTF-8
>     >     >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>     >     >> LC_MESSAGES=de_DE.UTF-8
>     >     >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
>     >     >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>     >     >> LC_IDENTIFICATION=C
>     >     >>
>     >     >> attached base packages:
>     >     >> [1] stats     graphics  grDevices utils datasets methods base
>     >     >>
>     >     >> other attached packages:
>     >     >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8 tiff_0.1-5 raster_3.1-5
>     >     >> sp_1.4-1
>     >     >>
>     >     >> loaded via a namespace (and not attached):
>     >     >>  [1] compiler_4.0.0    tools_4.0.0 Rcpp_1.0.4.6
>     >     >> R.methodsS3_1.8.0 codetools_0.2-16
>     >     >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
>     >     >> R.utils_2.9.2     R.oo_1.23.0
>     >     >> [11] lattice_0.20-41
>     >     >>
>     >     >>
>     >     >> ### gdalInfo
>     >     >>>  gdalinfo(sFileNameTiff)
>     >     >>  [1] "Driver: GTiff/GeoTIFF"
>     >     >>  [2] "Files: BigData.tif"
>     >     >>  [3] "Size is 72000, 48000"
>     >     >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
>     >     >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>     >     >>  [6] "Image Structure Metadata:"
>     >     >>  [7] "  COMPRESSION=LZW"
>     >     >>  [8] "  INTERLEAVE=BAND"
>     >     >>  [9] "Corner Coordinates:"
>     >     >> [10] "Upper Left  (       0.000, 48000.000) "
>     >     >> [11] "Lower Left  (   0.0000000, 0.0000000) "
>     >     >> [12] "Upper Right (   72000.000, 48000.000) "
>     >     >> [13] "Lower Right (   72000.000, 0.000) "
>     >     >> [14] "Center      (   36000.000, 24000.000) "
>     >     >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
>     >     >> [16] "  Min=1.000 Max=1.000 "
>     >     >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
>     >     >> [18] "  NoData Value=-9999"
>     >     >> [19] "  Metadata:"
>     >     >> [20] "    STATISTICS_MAXIMUM=1"
>     >     >> [21] "    STATISTICS_MEAN=nan"
>     >     >> [22] "    STATISTICS_MINIMUM=1"
>     >     >> [23] "    STATISTICS_STDDEV=nan"
>     >     >>
>     >     >> _______________________________________________
>     >     >> R-sig-Geo mailing list
>     >     >> R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>     <mailto:R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>>
>     >     >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>     >     >>
>     >     >>
>     >     >
>     >
>     >     _______________________________________________
>     >     R-sig-Geo mailing list
>     > R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>     <mailto:R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>>
>     > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>     >
>
>             [[alternative HTML version deleted]]
>
>     _______________________________________________
>     R-sig-Geo mailing list
>     R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>
>
> -- 
> Michael Sumner
> Software and Database Engineer
> Australian Antarctic Division
> Hobart, Australia
> e-mail: mdsumner using gmail.com <mailto:mdsumner using gmail.com>

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list