[R-sig-Geo] rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)

Michael Sumner md@umner @end|ng |rom gm@||@com
Tue Apr 28 12:17:36 CEST 2020


Oh nice one, Roger I'd lost sight of all that churn there.

Cheers, Mike


On Tue, Apr 28, 2020 at 8:11 PM Roger Bivand <Roger.Bivand using nhh.no> wrote:

> On Tue, 28 Apr 2020, Thorsten Behrens wrote:
>
> > Michael,
> >
> > Thanks for the hint, it seems to work! Real-world tests will follow in
> > the next few days...
> >
> > So it definitely seems to be a problem of rgdal. It would be great if it
> > could still be solved.
>
> Not rgdal, but your use of it. Try looking at a sequence of
>
> file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
> obj <- GDAL.open(file)
> (dims <- dim(obj))
> band <- 1
> data_vector <- getRasterData(obj, band)
> GDAL.close(obj)
> str(data_vector)
>
> This does not create any more complicated objects, just a matrix. In some
> cases, the rows in the matrix are ordered S -> N, so it may appear the
> wrong way up.
>
> rgdal::getRasterData() is lightweight, and has many arguments which may be
> helpful. rgdal::readGDAL() is heavyweight, creating a SpatialGridDataFrame
> object. This involves much copying of data, but the output object can be
> used for example in mapping or analysis without further conversion. My
> guess is that rgdal::getRasterData() gives you your matrix directly. Look
> at the R code to see how as.is= etc. work (files may include scale and
> offset values - recently a user was confused that scale and offset were
> "magically" applied to convert Uint16 values to degrees Kelvin on
> reading). For example, if as.is == TRUE or scale == 1 and offset == 0, no
> copying of the input matrix occurs because it is not converted. If you
> could check this route, others following this thread could also benefit;
> if I'm wrong, that would also be good to know.
>
> Roger
>
>
> >
> > Best,
> >
> > Thorsten
> >
> >
> >
> > Am 27.04.2020 um 15:58 schrieb Michael Sumner:
> >> Try stars it worked for me on a test
> >>
> >> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> >> <thorsten.m.behrens using gmail.com <mailto:thorsten.m.behrens using gmail.com>>
> >> wrote:
> >>
> >>     Roger,
> >>
> >>     thanks a lot for your reply!
> >>
> >>     I have 256GB RAM installed (mentioned it somewhere). And there,
> >>     all is
> >>     fine when I run:
> >>
> >>     rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> 72000,
> >>
> >>     values(rDemTest) = 1
> >>
> >>     When limiting the memory to about 8GB with
> >>     ulimit::memory_limit(8000),
> >>     the max size which can be allocated seems to be around 10000 x
> >>     10000px.
> >>     In this case all tests run fine. Unfortunately it seems to be
> >>     related to
> >>     the size of the grid (48000 x 72000) and therefore the problem
> >>     can't be
> >>     reproduced on machines with 8GB RAM. For some processing steps I
> need
> >>     grids of that size in the memory, which is why I have 256 GB
> >>     installed.
> >>
> >>     Normally, I use the raster package and not rgdal::readGDAL(). This
> >>     was
> >>     just a desperate attempt to find the source of the problem.
> >>
> >>     This is what I use in my code:
> >>
> >>     rDem = raster(sFileNameTiff)
> >>     mDem = raster::as.matrix(rDem)
> >>
> >>     But maybe this is the same...
> >>
> >>     Any further suggestions are much appreciated!
> >>
> >>     Thanks again!
> >>
> >>     Best,
> >>
> >>     Thorsten
> >>
> >>
> >>
> >>
> >>     Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> >>    > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> >>    >
> >>    >> Dear all,
> >>    >>
> >>    >> my problem is that I want to read a big geotiff raster dataset
> >>     into R
> >>    >> and convert it to a matrix, which does not work.
> >>    >> The file is big but there is sufficient memory. I need all the
> >>     data
> >>    >> in the memory at the same time.
> >>    >>
> >>    >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
> >>    >> LTS with the latest version of the packages (see session info
> >>     below)
> >>    >> and 256GB RAM installed.
> >>    >>
> >>    >> When loading the raster dataset using rgdal (via readGDAL or
> >>    >> raster::readAll) I get the follwoing error in R 4.0.0:
> >>    >>
> >>    >> ```
> >>    >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
> >>     reg,
> >>    >> band = object using data@band) :
> >>    >>   long vectors not supported yet: memory.c:3782
> >>    >> ```
> >>    >
> >>    > On a 16GB Fedora platform:
> >>    >
> >>    >> library(raster) # 3.1-5
> >>    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> >>     72000,
> >>    > 0,
> >>    > + 48000))) # all fine
> >>    >> rDemTest
> >>    > class      : RasterLayer
> >>    > dimensions : 48000, 72000, 3.456e+09  (nrow, ncol, ncell)
> >>    > resolution : 1, 1  (x, y)
> >>    > extent     : 0, 72000, 0, 48000  (xmin, xmax, ymin, ymax)
> >>    > crs        : NA
> >>    >
> >>    >> values(rDemTest) = 1
> >>    > Error: cannot allocate vector of size 25.7 Gb
> >>    >
> >>    > So you are deceiving yourself into thinking that all is fine at
> >>     this
> >>    > point. Please try to instantiate an example that can be
> >>     reproduced on
> >>    > a machine with 8GB RAM.
> >>    >
> >>    > Further note that rgdal::readGDAL() is not how you handle very
> >>     large
> >>    > objects in files, and never has been. raster can handle blocks
> >>     of data
> >>    > from bands in file; stars and gdalcubes can use proxy=TRUE for the
> >>    > same purpose. Why did you choose rgdal::readGDAL() when this is not
> >>    > its purpose?
> >>    >
> >>    > You did not say how much RAM is on your platform.
> >>    >
> >>    > Roger
> >>    >
> >>    >>
> >>    >> In R 3.6.3 is is "... memory.c:3717"
> >>    >>
> >>    >> However, I can load the same file with the tiff package and a
> >>     file of
> >>    >> the same size in the native raster package format (*.grd) with the
> >>    >> raster package but again not with the rgdal package.
> >>    >>
> >>    >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
> >>    >> Rouault assumes the problem is related to rgdal and not gdal
> >>    >> (https://github.com/OSGeo/gdal/issues/2442).
> >>    >>
> >>    >> Below you find reproducible code, which generates a raster file,
> >>    >> saves the two formats (.tiff and .grd) and tries to read them with
> >>    >> the different packages.
> >>    >>
> >>    >> Is this a known limitation? Any help is greatly appreciated!
> >>    >>
> >>    >> Thanks a lot in advance!
> >>    >>
> >>    >> Best wishes and stay healthy,
> >>    >> Thorsten
> >>    >>
> >>    >>
> >>    >>
> >>    >> ### Steps to reproduce the problem.
> >>    >>
> >>    >> R code:
> >>    >>
> >>    >> ```
> >>    >> library(rgdal) # 1.4-8
> >>    >> library(raster) # 3.1-5
> >>    >> library(tiff) # 0.1-5
> >>    >>
> >>    >> ## generate and manipulate a big raster dataset
> >>    >> # - generate
> >>    >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> >>    >> 72000, 0, 48000))) # all fine
> >>    >>
> >>    >> # - manipulate
> >>    >> values(rDemTest) = 1 # all fine
> >>    >>
> >>    >> # - convert
> >>    >> mDemTest = raster::as.matrix(rDemTest) # all fine
> >>    >> str(mDemTest)
> >>    >>
> >>    >> ## save a big dataset
> >>    >>
> >>    >> # - as raster/gdal
> >>    >> sFileNameTiff = "BigData.tif"
> >>    >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
> >>    >> NAflag = -9999) # all fine
> >>    >>
> >>    >> # - as raster native
> >>    >> sFileNameNative = "BigData.grd"
> >>    >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
> >>    >> NAflag = -9999) # all fine
> >>    >>
> >>    >>
> >>    >> ## load the big raster datasets with different packages and
> options
> >>    >> # - load the tiff data with the gdal package via the raster
> package
> >>    >> rDem = raster(sFileNameTiff) # all fine
> >>    >> extent(rDem) # all fine
> >>    >> mDem = raster::as.matrix(rDem) # error
> >>    >> rDem = readAll(rDem) # error
> >>    >>
> >>    >> # - load the native raster data with the raster package
> >>    >> rDem = raster(sFileNameNative) # all fine
> >>    >> extent(rDem) # all fine
> >>    >> mDem = raster::as.matrix(rDem) # all fine
> >>    >> str(mDem)
> >>    >>
> >>    >> # - load the tiff data with the tiff package
> >>    >> mDem = readTIFF(sFileNameTiff) # all fine
> >>    >> str(mDem)
> >>    >>
> >>    >> # - load the tiff data with the gdal package
> >>    >> sfDem = readGDAL(sFileNameTiff) # error
> >>    >>
> >>    >> # - load the native raster data with the gdal package
> >>    >> sfDem = readGDAL(sFileNameNative) # error
> >>    >>
> >>    >> ```
> >>    >>
> >>    >>
> >>    >> ### Startup messages when rgdal is attached (requested by Roger
> >>     Bivand)
> >>    >>>  library(rgdal)
> >>    >> rgdal: version: 1.4-8, (SVN revision 845)
> >>    >>  Geospatial Data Abstraction Library extensions to R
> >>     successfully loaded
> >>    >>  Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> >>    >>  Path to GDAL shared files:
> >>    >>  GDAL binary built with GEOS: TRUE
> >>    >>  Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
> >>     [PJ_VERSION:
> >>    >> 631]
> >>    >>  Path to PROJ.4 shared files: (autodetected)
> >>    >>  Linking to sp version: 1.4-1
> >>    >>
> >>    >>
> >>    >> ### Session info
> >>    >>>  sessionInfo()
> >>    >> R version 4.0.0 (2020-04-24)
> >>    >> Platform: x86_64-pc-linux-gnu (64-bit)
> >>    >> Running under: Ubuntu 20.04 LTS
> >>    >>
> >>    >> Matrix products: default
> >>    >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
> >>    >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
> >>    >>
> >>    >> locale:
> >>    >>  [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C LC_TIME=de_DE.UTF-8
> >>    >>  [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
> >>    >> LC_MESSAGES=de_DE.UTF-8
> >>    >>  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C LC_ADDRESS=C
> >>    >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
> >>    >> LC_IDENTIFICATION=C
> >>    >>
> >>    >> attached base packages:
> >>    >> [1] stats     graphics  grDevices utils     datasets methods base
> >>    >>
> >>    >> other attached packages:
> >>    >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8       tiff_0.1-5 raster_3.1-5
> >>    >> sp_1.4-1
> >>    >>
> >>    >> loaded via a namespace (and not attached):
> >>    >>  [1] compiler_4.0.0    tools_4.0.0       Rcpp_1.0.4.6
> >>    >> R.methodsS3_1.8.0 codetools_0.2-16
> >>    >>  [6] grid_4.0.0        iterators_1.0.12 foreach_1.5.0
> >>    >> R.utils_2.9.2     R.oo_1.23.0
> >>    >> [11] lattice_0.20-41
> >>    >>
> >>    >>
> >>    >> ### gdalInfo
> >>    >>>  gdalinfo(sFileNameTiff)
> >>    >>  [1] "Driver: GTiff/GeoTIFF"
> >>    >>  [2] "Files: BigData.tif"
> >>    >>  [3] "Size is 72000, 48000"
> >>    >>  [4] "Origin = (0.000000000000000,48000.000000000000000)"
> >>    >>  [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
> >>    >>  [6] "Image Structure Metadata:"
> >>    >>  [7] "  COMPRESSION=LZW"
> >>    >>  [8] "  INTERLEAVE=BAND"
> >>    >>  [9] "Corner Coordinates:"
> >>    >> [10] "Upper Left  (       0.000,   48000.000) "
> >>    >> [11] "Lower Left  (   0.0000000,   0.0000000) "
> >>    >> [12] "Upper Right (   72000.000,   48000.000) "
> >>    >> [13] "Lower Right (   72000.000,       0.000) "
> >>    >> [14] "Center      (   36000.000,   24000.000) "
> >>    >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
> >>    >> [16] "  Min=1.000 Max=1.000 "
> >>    >> [17] "  Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
> >>    >> [18] "  NoData Value=-9999"
> >>    >> [19] "  Metadata:"
> >>    >> [20] "    STATISTICS_MAXIMUM=1"
> >>    >> [21] "    STATISTICS_MEAN=nan"
> >>    >> [22] "    STATISTICS_MINIMUM=1"
> >>    >> [23] "    STATISTICS_STDDEV=nan"
> >>    >>
> >>    >> _______________________________________________
> >>    >> R-sig-Geo mailing list
> >>    >> R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
> >>    >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>    >>
> >>    >>
> >>    >
> >>
> >>     _______________________________________________
> >>     R-sig-Geo mailing list
> >>     R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
> >>     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-Geo mailing list
> > R-sig-Geo using r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
> https://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>


-- 
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: mdsumner using gmail.com

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list