[R-sig-Geo] rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)
Michael Sumner
md@umner @end|ng |rom gm@||@com
Tue Apr 28 12:12:46 CEST 2020
I reckon that's unlikely, it's also unlikely to scale or be fast enough
ever given what rgdal is - raster will happily crop sections of it out but
I guess your workflow is to write the geotiff out in one go? You want rgdal
because you want raster? Have you tried terra? (I forgot to mention that
one, and I haven't tried on that size GeoTIFF with it. It has its own
internal connection to GDAL).
Possibly you can instantiate a GeoTIFF with rgdal of the right size, then
write to it piecewise - but if the convenience of raster is what you're
after that's obviously no good. I think I'm missing some workflow
constraint or motivation you have, but I'm interested (it's an awkward
space).
Cheers, Mike
On Tue, Apr 28, 2020 at 7:14 PM Thorsten Behrens <
thorsten.m.behrens using gmail.com> wrote:
> Michael,
>
> Thanks for the hint, it seems to work! Real-world tests will follow in
> the next few days...
>
> So it definitely seems to be a problem of rgdal. It would be great if it
> could still be solved.
>
> Best,
>
> Thorsten
>
>
>
> Am 27.04.2020 um 15:58 schrieb Michael Sumner:
> > Try stars it worked for me on a test
> >
> > On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
> > <thorsten.m.behrens using gmail.com <mailto:thorsten.m.behrens using gmail.com>>
> > wrote:
> >
> > Roger,
> >
> > thanks a lot for your reply!
> >
> > I have 256GB RAM installed (mentioned it somewhere). And there,
> > all is
> > fine when I run:
> >
> > rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> 72000,
> >
> > values(rDemTest) = 1
> >
> > When limiting the memory to about 8GB with
> > ulimit::memory_limit(8000),
> > the max size which can be allocated seems to be around 10000 x
> > 10000px.
> > In this case all tests run fine. Unfortunately it seems to be
> > related to
> > the size of the grid (48000 x 72000) and therefore the problem
> > can't be
> > reproduced on machines with 8GB RAM. For some processing steps I need
> > grids of that size in the memory, which is why I have 256 GB
> > installed.
> >
> > Normally, I use the raster package and not rgdal::readGDAL(). This
> > was
> > just a desperate attempt to find the source of the problem.
> >
> > This is what I use in my code:
> >
> > rDem = raster(sFileNameTiff)
> > mDem = raster::as.matrix(rDem)
> >
> > But maybe this is the same...
> >
> > Any further suggestions are much appreciated!
> >
> > Thanks again!
> >
> > Best,
> >
> > Thorsten
> >
> >
> >
> >
> > Am 27.04.2020 um 14:50 schrieb Roger Bivand:
> > > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
> > >
> > >> Dear all,
> > >>
> > >> my problem is that I want to read a big geotiff raster dataset
> > into R
> > >> and convert it to a matrix, which does not work.
> > >> The file is big but there is sufficient memory. I need all the
> > data
> > >> in the memory at the same time.
> > >>
> > >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
> > >> LTS with the latest version of the packages (see session info
> > below)
> > >> and 256GB RAM installed.
> > >>
> > >> When loading the raster dataset using rgdal (via readGDAL or
> > >> raster::readAll) I get the follwoing error in R 4.0.0:
> > >>
> > >> ```
> > >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
> > reg,
> > >> band = object using data@band) :
> > >> long vectors not supported yet: memory.c:3782
> > >> ```
> > >
> > > On a 16GB Fedora platform:
> > >
> > >> library(raster) # 3.1-5
> > >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> > 72000,
> > > 0,
> > > + 48000))) # all fine
> > >> rDemTest
> > > class : RasterLayer
> > > dimensions : 48000, 72000, 3.456e+09 (nrow, ncol, ncell)
> > > resolution : 1, 1 (x, y)
> > > extent : 0, 72000, 0, 48000 (xmin, xmax, ymin, ymax)
> > > crs : NA
> > >
> > >> values(rDemTest) = 1
> > > Error: cannot allocate vector of size 25.7 Gb
> > >
> > > So you are deceiving yourself into thinking that all is fine at
> > this
> > > point. Please try to instantiate an example that can be
> > reproduced on
> > > a machine with 8GB RAM.
> > >
> > > Further note that rgdal::readGDAL() is not how you handle very
> > large
> > > objects in files, and never has been. raster can handle blocks
> > of data
> > > from bands in file; stars and gdalcubes can use proxy=TRUE for the
> > > same purpose. Why did you choose rgdal::readGDAL() when this is not
> > > its purpose?
> > >
> > > You did not say how much RAM is on your platform.
> > >
> > > Roger
> > >
> > >>
> > >> In R 3.6.3 is is "... memory.c:3717"
> > >>
> > >> However, I can load the same file with the tiff package and a
> > file of
> > >> the same size in the native raster package format (*.grd) with the
> > >> raster package but again not with the rgdal package.
> > >>
> > >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
> > >> Rouault assumes the problem is related to rgdal and not gdal
> > >> (https://github.com/OSGeo/gdal/issues/2442).
> > >>
> > >> Below you find reproducible code, which generates a raster file,
> > >> saves the two formats (.tiff and .grd) and tries to read them with
> > >> the different packages.
> > >>
> > >> Is this a known limitation? Any help is greatly appreciated!
> > >>
> > >> Thanks a lot in advance!
> > >>
> > >> Best wishes and stay healthy,
> > >> Thorsten
> > >>
> > >>
> > >>
> > >> ### Steps to reproduce the problem.
> > >>
> > >> R code:
> > >>
> > >> ```
> > >> library(rgdal) # 1.4-8
> > >> library(raster) # 3.1-5
> > >> library(tiff) # 0.1-5
> > >>
> > >> ## generate and manipulate a big raster dataset
> > >> # - generate
> > >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
> > >> 72000, 0, 48000))) # all fine
> > >>
> > >> # - manipulate
> > >> values(rDemTest) = 1 # all fine
> > >>
> > >> # - convert
> > >> mDemTest = raster::as.matrix(rDemTest) # all fine
> > >> str(mDemTest)
> > >>
> > >> ## save a big dataset
> > >>
> > >> # - as raster/gdal
> > >> sFileNameTiff = "BigData.tif"
> > >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
> > >> NAflag = -9999) # all fine
> > >>
> > >> # - as raster native
> > >> sFileNameNative = "BigData.grd"
> > >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
> > >> NAflag = -9999) # all fine
> > >>
> > >>
> > >> ## load the big raster datasets with different packages and
> options
> > >> # - load the tiff data with the gdal package via the raster
> package
> > >> rDem = raster(sFileNameTiff) # all fine
> > >> extent(rDem) # all fine
> > >> mDem = raster::as.matrix(rDem) # error
> > >> rDem = readAll(rDem) # error
> > >>
> > >> # - load the native raster data with the raster package
> > >> rDem = raster(sFileNameNative) # all fine
> > >> extent(rDem) # all fine
> > >> mDem = raster::as.matrix(rDem) # all fine
> > >> str(mDem)
> > >>
> > >> # - load the tiff data with the tiff package
> > >> mDem = readTIFF(sFileNameTiff) # all fine
> > >> str(mDem)
> > >>
> > >> # - load the tiff data with the gdal package
> > >> sfDem = readGDAL(sFileNameTiff) # error
> > >>
> > >> # - load the native raster data with the gdal package
> > >> sfDem = readGDAL(sFileNameNative) # error
> > >>
> > >> ```
> > >>
> > >>
> > >> ### Startup messages when rgdal is attached (requested by Roger
> > Bivand)
> > >>> library(rgdal)
> > >> rgdal: version: 1.4-8, (SVN revision 845)
> > >> Geospatial Data Abstraction Library extensions to R
> > successfully loaded
> > >> Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
> > >> Path to GDAL shared files:
> > >> GDAL binary built with GEOS: TRUE
> > >> Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
> > [PJ_VERSION:
> > >> 631]
> > >> Path to PROJ.4 shared files: (autodetected)
> > >> Linking to sp version: 1.4-1
> > >>
> > >>
> > >> ### Session info
> > >>> sessionInfo()
> > >> R version 4.0.0 (2020-04-24)
> > >> Platform: x86_64-pc-linux-gnu (64-bit)
> > >> Running under: Ubuntu 20.04 LTS
> > >>
> > >> Matrix products: default
> > >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
> > >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
> > >>
> > >> locale:
> > >> [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8
> > >> [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
> > >> LC_MESSAGES=de_DE.UTF-8
> > >> [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C
> > >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
> > >> LC_IDENTIFICATION=C
> > >>
> > >> attached base packages:
> > >> [1] stats graphics grDevices utils datasets methods base
> > >>
> > >> other attached packages:
> > >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8 tiff_0.1-5 raster_3.1-5
> > >> sp_1.4-1
> > >>
> > >> loaded via a namespace (and not attached):
> > >> [1] compiler_4.0.0 tools_4.0.0 Rcpp_1.0.4.6
> > >> R.methodsS3_1.8.0 codetools_0.2-16
> > >> [6] grid_4.0.0 iterators_1.0.12 foreach_1.5.0
> > >> R.utils_2.9.2 R.oo_1.23.0
> > >> [11] lattice_0.20-41
> > >>
> > >>
> > >> ### gdalInfo
> > >>> gdalinfo(sFileNameTiff)
> > >> [1] "Driver: GTiff/GeoTIFF"
> > >> [2] "Files: BigData.tif"
> > >> [3] "Size is 72000, 48000"
> > >> [4] "Origin = (0.000000000000000,48000.000000000000000)"
> > >> [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
> > >> [6] "Image Structure Metadata:"
> > >> [7] " COMPRESSION=LZW"
> > >> [8] " INTERLEAVE=BAND"
> > >> [9] "Corner Coordinates:"
> > >> [10] "Upper Left ( 0.000, 48000.000) "
> > >> [11] "Lower Left ( 0.0000000, 0.0000000) "
> > >> [12] "Upper Right ( 72000.000, 48000.000) "
> > >> [13] "Lower Right ( 72000.000, 0.000) "
> > >> [14] "Center ( 36000.000, 24000.000) "
> > >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
> > >> [16] " Min=1.000 Max=1.000 "
> > >> [17] " Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
> > >> [18] " NoData Value=-9999"
> > >> [19] " Metadata:"
> > >> [20] " STATISTICS_MAXIMUM=1"
> > >> [21] " STATISTICS_MEAN=nan"
> > >> [22] " STATISTICS_MINIMUM=1"
> > >> [23] " STATISTICS_STDDEV=nan"
> > >>
> > >> _______________________________________________
> > >> R-sig-Geo mailing list
> > >> R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
> > >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > >>
> > >>
> > >
> >
> > _______________________________________________
> > R-sig-Geo mailing list
> > R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: mdsumner using gmail.com
[[alternative HTML version deleted]]
More information about the R-sig-Geo
mailing list