[R-sig-Geo] rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)
Roger Bivand
Roger@B|v@nd @end|ng |rom nhh@no
Tue Apr 28 12:10:48 CEST 2020
On Tue, 28 Apr 2020, Thorsten Behrens wrote:
> Michael,
>
> Thanks for the hint, it seems to work! Real-world tests will follow in
> the next few days...
>
> So it definitely seems to be a problem of rgdal. It would be great if it
> could still be solved.
Not rgdal, but your use of it. Try looking at a sequence of
file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
obj <- GDAL.open(file)
(dims <- dim(obj))
band <- 1
data_vector <- getRasterData(obj, band)
GDAL.close(obj)
str(data_vector)
This does not create any more complicated objects, just a matrix. In some
cases, the rows in the matrix are ordered S -> N, so it may appear the
wrong way up.
rgdal::getRasterData() is lightweight, and has many arguments which may be
helpful. rgdal::readGDAL() is heavyweight, creating a SpatialGridDataFrame
object. This involves much copying of data, but the output object can be
used for example in mapping or analysis without further conversion. My
guess is that rgdal::getRasterData() gives you your matrix directly. Look
at the R code to see how as.is= etc. work (files may include scale and
offset values - recently a user was confused that scale and offset were
"magically" applied to convert Uint16 values to degrees Kelvin on
reading). For example, if as.is == TRUE or scale == 1 and offset == 0, no
copying of the input matrix occurs because it is not converted. If you
could check this route, others following this thread could also benefit;
if I'm wrong, that would also be good to know.
Roger
>
> Best,
>
> Thorsten
>
>
>
> Am 27.04.2020 um 15:58 schrieb Michael Sumner:
>> Try stars it worked for me on a test
>>
>> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
>> <thorsten.m.behrens using gmail.com <mailto:thorsten.m.behrens using gmail.com>>
>> wrote:
>>
>> Roger,
>>
>> thanks a lot for your reply!
>>
>> I have 256GB RAM installed (mentioned it somewhere). And there,
>> all is
>> fine when I run:
>>
>> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0, 72000,
>>
>> values(rDemTest) = 1
>>
>> When limiting the memory to about 8GB with
>> ulimit::memory_limit(8000),
>> the max size which can be allocated seems to be around 10000 x
>> 10000px.
>> In this case all tests run fine. Unfortunately it seems to be
>> related to
>> the size of the grid (48000 x 72000) and therefore the problem
>> can't be
>> reproduced on machines with 8GB RAM. For some processing steps I need
>> grids of that size in the memory, which is why I have 256 GB
>> installed.
>>
>> Normally, I use the raster package and not rgdal::readGDAL(). This
>> was
>> just a desperate attempt to find the source of the problem.
>>
>> This is what I use in my code:
>>
>> rDem = raster(sFileNameTiff)
>> mDem = raster::as.matrix(rDem)
>>
>> But maybe this is the same...
>>
>> Any further suggestions are much appreciated!
>>
>> Thanks again!
>>
>> Best,
>>
>> Thorsten
>>
>>
>>
>>
>> Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>> > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>> >
>> >> Dear all,
>> >>
>> >> my problem is that I want to read a big geotiff raster dataset
>> into R
>> >> and convert it to a matrix, which does not work.
>> >> The file is big but there is sufficient memory. I need all the
>> data
>> >> in the memory at the same time.
>> >>
>> >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu 20.04
>> >> LTS with the latest version of the packages (see session info
>> below)
>> >> and 256GB RAM installed.
>> >>
>> >> When loading the raster dataset using rgdal (via readGDAL or
>> >> raster::readAll) I get the follwoing error in R 4.0.0:
>> >>
>> >> ```
>> >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
>> reg,
>> >> band = object using data@band) :
>> >> long vectors not supported yet: memory.c:3782
>> >> ```
>> >
>> > On a 16GB Fedora platform:
>> >
>> >> library(raster) # 3.1-5
>> >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>> 72000,
>> > 0,
>> > + 48000))) # all fine
>> >> rDemTest
>> > class : RasterLayer
>> > dimensions : 48000, 72000, 3.456e+09 (nrow, ncol, ncell)
>> > resolution : 1, 1 (x, y)
>> > extent : 0, 72000, 0, 48000 (xmin, xmax, ymin, ymax)
>> > crs : NA
>> >
>> >> values(rDemTest) = 1
>> > Error: cannot allocate vector of size 25.7 Gb
>> >
>> > So you are deceiving yourself into thinking that all is fine at
>> this
>> > point. Please try to instantiate an example that can be
>> reproduced on
>> > a machine with 8GB RAM.
>> >
>> > Further note that rgdal::readGDAL() is not how you handle very
>> large
>> > objects in files, and never has been. raster can handle blocks
>> of data
>> > from bands in file; stars and gdalcubes can use proxy=TRUE for the
>> > same purpose. Why did you choose rgdal::readGDAL() when this is not
>> > its purpose?
>> >
>> > You did not say how much RAM is on your platform.
>> >
>> > Roger
>> >
>> >>
>> >> In R 3.6.3 is is "... memory.c:3717"
>> >>
>> >> However, I can load the same file with the tiff package and a
>> file of
>> >> the same size in the native raster package format (*.grd) with the
>> >> raster package but again not with the rgdal package.
>> >>
>> >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
>> >> Rouault assumes the problem is related to rgdal and not gdal
>> >> (https://github.com/OSGeo/gdal/issues/2442).
>> >>
>> >> Below you find reproducible code, which generates a raster file,
>> >> saves the two formats (.tiff and .grd) and tries to read them with
>> >> the different packages.
>> >>
>> >> Is this a known limitation? Any help is greatly appreciated!
>> >>
>> >> Thanks a lot in advance!
>> >>
>> >> Best wishes and stay healthy,
>> >> Thorsten
>> >>
>> >>
>> >>
>> >> ### Steps to reproduce the problem.
>> >>
>> >> R code:
>> >>
>> >> ```
>> >> library(rgdal) # 1.4-8
>> >> library(raster) # 3.1-5
>> >> library(tiff) # 0.1-5
>> >>
>> >> ## generate and manipulate a big raster dataset
>> >> # - generate
>> >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>> >> 72000, 0, 48000))) # all fine
>> >>
>> >> # - manipulate
>> >> values(rDemTest) = 1 # all fine
>> >>
>> >> # - convert
>> >> mDemTest = raster::as.matrix(rDemTest) # all fine
>> >> str(mDemTest)
>> >>
>> >> ## save a big dataset
>> >>
>> >> # - as raster/gdal
>> >> sFileNameTiff = "BigData.tif"
>> >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
>> >> NAflag = -9999) # all fine
>> >>
>> >> # - as raster native
>> >> sFileNameNative = "BigData.grd"
>> >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite = TRUE,
>> >> NAflag = -9999) # all fine
>> >>
>> >>
>> >> ## load the big raster datasets with different packages and options
>> >> # - load the tiff data with the gdal package via the raster package
>> >> rDem = raster(sFileNameTiff) # all fine
>> >> extent(rDem) # all fine
>> >> mDem = raster::as.matrix(rDem) # error
>> >> rDem = readAll(rDem) # error
>> >>
>> >> # - load the native raster data with the raster package
>> >> rDem = raster(sFileNameNative) # all fine
>> >> extent(rDem) # all fine
>> >> mDem = raster::as.matrix(rDem) # all fine
>> >> str(mDem)
>> >>
>> >> # - load the tiff data with the tiff package
>> >> mDem = readTIFF(sFileNameTiff) # all fine
>> >> str(mDem)
>> >>
>> >> # - load the tiff data with the gdal package
>> >> sfDem = readGDAL(sFileNameTiff) # error
>> >>
>> >> # - load the native raster data with the gdal package
>> >> sfDem = readGDAL(sFileNameNative) # error
>> >>
>> >> ```
>> >>
>> >>
>> >> ### Startup messages when rgdal is attached (requested by Roger
>> Bivand)
>> >>> library(rgdal)
>> >> rgdal: version: 1.4-8, (SVN revision 845)
>> >> Geospatial Data Abstraction Library extensions to R
>> successfully loaded
>> >> Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>> >> Path to GDAL shared files:
>> >> GDAL binary built with GEOS: TRUE
>> >> Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>> [PJ_VERSION:
>> >> 631]
>> >> Path to PROJ.4 shared files: (autodetected)
>> >> Linking to sp version: 1.4-1
>> >>
>> >>
>> >> ### Session info
>> >>> sessionInfo()
>> >> R version 4.0.0 (2020-04-24)
>> >> Platform: x86_64-pc-linux-gnu (64-bit)
>> >> Running under: Ubuntu 20.04 LTS
>> >>
>> >> Matrix products: default
>> >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>> >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>> >>
>> >> locale:
>> >> [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8
>> >> [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>> >> LC_MESSAGES=de_DE.UTF-8
>> >> [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C
>> >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>> >> LC_IDENTIFICATION=C
>> >>
>> >> attached base packages:
>> >> [1] stats graphics grDevices utils datasets methods base
>> >>
>> >> other attached packages:
>> >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8 tiff_0.1-5 raster_3.1-5
>> >> sp_1.4-1
>> >>
>> >> loaded via a namespace (and not attached):
>> >> [1] compiler_4.0.0 tools_4.0.0 Rcpp_1.0.4.6
>> >> R.methodsS3_1.8.0 codetools_0.2-16
>> >> [6] grid_4.0.0 iterators_1.0.12 foreach_1.5.0
>> >> R.utils_2.9.2 R.oo_1.23.0
>> >> [11] lattice_0.20-41
>> >>
>> >>
>> >> ### gdalInfo
>> >>> gdalinfo(sFileNameTiff)
>> >> [1] "Driver: GTiff/GeoTIFF"
>> >> [2] "Files: BigData.tif"
>> >> [3] "Size is 72000, 48000"
>> >> [4] "Origin = (0.000000000000000,48000.000000000000000)"
>> >> [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>> >> [6] "Image Structure Metadata:"
>> >> [7] " COMPRESSION=LZW"
>> >> [8] " INTERLEAVE=BAND"
>> >> [9] "Corner Coordinates:"
>> >> [10] "Upper Left ( 0.000, 48000.000) "
>> >> [11] "Lower Left ( 0.0000000, 0.0000000) "
>> >> [12] "Upper Right ( 72000.000, 48000.000) "
>> >> [13] "Lower Right ( 72000.000, 0.000) "
>> >> [14] "Center ( 36000.000, 24000.000) "
>> >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
>> >> [16] " Min=1.000 Max=1.000 "
>> >> [17] " Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
>> >> [18] " NoData Value=-9999"
>> >> [19] " Metadata:"
>> >> [20] " STATISTICS_MAXIMUM=1"
>> >> [21] " STATISTICS_MEAN=nan"
>> >> [22] " STATISTICS_MINIMUM=1"
>> >> [23] " STATISTICS_STDDEV=nan"
>> >>
>> >> _______________________________________________
>> >> R-sig-Geo mailing list
>> >> R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> >>
>> >>
>> >
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
More information about the R-sig-Geo
mailing list