[R-sig-Geo] rgdal: problem reading a bigger raster dataset (R 4.0.0/3.6.3, Ubuntu 20.04)
Thorsten Behrens
thor@ten@m@behren@ @end|ng |rom gm@||@com
Tue Apr 28 13:39:15 CEST 2020
Roger and Mike,
I really appreciate your help on this!
I had a look at getRasterData(). It results in the same error. Hence, I
made further tests where I compared grids with the following numbers of
cols and rows:
nPx = floor(sqrt(2^31 -1)) # 46340
and
nPx = ceiling(sqrt(2^31 -1)) # 46341
The result is clear. Raster data with 46340 * 46340 px can be loaded
with getRasterData() and with raster(), raster::as.matrix(), whereas
datasets with 46341 * 46341 px cannot and result in the error:
Error in getRasterData(gdCeil, band = 1) :
long vectors not supported yet: memory.c:3782
read_stars() works for both. You find the corresponding code below.
Are there any other option I can try?
Thorsten
Reproducible code:
## generate raster datasets
# 46340 * 46340 grid dataset
sFileNameFloor = "Floor.tif"
nPx = floor(sqrt(2^31 -1)) # 46340
nPx
rFloor = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0, nPx)))
values(rFloor) = 1
writeRaster(rFloor, sFileNameFloor, "GTiff", overwrite = TRUE, NAflag =
-9999)
# 46341 * 46341 grid dataset
sFileNameCeil = "Ceil.tif"
nPx = ceiling(sqrt(2^31 -1))
nPx
rCeil = raster(nrow = nPx, ncol = nPx, ext = extent(c(0, nPx, 0, nPx)))
values(rCeil) = 1
writeRaster(rCeil, sFileNameCeil, "GTiff", overwrite = TRUE, NAflag =
-9999)
## load raster datasets with different methods
# load Ceil
gdCeil = GDAL.open(sFileNameCeil)
dim(gdCeil)
vnCeil = getRasterData(gdCeil, band = 1) # error
GDAL.close(gdCeil)
str(vnCeil)
stCeil = read_stars(sFileNameCeil) # all fine
str(stCeil[[1]])
rCeil = raster(sFileNameCeil)
mCeil = raster::as.matrix(rCeil) # error
str(mCeil)
# load Floor
gdFloor = GDAL.open(sFileNameFloor)
dim(gdFloor)
vnData = getRasterData(gdFloor, band = 1) # all fine
GDAL.close(gdFloor)
str(vnData)
stFloor= read_stars(sFileNameFloor) # all fine
str(stFloor[[1]])
rFloor = raster(sFileNameFloor)
mFloor = raster::as.matrix(rFloor) # all fine
str(mFloor)
Am 28.04.2020 um 12:10 schrieb Roger Bivand:
> On Tue, 28 Apr 2020, Thorsten Behrens wrote:
>
>> Michael,
>>
>> Thanks for the hint, it seems to work! Real-world tests will follow in
>> the next few days...
>>
>> So it definitely seems to be a problem of rgdal. It would be great if it
>> could still be solved.
>
> Not rgdal, but your use of it. Try looking at a sequence of
>
> file <- system.file("pictures/SP27GTIF.TIF", package="rgdal")
> obj <- GDAL.open(file)
> (dims <- dim(obj))
> band <- 1
> data_vector <- getRasterData(obj, band)
> GDAL.close(obj)
> str(data_vector)
>
> This does not create any more complicated objects, just a matrix. In
> some cases, the rows in the matrix are ordered S -> N, so it may
> appear the wrong way up.
>
> rgdal::getRasterData() is lightweight, and has many arguments which
> may be helpful. rgdal::readGDAL() is heavyweight, creating a
> SpatialGridDataFrame object. This involves much copying of data, but
> the output object can be used for example in mapping or analysis
> without further conversion. My guess is that rgdal::getRasterData()
> gives you your matrix directly. Look at the R code to see how as.is=
> etc. work (files may include scale and offset values - recently a user
> was confused that scale and offset were "magically" applied to convert
> Uint16 values to degrees Kelvin on reading). For example, if as.is ==
> TRUE or scale == 1 and offset == 0, no copying of the input matrix
> occurs because it is not converted. If you could check this route,
> others following this thread could also benefit; if I'm wrong, that
> would also be good to know.
>
> Roger
>
>
>>
>> Best,
>>
>> Thorsten
>>
>>
>>
>> Am 27.04.2020 um 15:58 schrieb Michael Sumner:
>>> Try stars it worked for me on a test
>>>
>>> On Mon., 27 Apr. 2020, 23:54 Thorsten Behrens,
>>> <thorsten.m.behrens using gmail.com <mailto:thorsten.m.behrens using gmail.com>>
>>> wrote:
>>>
>>> Roger,
>>>
>>> thanks a lot for your reply!
>>>
>>> I have 256GB RAM installed (mentioned it somewhere). And there,
>>> all is
>>> fine when I run:
>>>
>>> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>>> 72000,
>>>
>>> values(rDemTest) = 1
>>>
>>> When limiting the memory to about 8GB with
>>> ulimit::memory_limit(8000),
>>> the max size which can be allocated seems to be around 10000 x
>>> 10000px.
>>> In this case all tests run fine. Unfortunately it seems to be
>>> related to
>>> the size of the grid (48000 x 72000) and therefore the problem
>>> can't be
>>> reproduced on machines with 8GB RAM. For some processing steps I
>>> need
>>> grids of that size in the memory, which is why I have 256 GB
>>> installed.
>>>
>>> Normally, I use the raster package and not rgdal::readGDAL(). This
>>> was
>>> just a desperate attempt to find the source of the problem.
>>>
>>> This is what I use in my code:
>>>
>>> rDem = raster(sFileNameTiff)
>>> mDem = raster::as.matrix(rDem)
>>>
>>> But maybe this is the same...
>>>
>>> Any further suggestions are much appreciated!
>>>
>>> Thanks again!
>>>
>>> Best,
>>>
>>> Thorsten
>>>
>>>
>>>
>>>
>>> Am 27.04.2020 um 14:50 schrieb Roger Bivand:
>>> > On Mon, 27 Apr 2020, Thorsten Behrens wrote:
>>> >
>>> >> Dear all,
>>> >>
>>> >> my problem is that I want to read a big geotiff raster dataset
>>> into R
>>> >> and convert it to a matrix, which does not work.
>>> >> The file is big but there is sufficient memory. I need all the
>>> data
>>> >> in the memory at the same time.
>>> >>
>>> >> The error occurs under R 3.6.3 as well as 4.0.0 using Ubuntu
>>> 20.04
>>> >> LTS with the latest version of the packages (see session info
>>> below)
>>> >> and 256GB RAM installed.
>>> >>
>>> >> When loading the raster dataset using rgdal (via readGDAL or
>>> >> raster::readAll) I get the follwoing error in R 4.0.0:
>>> >>
>>> >> ```
>>> >> Error in rgdal::getRasterData(con, offset = offs, region.dim =
>>> reg,
>>> >> band = object using data@band) :
>>> >> long vectors not supported yet: memory.c:3782
>>> >> ```
>>> >
>>> > On a 16GB Fedora platform:
>>> >
>>> >> library(raster) # 3.1-5
>>> >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>>> 72000,
>>> > 0,
>>> > + 48000))) # all fine
>>> >> rDemTest
>>> > class : RasterLayer
>>> > dimensions : 48000, 72000, 3.456e+09 (nrow, ncol, ncell)
>>> > resolution : 1, 1 (x, y)
>>> > extent : 0, 72000, 0, 48000 (xmin, xmax, ymin, ymax)
>>> > crs : NA
>>> >
>>> >> values(rDemTest) = 1
>>> > Error: cannot allocate vector of size 25.7 Gb
>>> >
>>> > So you are deceiving yourself into thinking that all is fine at
>>> this
>>> > point. Please try to instantiate an example that can be
>>> reproduced on
>>> > a machine with 8GB RAM.
>>> >
>>> > Further note that rgdal::readGDAL() is not how you handle very
>>> large
>>> > objects in files, and never has been. raster can handle blocks
>>> of data
>>> > from bands in file; stars and gdalcubes can use proxy=TRUE for the
>>> > same purpose. Why did you choose rgdal::readGDAL() when this is
>>> not
>>> > its purpose?
>>> >
>>> > You did not say how much RAM is on your platform.
>>> >
>>> > Roger
>>> >
>>> >>
>>> >> In R 3.6.3 is is "... memory.c:3717"
>>> >>
>>> >> However, I can load the same file with the tiff package and a
>>> file of
>>> >> the same size in the native raster package format (*.grd) with
>>> the
>>> >> raster package but again not with the rgdal package.
>>> >>
>>> >> gdalinfo (gdalUtils) does not complain (see below). Hence, Even
>>> >> Rouault assumes the problem is related to rgdal and not gdal
>>> >> (https://github.com/OSGeo/gdal/issues/2442).
>>> >>
>>> >> Below you find reproducible code, which generates a raster file,
>>> >> saves the two formats (.tiff and .grd) and tries to read them
>>> with
>>> >> the different packages.
>>> >>
>>> >> Is this a known limitation? Any help is greatly appreciated!
>>> >>
>>> >> Thanks a lot in advance!
>>> >>
>>> >> Best wishes and stay healthy,
>>> >> Thorsten
>>> >>
>>> >>
>>> >>
>>> >> ### Steps to reproduce the problem.
>>> >>
>>> >> R code:
>>> >>
>>> >> ```
>>> >> library(rgdal) # 1.4-8
>>> >> library(raster) # 3.1-5
>>> >> library(tiff) # 0.1-5
>>> >>
>>> >> ## generate and manipulate a big raster dataset
>>> >> # - generate
>>> >> rDemTest = raster(nrow = 48000, ncol = 72000, ext = extent(c(0,
>>> >> 72000, 0, 48000))) # all fine
>>> >>
>>> >> # - manipulate
>>> >> values(rDemTest) = 1 # all fine
>>> >>
>>> >> # - convert
>>> >> mDemTest = raster::as.matrix(rDemTest) # all fine
>>> >> str(mDemTest)
>>> >>
>>> >> ## save a big dataset
>>> >>
>>> >> # - as raster/gdal
>>> >> sFileNameTiff = "BigData.tif"
>>> >> writeRaster(rDemTest, sFileNameTiff, "GTiff", overwrite = TRUE,
>>> >> NAflag = -9999) # all fine
>>> >>
>>> >> # - as raster native
>>> >> sFileNameNative = "BigData.grd"
>>> >> writeRaster(rDemTest, sFileNameNative, "raster", overwrite =
>>> TRUE,
>>> >> NAflag = -9999) # all fine
>>> >>
>>> >>
>>> >> ## load the big raster datasets with different packages and
>>> options
>>> >> # - load the tiff data with the gdal package via the raster
>>> package
>>> >> rDem = raster(sFileNameTiff) # all fine
>>> >> extent(rDem) # all fine
>>> >> mDem = raster::as.matrix(rDem) # error
>>> >> rDem = readAll(rDem) # error
>>> >>
>>> >> # - load the native raster data with the raster package
>>> >> rDem = raster(sFileNameNative) # all fine
>>> >> extent(rDem) # all fine
>>> >> mDem = raster::as.matrix(rDem) # all fine
>>> >> str(mDem)
>>> >>
>>> >> # - load the tiff data with the tiff package
>>> >> mDem = readTIFF(sFileNameTiff) # all fine
>>> >> str(mDem)
>>> >>
>>> >> # - load the tiff data with the gdal package
>>> >> sfDem = readGDAL(sFileNameTiff) # error
>>> >>
>>> >> # - load the native raster data with the gdal package
>>> >> sfDem = readGDAL(sFileNameNative) # error
>>> >>
>>> >> ```
>>> >>
>>> >>
>>> >> ### Startup messages when rgdal is attached (requested by Roger
>>> Bivand)
>>> >>> library(rgdal)
>>> >> rgdal: version: 1.4-8, (SVN revision 845)
>>> >> Geospatial Data Abstraction Library extensions to R
>>> successfully loaded
>>> >> Loaded GDAL runtime: GDAL 3.0.4, released 2020/01/28
>>> >> Path to GDAL shared files:
>>> >> GDAL binary built with GEOS: TRUE
>>> >> Loaded PROJ.4 runtime: Rel. 6.3.1, February 10th, 2020,
>>> [PJ_VERSION:
>>> >> 631]
>>> >> Path to PROJ.4 shared files: (autodetected)
>>> >> Linking to sp version: 1.4-1
>>> >>
>>> >>
>>> >> ### Session info
>>> >>> sessionInfo()
>>> >> R version 4.0.0 (2020-04-24)
>>> >> Platform: x86_64-pc-linux-gnu (64-bit)
>>> >> Running under: Ubuntu 20.04 LTS
>>> >>
>>> >> Matrix products: default
>>> >> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
>>> >> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
>>> >>
>>> >> locale:
>>> >> [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8
>>> >> [4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
>>> >> LC_MESSAGES=de_DE.UTF-8
>>> >> [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C
>>> >> [10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8
>>> >> LC_IDENTIFICATION=C
>>> >>
>>> >> attached base packages:
>>> >> [1] stats graphics grDevices utils datasets methods base
>>> >>
>>> >> other attached packages:
>>> >> [1] gdalUtils_2.0.3.2 rgdal_1.4-8 tiff_0.1-5 raster_3.1-5
>>> >> sp_1.4-1
>>> >>
>>> >> loaded via a namespace (and not attached):
>>> >> [1] compiler_4.0.0 tools_4.0.0 Rcpp_1.0.4.6
>>> >> R.methodsS3_1.8.0 codetools_0.2-16
>>> >> [6] grid_4.0.0 iterators_1.0.12 foreach_1.5.0
>>> >> R.utils_2.9.2 R.oo_1.23.0
>>> >> [11] lattice_0.20-41
>>> >>
>>> >>
>>> >> ### gdalInfo
>>> >>> gdalinfo(sFileNameTiff)
>>> >> [1] "Driver: GTiff/GeoTIFF"
>>> >> [2] "Files: BigData.tif"
>>> >> [3] "Size is 72000, 48000"
>>> >> [4] "Origin = (0.000000000000000,48000.000000000000000)"
>>> >> [5] "Pixel Size = (1.000000000000000,-1.000000000000000)"
>>> >> [6] "Image Structure Metadata:"
>>> >> [7] " COMPRESSION=LZW"
>>> >> [8] " INTERLEAVE=BAND"
>>> >> [9] "Corner Coordinates:"
>>> >> [10] "Upper Left ( 0.000, 48000.000) "
>>> >> [11] "Lower Left ( 0.0000000, 0.0000000) "
>>> >> [12] "Upper Right ( 72000.000, 48000.000) "
>>> >> [13] "Lower Right ( 72000.000, 0.000) "
>>> >> [14] "Center ( 36000.000, 24000.000) "
>>> >> [15] "Band 1 Block=72000x1 Type=Float32, ColorInterp=Gray"
>>> >> [16] " Min=1.000 Max=1.000 "
>>> >> [17] " Minimum=1.000, Maximum=1.000, Mean=nan, StdDev=nan"
>>> >> [18] " NoData Value=-9999"
>>> >> [19] " Metadata:"
>>> >> [20] " STATISTICS_MAXIMUM=1"
>>> >> [21] " STATISTICS_MEAN=nan"
>>> >> [22] " STATISTICS_MINIMUM=1"
>>> >> [23] " STATISTICS_STDDEV=nan"
>>> >>
>>> >> _______________________________________________
>>> >> R-sig-Geo mailing list
>>> >> R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>> >>
>>> >>
>>> >
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo using r-project.org <mailto:R-sig-Geo using r-project.org>
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo using r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
More information about the R-sig-Geo
mailing list