[R-sig-Geo] [R] Error: cons memory exhausted (limit reached?): Memory Management?

Christopher Lloyd chr|@||oyd2 @end|ng |rom y@hoo@co@uk
Wed Jan 20 16:09:59 CET 2021


 Thanks for the tips, Roger. I shall look into the possibilities that you discuss. Best wishes, Chris
    On Wednesday, 20 January 2021, 10:57:26 GMT, Roger Bivand <roger.bivand using nhh.no> wrote:  
 
 On Wed, 20 Jan 2021, Christopher Lloyd via R-sig-Geo wrote:

> Hi all,

> I'm loading a large (~30GB) geojson file into R using readOGR on an HPC. 
> I am also loading a small shapefile, and then trying to undertake some 
> processing on the large geojson using gBuffer from the rgeos package.

Have you tried to use the sf package instead?

> I believe that the HPC is running Red Hat Enterprise Linux 7.4, and it 
> certainly has around 750 GB free for user jobs. I have allocated the 
> full amount of RAM to the job. I previously used the following modules to 
> undertake this task and it ran successfully, although only after 
> tweaking the settings that I detail below - otherwise I had the same 
> error:
>
> module load proj/5.0.0
> module load gdal/2.3.1
> module load geos/3.6.2
> module load gcc/6.4.0
> module load R/3.5.2
> module load python # python 3 by default
> module load numpy/1.14.0 # requires module load python

RHEL 7 is pretty old. The last PROJ 4.9.* and possibly GDAL 2.2.* may work 
OK, but anything after that first hits the changes from PROJ 5, then the 
switch of PROJ, GDAL and GEOS to C++11, then the complete remake in PROJ >= 
6 with GDAL >= 3. PROJ >= 6 should not be used with GDAL < 3. The R 
versions are irrelevant, but sf/rgdal/sp versions matter.
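
As a rough illustration of the pairing rule above, here is a minimal shell sketch that compares only the major versions of PROJ and GDAL. The function name check_proj_gdal is hypothetical, and the check covers nothing beyond the PROJ 6 / GDAL 3 boundary (it says nothing about the PROJ 5 or C++11 changes). The version strings are the module versions quoted in this thread.

```shell
# Hypothetical helper: classify a PROJ/GDAL pairing by major version only.
# It encodes one rule: PROJ >= 6 goes with GDAL >= 3, older with older.
check_proj_gdal() {
    proj_major=${1%%.*}   # "6.1.1" -> "6"
    gdal_major=${2%%.*}   # "2.3.1" -> "2"
    if [ "$proj_major" -ge 6 ] && [ "$gdal_major" -lt 3 ]; then
        echo "mismatch: PROJ $1 needs GDAL >= 3, found GDAL $2"
    elif [ "$proj_major" -lt 6 ] && [ "$gdal_major" -ge 3 ]; then
        echo "mismatch: GDAL $2 needs PROJ >= 6, found PROJ $1"
    else
        echo "consistent: PROJ $1 with GDAL $2"
    fi
}

check_proj_gdal 5.0.0 2.3.1   # the earlier module set
check_proj_gdal 6.1.1 2.3.1   # the updated module set, GDAL unchanged
```

"consistent" here only means that the pair sits on the same side of the PROJ 6 / GDAL 3 divide, not that the build is otherwise sound.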

> Settings at Linux command line that previously allowed a successful run:
>
> R_MAX_VSIZE=720G
> R_GC_MEM_GROW=0
> --min-nsize=50000k --min-vsize=12M --max-ppsize=500000 (when executing 
> the R script from command line)
>
> However, the modules have now been updated on the HPC, and so I am now 
> using:
>
> module load proj/6.1.1
> module load R/3.6.2
> (other modules remain the same)

> I get the following error whilst processing (loading the file into R is 
> ok), with gcinfo() turned on:

Was GDAL rebuilt with the new PROJ (and GEOS)? Was rgdal re-installed with 
the new PROJ and GDAL?
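
One way to look into these two questions after loading the modules is sketched below. It is only a sketch: it assumes gdal-config, geos-config, proj, gdalinfo, ldd and Rscript are on the PATH (each step is skipped or reported if not), and report_version is a hypothetical helper name.

```shell
# Print the first line of a tool's self-reported version,
# or note that the tool is not on PATH.
report_version() {
    tool=$1
    shift
    if command -v "$tool" >/dev/null 2>&1; then
        # Some tools (e.g. proj) report on stderr, so merge both streams.
        printf '%s: %s\n' "$tool" "$("$tool" "$@" 2>&1 </dev/null | head -n 1)"
    else
        printf '%s: not on PATH\n' "$tool"
    fi
}

report_version gdal-config --version
report_version geos-config --version
report_version proj   # with no arguments, proj prints its release line

# Which PROJ and GEOS shared libraries does the GDAL binary resolve to?
if command -v gdalinfo >/dev/null 2>&1 && command -v ldd >/dev/null 2>&1; then
    ldd "$(command -v gdalinfo)" | grep -i -E 'proj|geos' \
        || echo "no PROJ/GEOS libraries listed by ldd"
fi

# Which GDAL and PROJ does the installed rgdal report?
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e 'cat(rgdal::getGDALVersionInfo(), rgdal::getPROJ4VersionInfo(), sep = "\n")' \
        || echo "rgdal could not be loaded"
fi
```

If the versions reported by rgdal differ from those of the loaded modules, rgdal was probably installed against the earlier libraries and needs re-installing.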

> Garbage collection 144 = 86+22+36 (level 0) ...
> 288541.3 Mbytes of cons cells used (66%)
> 55320.1 Mbytes of vectors used (98%)
> Garbage collection 145 = 86+23+36 (level 1) ...
> 66679.3 Mbytes of cons cells used (15%)
> 56447.7 Mbytes of vectors used (100%)
> Garbage collection 146 = 86+23+37 (level 2) ...
> 39852.3 Mbytes of cons cells used (11%)
> 49032.4 Mbytes of vectors used (72%)
> Garbage collection 147 = 87+23+37 (level 0) ...
> 124935.0 Mbytes of cons cells used (36%)
> 64961.0 Mbytes of vectors used (95%)
> Garbage collection 148 = 87+24+37 (level 1) ...
> 985162418403226.2 Mbytes of cons cells used (-2147483648%)
> 35274.8 Mbytes of vectors used (52%)
> Error: cons memory exhausted (limit reached?)
> In addition: Warning message:
> Garbage collection 149 = 88+24+37 (level 0) ...
> 985162418403226.2 Mbytes of cons cells used (-2147483648%)
> 35274.8 Mbytes of vectors used (52%)
> Lost warning messages
> Execution halted
> Garbage collection 150 = 89+24+37 (level 0) ...
> 985162418403226.2 Mbytes of cons cells used (-2147483648%)
> 35274.8 Mbytes of vectors used (52%)

> Error: cons memory exhausted (limit reached?)

> The job halts with Memory Utilized: 411.29 GB

> I cannot understand why the job worked previously (just) but now does 
> not when seemingly the only change is an updated proj and R version 
> (3.5.2 to 3.6.2).

R version irrelevant, PROJ should be properly aligned with GDAL (either 
old PROJ with old GDAL or PROJ >= 6 with GDAL >= 3). Don't rely on the 
server admins to understand this; if necessary, build from source (this is 
running as a single thread anyway, so the big machine is only useful for 
having a lot of memory, not an HPC issue). My guess is that the modules 
being loaded are not matched with each other.

Probably in any case this task should read the geojson into a PostGIS db 
and possibly use PostGIS or sf's database access functionality to handle 
the topological operations. I don't think that this is about memory 
management, since there is plenty of RAM; look instead at sf, PostGIS and 
maybe rpostgis.
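
A minimal sketch of the PostGIS route, using GDAL's ogr2ogr to load the file and ST_Buffer to do the buffering inside the database. Every name here is a hypothetical placeholder (database gisdb, file big.geojson, table big_layer, column geom), as is the buffer width of 100, which is in the units of the layer's coordinate reference system. The script only prints the commands unless DRY_RUN=0 is set.

```shell
# Hypothetical placeholders: adjust to the real database, file and table.
DB=gisdb
SRC=big.geojson
TABLE=big_layer

# Print each command by default; set DRY_RUN=0 to execute it instead.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# 1. Load the GeoJSON into a PostGIS table, naming the geometry column "geom".
run ogr2ogr -f PostgreSQL "PG:dbname=$DB" "$SRC" -nln "$TABLE" -lco GEOMETRY_NAME=geom

# 2. Buffer inside the database rather than in R's memory.
run psql -d "$DB" -c "CREATE TABLE ${TABLE}_buf AS SELECT ST_Buffer(geom, 100) AS geom FROM $TABLE;"
```

The buffered table can then be queried from R in pieces, so that the full 30GB layer never has to sit in a single R session.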

Hope this helps,

Roger


> Might anyone have any suggestions as to why this is the case? And/or how 
> to alter the memory management so that memory is not exhausted so 
> easily?

> Many thanks, Chris

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
  