[R] R 3.5.0, vector memory exhausted error on readBin
luke-tierney using uiowa.edu
Tue Jun 12 16:14:07 CEST 2018
The environment variable R_MAX_VSIZE is read at start-up, so it needs to
be set outside R. If you are starting R from a shell you can use
env R_MAX_VSIZE=700Gb R
If you use a GUI you might need to set the variable in another
way.
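To check from a fresh R session whether the variable was actually in place at start-up, something like the following should do. It is only a sketch, and the name max_vsize is made up for illustration.

```r
# Value of R_MAX_VSIZE as seen by this session; NA means it is not set.
max_vsize <- Sys.getenv("R_MAX_VSIZE", unset = NA)
if (is.na(max_vsize)) {
  message("R_MAX_VSIZE is not set; the default limit applies")
} else {
  message("R_MAX_VSIZE is set to ", max_vsize)
}
```

Calling Sys.setenv() inside a running session changes what this reports but not the limit itself, since the variable is only read at start-up.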
Here is a reproducible version of your example:
hertz <- 6000
binfile <- tempfile()
writeBin(1L, binfile, size = 2)
v <- readBin(binfile, integer(), size = 2, n = 8*hertz*60*60000)
unlink(binfile)
With the limit raised to 700Gb or more this will work in R 3.5.0,
but you lose the protection of the lower default setting. You need a
value that high because your 'n' value is asking readBin to allocate a
buffer of 643.7 Gb. Mac OS lets you allocate this much address space, as
long as you don't try to use all of it (this is memory
overcommitment). Running this example on a Linux system with 128Gb of
memory produces
Error: cannot allocate vector of size 643.7 Gb
I suspect this will fail on pretty much any Windows system as well.
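The 643.7 Gb figure can be reproduced with a little arithmetic. This sketch assumes the result buffer is an ordinary R integer vector at 4 bytes per element, whatever 'size' says about the on-disk width; the names bytes_per_int and request_gb are made up for illustration.

```r
hertz <- 6000
n <- 8 * hertz * 60 * 60000      # elements requested: 1.728e11
bytes_per_int <- 4               # assumed in-memory size of one R integer
request_gb <- n * bytes_per_int / 1024^3
round(request_gb, 1)             # 643.7
```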
My recommendation would be to work out a tighter upper bound on the
number of elements to read, maybe using file.size, and use that for 'n'
in your readBin call. That will allow your code to be more portable
and avoid the risks of removing the allocation protection.
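A minimal sketch of that approach, assuming the file holds nothing but 2-byte little-endian integers. The temporary file stands in for the real data file, and the names bytes_per_value and n_max are made up for illustration.

```r
# Stand-in for the real data file: 1000 little-endian 16-bit integers.
binfile <- tempfile()
writeBin(1:1000, binfile, size = 2, endian = "little")

bytes_per_value <- 2
# The file cannot hold more values than this, so it is a safe bound for 'n'.
n_max <- file.size(binfile) %/% bytes_per_value

con <- file(binfile, "rb")
datavals <- readBin(con, integer(), size = bytes_per_value, n = n_max,
                    endian = "little")
close(con)
unlink(binfile)

length(datavals)
```

With 'n' tied to the file size, the buffer readBin requests is never larger than the data can fill, so the default R_MAX_VSIZE limit can stay in place.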
Best,
luke
On Tue, 12 Jun 2018, Valerie Cavett wrote:
> Thanks so much for taking a look at this.
>
> Before setting a new value, I opened a fresh session of R and checked to see whether there was any value set for R_MAX_VSIZE. There was not, so we'll assume the default as you described.
>
> Next, I tried to set a value with
> Sys.setenv("R_MAX_VSIZE" = 8e9)
>
>
> When the system environment is checked again, there is now a value of
> R_MAX_SIZE 8e+09
>
>
> Unfortunately, when I try to read in a small binary file, I still encounter the same error.
>
>
> I restored R 3.3 and checked the system environment to confirm that there was no R_MAX_SIZE configured in the startup file, then tested readBin as follows:
>
>
> hertz <- 6000
> bin.read = file("20180611_A4", "rb")
> datavals = readBin(bin.read, integer(), size = 2, n = 8*hertz*60*60000, endian = "little")
>
>
> datavals is a large integer with 6046880 elements, 23.1 Mb.
>
>
> If I then set the R_MAX_SIZE to 8e9, this also works just fine since the file is not really that large.
>
>
> However, if I switch back to the newest R version (3.5.0), I encounter the same error:
>
>
> > datavals = readBin(bin.read, integer(), size = 2, n = 8*hertz*60*60000, endian = "little")
> Error: vector memory exhausted (limit reached?)
>
>
> I’m at a loss for why this is an issue (same machine) in R 3.5.0, but not in 3.3.2 or 3.4.4. If you have any further suggestions, I’d greatly appreciate them.
>
>
> From: luke-tierney using uiowa.edu <luke-tierney using uiowa.edu>
> Sent: Tuesday, June 12, 2018 5:26:37 AM
> To: Valerie Cavett
> Cc: r-help using R-project.org
> Subject: Re: [R] R 3.5.0, vector memory exhausted error on readBin
>
>
> This item in NEWS explains the change:
>
> • The environment variable R_MAX_VSIZE can now be used to specify
> the maximal vector heap size. On macOS, unless specified by this
> environment variable, the maximal vector heap size is set to the
> maximum of 16GB and the available physical memory. This is to
> avoid having the R process killed when macOS over-commits memory.
>
> You can set R_MAX_VSIZE to a larger value but you should do some
> experimenting to decide on a safe value for your system. Mac OS is
> quite good at using virtual memory up to a point but then gets very
> bad. For my 4 GB mac numeric(8e9) works but numeric(9e9) causes R to
> be killed, so a setting of around 60GB _might_ be safe.
>
> File size probably doesn't matter in your example since you are
> setting a large value for n - I can't tell how large since you didn't
> provide your value of 'hertz'.
>
> Best,
>
> luke
>
> On Mon, 11 Jun 2018, Valerie Cavett wrote:
>
>> I've been reading in binary data collected via LabView for a project, and after upgrading to R 3.5.0, the code returns an error indicating that the 'vector memory is exhausted'. I'm happy to provide a sample binary file; even ones that are quite small (12 MB) generate this error. (I wasn't sure whether a binary file attached to this email would trigger a spam filter.)
>>
>> bin.read = file(files[i], "rb")
>> datavals = readBin(bin.read, integer(), size = 2, n = 8*hertz*60*60000, endian = "little")
>>
>> Error: vector memory exhausted (limit reached?)
>>
>>
>> sessionInfo()
>> R version 3.5.0 (2018-04-23)
>> Platform: x86_64-apple-darwin15.6.0 (64-bit)
>> Running under: macOS Sierra 10.12.6
>>
>>
>> This does not happen in R 3.4 (R version 3.4.4 (2018-03-15) -- "Someone to Lean On") - the vector is created and populated by the binary file values without issue, even at a 1GB binary file size.
>>
>> Other files that are read in as csv files, even at 1GB, load correctly to 3.5, so I assume that this is a function of a vector being explicitly defined/changed in some way from 3.4 to 3.5.
>>
>> Any help, suggestions or workarounds are greatly appreciated!
>> Val
>>
>>
>>
>
>
--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: luke-tierney using uiowa.edu
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu