[R-sig-Geo] memory Usage setting

Roger Bivand Roger.Bivand at nhh.no
Thu Sep 13 12:22:59 CEST 2007


On Thu, 13 Sep 2007, Didier Leibovici wrote:

>
> Thanks Roger
>
> I feel we've got a low RAM machine which would need a bit of an uplift 
> (recent server though)!
> Unfortunately the Linux machine also has 4Gb of RAM.
> But I still maintain that it would be interesting to have a way within R
> of automatically swapping memory if needed ...

On many OS, virtual memory will cut in at a certain point, but here the 
object is in any case too large to represent on a 32-bit system, and 
indeed R sets a limit on single object size of roughly 2Gb - for full 
details see:

?"Memory-limits"
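
As a rough check of the sizes quoted in this thread, the arithmetic can be sketched as follows. This is only an illustration in Python, not R code; the variable names are made up for the example, and the 4-byte case is included only because it reproduces the 2.9Gb figure, since R itself stores numeric data as 8-byte doubles.

```python
# Dimensions of the array mentioned in the thread.
dims = (298249, 12, 10, 22)

# Total number of elements is the product of the dimensions.
n_elements = 1
for d in dims:
    n_elements *= d

GIB = 2 ** 30  # bytes per gibibyte

# Storage at 8 bytes per element (double precision, as R uses).
size_double_gib = n_elements * 8 / GIB

# Storage at 4 bytes per element (hypothetical single precision).
size_4byte_gib = n_elements * 4 / GIB

print(n_elements)                 # number of elements
print(round(size_double_gib, 2))  # size in GiB as doubles
print(round(size_4byte_gib, 2))   # size in GiB at 4 bytes each
```

At 8 bytes per element the array needs more than the 4 GiB that a 32-bit address space can cover at all, before any copies made during computation are counted.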

However, the real question remains why you cannot save time by subsetting 
first, which is equivalent to swapping to virtual memory, but under your 
own control. There were some comments on this in my earlier reply, and 
other replies make similar suggestions. You still have not said why all 
the data must be in memory at the same time, and I strongly doubt that 
there are no viable or even superior alternatives.
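
The subset-and-accumulate idea can be sketched like this, again in Python purely for illustration. The function read_chunks is a hypothetical stand-in for whatever reads one subset at a time from a file or database; here it just generates numbers so that the example runs on its own. Only one chunk is held in memory at any time, and only the running totals are kept between chunks.

```python
from typing import Iterable, Iterator, List


def read_chunks(n_rows: int, chunk_rows: int) -> Iterator[List[float]]:
    """Hypothetical reader: yields one subset of the data at a time."""
    for start in range(0, n_rows, chunk_rows):
        stop = min(start + chunk_rows, n_rows)
        # Stand-in for reading rows start..stop-1 from external storage.
        yield [float(i) for i in range(start, stop)]


def running_mean(chunks: Iterable[List[float]]) -> float:
    """Accumulate sum and count over chunks, then return the overall mean."""
    total = 0.0
    count = 0
    for chunk in chunks:
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("no data")
    return total / count


# The mean of 0..999 computed 64 rows at a time.
mean_value = running_mean(read_chunks(n_rows=1000, chunk_rows=64))
print(mean_value)
```

The same pattern extends to any statistic that can be updated from intermediate results, such as sums, cross-products or counts, which is why the full data set rarely has to be in memory at once.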

Roger

>
> Didier
>
> Roger Bivand wrote:
>>  On Tue, 11 Sep 2007, elw at stderr.org wrote:
>> 
>> > 
>> > >  These days in GIS one may have to manipulate big datasets or arrays.
>> > > 
>> > >  Here I am on Windows with 4Gb of RAM;
>> > >  my aim was to have an array of dim 298249 12 10 22 but that's 2.9Gb
>> > 
>>
>>  Assuming double precision (no single precision in R), 5.8Gb.
>> 
>> > 
>> >  It used to be (maybe still is?) the case that a single process could
>> >  only 'claim' a chunk of max size 2GB on Windows.
>> > 
>> > 
>> >  Also remember to compute overhead for R objects... 58 bytes per object,
>> >  I think it is.
>> > 
>> > 
>> > >  It is also strange that once a dd needed 300.4Mb and then 600.7Mb (?)
>> > >  as also I made some room in removing ZZ?
>> > 
>> > 
>> >  Approximately double size - many things the interpreter does involve
>> >  making an additional copy of the data and then working with *that*.
>> >  This might be happening here, though I didn't read your code carefully
>> >  enough to be able to be certain.
>> > 
>> > 
>> > >  which I don't really know if it took into account as the limit is
>> > >  greater than the physical RAM of 4GB. ...?
>> > 
>> > :) 
>> > 
>> > >  would it be easier using Linux ?
>> > 
>> >  possibly a little bit - on a linux machine you can at least run a PAE
>> >  kernel (giving you a lot more address space to work with) and have the
>> >  ability to turn on a bit more virtual memory.
>> > 
>> >  usually with data of the size you're trying to work with, i try to find
>> >  a way to preprocess the data a bit more before i apply R's tools to it.
>> >  sometimes we stick it into a database (postgres) and select out the bits
>> >  we want our inferences to be sourced from.  ;)
>> > 
>> >  it might be simplest to just hunt up a machine with 8 or 16GB of memory
>> >  in it, and run those bits of the analysis that really need memory on
>> >  that machine...
>>
>>  Yes, if there is no other way, a 64bit machine with lots of RAM would not
>>  be so constrained, but maybe this is a matter of first deciding why doing
>>  statistics on that much data is worth the effort? It may be, but just
>>  trying to read large amounts of data into memory is perhaps not justified
>>  in itself.
>>
>>  Can you tile or subset the data, accumulating intermediate results? This
>>  is the approach the biglm package takes, and the R/GDAL interface also
>>  supports subsetting from an external file.
>>
>>  Depending on the input format of the data, you should be able to do all
>>  you need provided that you do not try to keep all the data in memory.
>>  Using a database may be a good idea, or if the data are multiple remote
>>  sensing images, subsetting and accumulating results.
>>
>>  Roger
>> 
>> > 
>> >  --e
>> > 
>> > _______________________________________________
>> >  R-sig-Geo mailing list
>> >  R-sig-Geo at stat.math.ethz.ch
>> >  https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> > 
>> 
>
>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no



