[Rd] Reducing RAM usage using UKSM

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jul 16 15:51:25 CEST 2014

On 16/07/2014 14:07, Gregory R. Warnes wrote:
> Hi Varadharajan,
> Linux uses copy-on-write for the memory image of forked processes.

But Linux copied it from Unix and I see no mention of Linux in the 
posting being replied to.

> Thus, you may also get significant memory savings by launching a single R process, loading your large data object, and then using fork::fork() to split off the other worker processes.

Or using the more refined interface in package 'parallel' (which is 
portable, unlike package 'fork': see its CRAN check results).
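A minimal sketch of the forking approach via 'parallel' (assuming a
Unix-alike, where mcparallel() is available; the file and column names
here are hypothetical):

library(parallel)               # ships with R since 2.14.0

big <- data.table::fread("big_dataset.csv")   # load the ~60GB once

# mcparallel() forks the current process, so 'big' is shared with each
# worker copy-on-write rather than duplicated; mccollect() gathers the
# results from the children.
jobs <- lapply(1:4, function(i)
    mcparallel(big[group == i, sum(value)]))
results <- mccollect(jobs)

The sharing lasts only as long as neither side writes to the pages, so
read-only workloads benefit most.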

> -Greg
> Sent from my iPad
>> On Jul 16, 2014, at 5:07 AM, Varadharajan Mukundan <srinathsmn at gmail.com> wrote:
>> [Sending it again in plain text mode]
>> Greetings,
>> We have a fairly large dataset (around 60GB) to be loaded and
>> crunched in real time. The operations performed on this data are
>> simple read-only aggregates after filtering the data.table instance
>> based on parameters that will be passed in real time. We need more
>> than one such R process running to serve different testing
>> environments (each testing environment has a fairly identical
>> dataset, but with a *small amount of changes*). As we all know,
>> data.table loads the entire dataset into memory for processing, and
>> hence we face a constraint on the number of such processes we can
>> run on the machine. On a 128GB RAM machine, we are looking for ways
>> to reduce the memory footprint so that we can spawn more instances
>> and use the resources efficiently. One of the approaches we tried
>> was memory de-duplication using UKSM
>> (http://kerneldedup.org/en/projects/uksm/introduction), given that
>> we had a few idle CPU cores. The outcome of the experiment was quite
>> impressive, considering that the setup effort was small and the
>> entire approach treats the application layer as a black box.
>> Quick snapshot of the results:
>> 1 instance  (without UKSM):  ~60 GB RAM used
>> 1 instance  (with UKSM):     ~53 GB RAM used
>> 2 instances (without UKSM): ~125 GB RAM used
>> 2 instances (with UKSM):     ~81 GB RAM used
>> We can see that around 44 GB of RAM was saved once UKSM merged
>> similar pages, at the cost of one CPU core on a 48-core machine. We
>> did not notice any performance degradation, because the data is
>> refreshed by a batch job only once a day (every morning); UKSM kicks
>> in at that time and performs the page merging, and for the rest of
>> the day it is just read-only analysis. The queries we fire at the
>> dataset scan at most 2-3GB of it, so the memory spike from each
>> query's working set was low as well.
>> We're interested in knowing whether this is a plausible solution to
>> the problem. Are there any other points/solutions we should be
>> considering?
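For concreteness, the kind of filter-then-aggregate query described in
the post might look like the following sketch (the table, key, and
column names are hypothetical):

library(data.table)

dt <- fread("daily_dump.csv")    # ~60GB, loaded once by the batch job
setkey(dt, env, ts)              # keyed filtering avoids full scans

# Real-time query: filter on the supplied parameters, then compute a
# read-only aggregate; only the matching 2-3GB subset is touched.
query <- function(e, from, to)
    dt[.(e)][ts >= from & ts <= to,
             .(n = .N, total = sum(value)), by = metric]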

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
