[R] Question: how to index (subset) a data frame without memory overhead

Piotr Chmielowski piotr.chmielowski at reechaim.com
Fri Aug 28 16:35:43 CEST 2009


1. Suppose one has a big data frame, say m with dim(m) = c(8610, 3521).
If only a subset of m is now needed, say m[1:8600, ], how can it be selected without a large memory overhead? The natural solution, m <- m[1:8600, ], seems to need roughly 2 times object.size(m) on top of the memory already holding m, so the total memory required is over 3 times object.size(m), as seen with memory.size(max=T). This is understandable, since arguments are passed by value. However, is there a natural way around this memory overhead?

2. Similarly, if one has another data frame n with dim(n) = c(10, 3521), doing m <- rbind(m, n) incurs the same memory overhead. Is there any way around that?
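
For anyone who wants to reproduce the two situations above, here is a minimal sketch. It is not from my actual session: the dimensions are scaled down by a factor of about ten, the data are random numbers, and the names n_row, n_col, m_sub and m_big are made up for illustration. It uses gc() rather than memory.size(), since memory.size() is Windows-only.

```r
# Illustrative reproduction only; sizes are assumptions, much smaller
# than the 8610 x 3521 data frame described in the post.
set.seed(1)
n_row <- 861L
n_col <- 352L

# A large numeric data frame m and a small one n with the same columns
m <- as.data.frame(matrix(rnorm(n_row * n_col), nrow = n_row, ncol = n_col))
n <- as.data.frame(matrix(rnorm(10L * n_col), nrow = 10L, ncol = n_col))

# Size of the original object
print(object.size(m), units = "Mb")

# Case 1: drop the last 10 rows. The result is a new object, so the
# old and new data exist side by side until the old one is released.
# (The post assigns back to m; a separate name is used here so that m
# is still available for case 2.)
m_sub <- m[1:(n_row - 10L), ]
print(object.size(m_sub), units = "Mb")

# Case 2: append the rows of n. Again a new object is built.
m_big <- rbind(m, n)
print(object.size(m_big), units = "Mb")

# Overall memory statistics of the session after both operations
print(gc())
```

The object.size() figures only show the size of each result; the peak usage during the operation is what the question is about, and the gc() table is one place to look for it.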

I tried the ref package, but it did not solve either of the problems above.

Any help would be appreciated.

Piotr

