[R] adding rows...

Adeel - SafeGreenCapital adeel.amin at gmail.com
Thu May 23 15:05:37 CEST 2013


Hi Rainer:

Thanks for the reply.  Posting the large dataset is a task.  There are 8M
rows between the two of them and the first discrepancy in the data doesn't
happen until at least the 40,000th row on each dataframe.  The examples I
posted are a pretty good abstraction of the root of the issue.  

The problem isn't the data.  The problem is out-of-memory (OOM) errors when
doing operations like merge, rbind, etc.  The solution that Blaser suggested
in his post works great, but the systems quickly run out of memory.  What
does work without OOM issues are for/while loops, but these take an
inordinate time to compute and tie up a machine for hours at a time.
Essentially I break the data apart, add rows and rebind.  It's a brute-force
approach, and run times are in excess of 48 hours for one full
iteration across 25 data frames.  Terrible.
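
To illustrate the break-apart-and-rebind pattern, here is a minimal sketch in
base R.  The chunk sizes and the column names `time` and `value` are
assumptions made up for the example, not the real data.  It contrasts growing
a data frame inside a loop with collecting the pieces in a list and binding
them once.

```r
# Hypothetical chunked data; the column names `time` and `value` are assumptions.
set.seed(1)
chunks <- lapply(1:5, function(i) {
  data.frame(time = (i - 1) * 10 + 1:10, value = rnorm(10))
})

# Slow pattern: rbind inside the loop re-copies the accumulated rows on each pass.
grown <- chunks[[1]]
for (i in 2:length(chunks)) {
  grown <- rbind(grown, chunks[[i]])
}

# Usually faster: keep the pieces in a list and bind them once at the end.
bound <- do.call(rbind, chunks)
```

The single bind avoids the repeated copying, which is often where loop time
goes.  Whether it fits in memory at 8M rows is not something this sketch can
establish.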

I am about to go down the road of using the data.table class, as it's far
more memory efficient, but the documentation is cryptic.  Your idea of
creating a superset has some merit, and it's what I was experimenting with
prior to my original post.
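
A rough sketch of how the superset idea might look with data.table, assuming
the package is installed.  The tables, the key column `time` and the column
`value` are hypothetical stand-ins, since the real column names are not shown
here.  Each table is keyed on `time` and then looked up against the union of
all time stamps, so missing time stamps come back as NA rows.

```r
library(data.table)  # assumes the data.table package is installed

# Hypothetical stand-ins for DF1 and DF2; `time` and `value` are assumed names.
DT1 <- data.table(time = c(1L, 2L, 4L, 5L), value = c(10, 20, 40, 50))
DT2 <- data.table(time = 1:6, value = c(11, 21, 31, 41, 51, 61))

# Superset of all time stamps that occur in either table.
all_times <- sort(union(DT1$time, DT2$time))

# Key each table on `time`, then look up every time stamp in the superset.
# Time stamps missing from a table come back as rows with NA in `value`.
setkey(DT1, time)
setkey(DT2, time)
DT1_full <- DT1[.(all_times)]
DT2_full <- DT2[.(all_times)]
```

After this, both tables have one row per time stamp in the same order, so they
can be compared row by row.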

-----Original Message-----
From: Rainer Schuermann [mailto:rainer.schuermann at gmx.net] 
Sent: Thursday, May 23, 2013 12:19 AM
To: Adeel Amin
Subject: adding rows...

Can I suggest that you post the output of

dput( DF1 )
dput( DF2 )

rather than "pictures" of your data? Any solution attempt will depend upon
the data types...
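
For a data frame too large to post in full, a small slice usually shows the
column types just as well.  A minimal sketch, with a made-up DF1 whose columns
`time` and `value` are assumptions:

```r
# Hypothetical stand-in for DF1; the column names and types are assumptions.
DF1 <- data.frame(
  time  = as.Date("2013-05-01") + 0:9,
  value = c(10.5, 11, 9.75, 12, 13.25, 8, 7.5, 14, 15, 6.25)
)

# Post only the first few rows; the output still carries the column types.
sample_rows <- head(DF1, 3)
dput(sample_rows)
```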

Just shooting in the dark: Have you tried just row-binding the missing 4k
lines to DF1 and then ordering DF1 as you like? It looks as if the data are
ordered by time / date?
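
A minimal base R sketch of that suggestion.  The data frames, the columns
`time` and `value`, and the NA fill value are all assumptions for
illustration:

```r
# Hypothetical stand-ins for DF1 and DF2; `time` and `value` are assumed names.
DF1 <- data.frame(time = c(1, 2, 4, 5), value = c(10, 20, 40, 50))
DF2 <- data.frame(time = c(1, 2, 3, 4, 5, 6), value = c(11, 21, 31, 41, 51, 61))

# Time stamps present in DF2 but absent from DF1.
missing_times <- setdiff(DF2$time, DF1$time)

# Placeholder rows for the missing time stamps (NA is the assumed fill value).
filler <- data.frame(
  time  = missing_times,
  value = rep(NA_real_, length(missing_times))
)

# One rbind for all missing rows, then restore the time ordering.
DF1_full <- rbind(DF1, filler)
DF1_full <- DF1_full[order(DF1_full$time), ]
rownames(DF1_full) <- NULL
```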

Rgds,
Rainer


