[R] split data frame temporarily and work with only part of it?

Daniel Malter daniel at umd.edu
Sun Jul 24 21:55:10 CEST 2011


I would recommend against subsetting the data out into separate data
frames, because binding the recomputed rows back together with the
untouched rows (and restoring the original order) is a potential source
of error. I would rather select the rows to update with an index (as
suggested in the previous post) and do the computations on that subset
in place, without ever separating the dataset. Alternatively, you can
split() the data, do your computations on the relevant piece, and then
unsplit() it, which puts the rows back in their original positions.
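
A minimal sketch of both options on a toy data frame. The column names
(id, x, y), the condition (x < 0), and the "recomputation" (doubling y)
are made up for illustration and are not from the original question:

```r
# Toy stand-in for the large data frame; all column names are hypothetical.
df <- data.frame(id = 1:6,
                 x  = c(5, -2, 7, -1, 3, -4),
                 y  = c(10, 20, 30, 40, 50, 60))

# Condition computed from existing columns (assumed rule: x is negative).
needs_update <- df$x < 0

# Option 1: logical index, update in place. Rows never leave the data frame,
# so there is nothing to rbind or re-sort afterwards.
by_index <- df
by_index$y[needs_update] <- by_index$y[needs_update] * 2

# Option 2: split(), modify one piece, then unsplit() with the same factor.
pieces <- split(df, needs_update)          # list with elements "FALSE" and "TRUE"
pieces[["TRUE"]]$y <- pieces[["TRUE"]]$y * 2
by_split <- unsplit(pieces, needs_update)  # rows go back to their original positions

# Both routes should give the same updated column.
stopifnot(isTRUE(all.equal(by_index$y, by_split$y)))
```

With a million rows, the indexed version is probably the simpler choice,
since it avoids building the intermediate list of pieces.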

HTH,
Daniel


ivo welch wrote:
> 
> dear R wizards:  I have a large data frame, a million rows, 40
> columns.  In this data frame, there are some (about 100,000) rows
> which I want to recompute (update), while I want to leave others just
> as is.  this is based on a condition that I need to compute, based on
> what is in a few of the columns.  what is the right R way to do this?
> 
> I could subset out the rows that I want to recompute into a new data
> frame (A), subset out the rows I don't want to recompute (B), operate
> on the first data frame (A), then rbind the two (A and B) back
> together and resort into original order.  is this the recommended way?
> 
> sincerely,
> 
> /iaw
> ----
> Ivo Welch (ivo.welch at gmail.com)
> 
