[R] Tools for data preparation?
(Ted Harding)
Ted.Harding at nessie.mcc.ac.uk
Fri Nov 19 09:56:47 CET 2004
On 19-Nov-04 David Mitchell wrote:
> Hello list,
>
> I'm regularly in the position where I have to do a lot of data
> manipulation, in order to get the data I have into a format R
> is happy with. This manipulation would generally be in one of
> two forms:
> - getting data from e.g. text log files into a tabular format
> - extracting sensible sample data from a very large data set
> (i.e. too large for R to handle)
>
> In general, I use Perl or Python to do the task; I'm curious
> as to what others use when they hit the same problem.

I generally use 'awk', with help from 'sed' when needed.
This is along the same lines as your choice, though lighter-weight
and less powerful (but I've never had a case that needed more).
Since the sort of task you describe is basically done line by line
(and what's meant by a "line" can be pretty flexible in 'awk', since
the record separator can be changed), it can be done straightforwardly;
but more than purely line-by-line processing is also possible.
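
To make the line-by-line idea concrete, here is a minimal sketch in
Python (one of the tools you mention) rather than 'awk'. The log format,
the field names and the function names are all assumptions made up for
illustration; a real log would need its own pattern.

```python
import csv
import io
import re

# Hypothetical log format, assumed for illustration only:
#   2004-11-19 09:56:47 INFO elapsed=12.5
LINE_RE = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<level>\S+) "
    r"elapsed=(?P<elapsed>[0-9]+(?:\.[0-9]+)?)$"
)


def log_to_rows(lines):
    """Yield one (date, time, level, elapsed) tuple per matching line."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # lines that do not fit the assumed format are skipped
            yield (m["date"], m["time"], m["level"], float(m["elapsed"]))


def rows_to_tsv(rows):
    """Render rows as tab-separated text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["date", "time", "level", "elapsed"])
    writer.writerows(rows)
    return out.getvalue()


sample = [
    "2004-11-19 09:56:47 INFO elapsed=12.5",
    "-- rotated --",
    "2004-11-19 09:57:02 WARN elapsed=3.0",
]
table = rows_to_tsv(log_to_rows(sample))
```

Written to a file, tab-separated output of this kind should be readable
in R with read.delim().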
E.g. it is easy to extract a line from the input, or apply a certain
transformation to fields in a line, if and only if it has already been
preceded by a line satisfying a certain condition, and so on.
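
The "preceded by a line satisfying a condition" case can be sketched
with a single flag carried from one line to the next. Again this is
Python rather than 'awk', and it assumes "preceded" means "immediately
preceded"; the marker text "START" and the function name are
hypothetical.

```python
def fields_after_marker(lines, is_marker, transform):
    """Yield transform(fields) for each line directly after a marker line."""
    armed = False  # True only while the previous line was a marker
    for line in lines:
        if is_marker(line):
            armed = True
            continue
        if armed:
            # Split on whitespace, as 'awk' does by default
            yield transform(line.split())
        armed = False


lines = ["header", "START run1", "3 4", "5 6", "START run2", "7 8"]
result = list(
    fields_after_marker(
        lines,
        is_marker=lambda s: s.startswith("START"),  # assumed marker
        transform=lambda f: [int(x) * 2 for x in f],  # example transformation
    )
)
```

Here only "3 4" and "7 8" are processed, because "5 6" does not directly
follow a marker line. Dropping the final "armed = False" would instead
process every line after the first marker.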
Best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 19-Nov-04 Time: 08:56:47
------------------------------ XFMail ------------------------------