[R] parsing a data file

Barry Rowlingson B.Rowlingson at lancaster.ac.uk
Tue Apr 27 12:35:59 CEST 2004


Tamas Papp wrote:

> I need to parse a data file (output of a measuring device) of the
> following format:
> 
> BEGIN RECORD [first record data] RECORD [second
> record data] RECORD
> [third record data]
> END

  Is it just the one 'BEGIN/END' pair per file? Or are there several? 
What's the format of the [first record data] entries? Numbers, strings? 
Are there literally square brackets in there?

> I need to extract the record data I marked with []'s, e.g. a vector such
> as c("[first record data]", "[second]", ...) would be nice as a
> result.
> 
> What functions should I use for this?

  I'd consider writing a Perl script to convert this into an XML 
file. You could then probably read it with R's XML package, and it 
would be in a format readable by any XML-reading tool, or at least in a 
more easily convertible form. But that might be a bit heavyweight, and 
the Ted Harding approach of sed, tr, and awk is always appealing, 
assuming you have a Unix box or a Unix box-of-tricks on Windows (Cygwin).
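
  As a rough sketch of the awk route, and only under assumptions the 
original question leaves open: the words BEGIN, RECORD and END appear 
as whitespace-separated tokens, the record data never contains those 
words, and there are no literal square brackets. The file names 
device.txt and records.txt and the sample numbers are made up for 
illustration.

```shell
# Hypothetical sample input; the real device output may look different.
cat > device.txt <<'EOF'
BEGIN RECORD 1.0 2.0 RECORD 3.0
4.0 RECORD
5.0 6.0
END
EOF

# Walk the file token by token. State is kept across input lines, so a
# record may span several lines. Each finished record is printed on its
# own output line; empty records are skipped.
awk '
  {
    for (i = 1; i <= NF; i++) {
      if ($i == "BEGIN") { inside = 1; continue }
      if (!inside) continue              # ignore anything outside BEGIN/END
      if ($i == "RECORD" || $i == "END") {
        if (rec != "") print rec         # flush the record collected so far
        rec = ""
        if ($i == "END") inside = 0
        continue
      }
      if (rec == "") rec = $i            # first token of a new record
      else rec = rec " " $i              # later tokens, joined by one space
    }
  }
' device.txt > records.txt

cat records.txt
```

  This collapses any run of whitespace inside a record to a single 
space, which may or may not matter for the real data. If the file turns 
out to hold several BEGIN/END pairs, the records from all of them end up 
in the same output. Something like readLines("records.txt") in R should 
then give one string per record.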


Baz



