[R] data import problem

Sean Davis sdavis2 at mail.nih.gov
Wed Mar 8 12:55:37 CET 2006




On 3/8/06 6:43 AM, "Philipp Pagel" <philipp.pagel.lists at t-online.de> wrote:

> On Wed, Mar 08, 2006 at 12:32:28PM +0100, Arne.Muller at sanofi-aventis.com
> wrote:
>> I'm trying to read a text data file that contains several records
>> separated by a blank line. Each record starts with a row that contains
>> its ID and the number of rows in the record (two columns), then the
>> data table itself, e.g.
>> 
>> 123 5
>> 89.1791    1.1024
>> 90.5735    1.1024
>> 92.5666    1.1024
>> 95.0725    1.1024
>> 101.2070    1.1024
>> 
>> 321 3
>> 60.1601    1.1024
>> 64.8023    1.1024
>> 70.0593    2.1502
> 
> That sounds like a job for awk. I think it will be much easier to
> transform the data into a flat table using awk, python or perl and then
> just read the table with R.

If you want to use R, you can use a simple combination of
readLines(con, n=1), strsplit on tabs, and simple if statements (to find
blank lines and start new records) to parse these types of files.  I
thought this would be slow, but I do it in one of my own packages and find
that it is pretty fast.
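
A minimal sketch of that approach, not the code from my package: the
function name read_records is made up for illustration, and it assumes
each header row gives the ID and the row count, that fields are separated
by spaces or tabs, and that the connection is already open for reading.

```r
# Hypothetical helper: parse records of the form "ID n" followed by
# n data rows, with records separated by blank lines.
read_records <- function(con) {
  records <- list()
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) break        # end of input
    if (!nzchar(trimws(line))) next     # blank line: a new record follows
    # Header row: ID and number of data rows (split on spaces or tabs)
    header <- strsplit(trimws(line), "[ \t]+")[[1]]
    id <- header[1]
    n <- as.integer(header[2])
    # Read the n data rows and turn them into a numeric matrix
    rows <- readLines(con, n = n)
    fields <- strsplit(trimws(rows), "[ \t]+")
    records[[id]] <- matrix(as.numeric(unlist(fields)),
                            nrow = n, byrow = TRUE)
  }
  records
}

# Usage with the example data; for a real file, open it first with
# con <- file("data.txt", "r") (the file name here is an assumption).
con <- textConnection(c(
  "123 5",
  "89.1791    1.1024",
  "90.5735    1.1024",
  "92.5666    1.1024",
  "95.0725    1.1024",
  "101.2070    1.1024",
  "",
  "321 3",
  "60.1601    1.1024",
  "64.8023    1.1024",
  "70.0593    2.1502"
))
recs <- read_records(con)
close(con)
```

The result is a list of matrices named by record ID, so recs[["321"]]
gives the second table. The connection has to be opened before the loop;
otherwise readLines opens and closes it on every call and keeps returning
the first line.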

Sean



