[R] question (and a suggestion)

Gabor Grothendieck ggrothendieck at myway.com
Wed Nov 19 16:45:05 CET 2003

This question repeatedly comes up, e.g. see


from less than two weeks ago.  

One thing that occurred to me is that the answer could be simplified
if skip= could take a 2-vector argument such that:

read.table("x.dat", skip=grep("start|end",readLines("x.dat")), header=TRUE)

reads in the data.  We already have nrows=, but it counts logical
lines whereas skip= counts physical lines, which makes the two hard
to combine, particularly when blank lines are thrown away.
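Until something like that exists, much the same effect can be had by
slicing the lines before parsing them.  The following is only a sketch:
the file contents and the "start"/"end" marker text are made up for
illustration, and it assumes an R version in which read.table accepts
a text= argument.

```r
# Hypothetical file: a header and data rows sit between marker lines
# containing "start" and "end" (contents invented for this example).
path <- tempfile(fileext = ".dat")
writeLines(c("some preamble", "start", "a b", "1 2", "", "3 4", "end", "trailer"),
           path)

all_lines <- readLines(path)
marks <- grep("start|end", all_lines)   # physical line numbers of both markers

# Keep only the lines strictly between the two markers, then parse them.
# Blank lines inside the block are dropped by read.table itself.
body <- all_lines[(marks[1] + 1):(marks[2] - 1)]
dat <- read.table(text = body, header = TRUE)
```

Because the slice is taken on physical line numbers and the blank line
is discarded only afterwards, the mismatch between skip= and nrows=
does not arise.  The sketch assumes each marker matches exactly one
line and that at least one line lies between them.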

An even more powerful facility would be to allow the elements of
the 2-vector to be either as above or regular expressions.  In
the latter case, one could simply write:

  read.table("x.dat", skip=c("start","end"), header=TRUE)

which has the added benefit of not reading the data twice.  Of
course, there are numerous other ways of embedding data in a text
file that this does not handle but, judging by the postings to the
list, this seems to be the most common case that read.table does
not already handle easily.
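The regular expression form can be approximated today with a small
wrapper.  This is a rough sketch only: read_between() is a hypothetical
helper, not part of R, and it again assumes read.table accepts text=.

```r
# read_between() is a hypothetical helper, not a base R function.
# It reads the file once, finds the first line matching `start` and the
# first later line matching `end`, and parses what lies between them.
read_between <- function(file, start, end, ...) {
  all_lines <- readLines(file)
  first <- grep(start, all_lines)[1]
  if (is.na(first)) stop("start pattern not found")
  ends <- grep(end, all_lines)
  last <- ends[ends > first][1]
  if (is.na(last)) stop("end pattern not found after start")
  body <- all_lines[first + seq_len(last - first - 1)]
  read.table(text = body, ...)   # extra arguments go to read.table
}

# Usage on an invented file:
demo_path <- tempfile(fileext = ".dat")
writeLines(c("junk", "start", "x y", "10 20", "30 40", "end", "more junk"),
           demo_path)
demo <- read_between(demo_path, "start", "end", header = TRUE)
```

Using seq_len() rather than a colon range avoids a backwards sequence
when the two markers are on adjacent lines.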

Date: Wed, 19 Nov 2003 10:32:39 +0100 
From: Fuensanta Saura Igual <igual at mat.uji.es>
To: <R-help at stat.math.ethz.ch> 
Subject: [R] question 


Does anyone know how I can read from a .txt file the lines that are between
two strings whose location is unknown?

My problem is that I have a .txt file with data separated by a sentence,
for example:

2.22 3.45
1.56 2.31
pattern 1
4.67 7.91
3.34 2.15
5.32 3.88
pattern 2

I do not know the line numbers at which these separating sentences occur,
because the number of lines between them can vary. If it were fixed, I
think I could use "read.table" with the "skip" option, but in this case I do
not know how to do that automatically.
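For data laid out like the example above, one possible approach is to
label each line by the number of separators seen so far and parse each
group on its own.  This is a sketch under assumptions: the lines are
typed in directly rather than read with readLines(), the regular
expression "^pattern" stands in for whatever the real separating
sentences look like, and read.table is assumed to accept text=.

```r
# The example data from the question, as readLines() would return it.
lines <- c("2.22 3.45", "1.56 2.31", "pattern 1",
           "4.67 7.91", "3.34 2.15", "5.32 3.88", "pattern 2")

is_sep <- grepl("^pattern", lines)   # TRUE on the separating sentences
block_id <- cumsum(is_sep)           # separators seen so far, per line

# Drop the separator lines and group the rest by block.
blocks <- split(lines[!is_sep], block_id[!is_sep])
tables <- lapply(blocks, function(b) read.table(text = b))

# tables[["0"]] holds the rows before "pattern 1";
# tables[["1"]] holds the rows between "pattern 1" and "pattern 2".
```

No line numbers need to be known in advance, so the blocks can be of
any length.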

