[R] question
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Wed Nov 19 12:29:19 CET 2003
Philippe Glaziou <glaziou at pasteur-kh.org> writes:
> Fuensanta Saura Igual <igual at mat.uji.es> wrote:
> > Does anyone know how I can read from a .txt file the lines that
> > are between two strings whose location is unknown?
> >
> > My problem is that I have a .txt file with data separated by a
> > sentence, for example:
> >
> > 2.22 3.45
> > 1.56 2.31
> > pattern 1
> > 4.67 7.91
> > 3.34 2.15
> > 5.32 3.88
> > pattern 2
> > ...
> >
> > I do not know the number of lines where these separating
> > sentences are located, because the number of lines in between
> > them can be random. If it was fixed, I think I could use
> > "read.table" using the option "skip", but in this case, I do
> > not know how I could manage to do that automatically.
>
>
> This is a job for sed. The following command will delete any line
> not starting with a digit from "file.txt" and save the results in
> "file2.txt":
>
> cat file.txt | sed -e '/^$\|^[^0-9]/D' > file2.txt
Er, no, that wasn't the requirement. It's a job for awk or perl, e.g.
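#!/usr/bin/perl -n
# Print only the lines between "pattern 1" and "pattern 2".
if (/pattern 1/) {
    $copy = 1;    # start copying after this marker
    next;         # do not print the marker line itself
}
if (/pattern 2/) {
    $copy = 0;    # stop copying at this marker
}
print if $copy;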
or
awk '/pattern 1/{copy=1;next};/pattern 2/{copy=0};copy==1' < file.txt > file2.txt
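
As a quick check, here is a minimal sketch that runs the awk approach on the sample data from the question. The file names file.txt and file2.txt are simply the ones used above, and the trailing "..." line of the sample is left out. Note that /pattern 1/ is an unanchored regular expression, so it would also match a line such as "pattern 10"; tighten it if the real separators need that.

```shell
# Recreate the sample data from the question (illustrative file name).
cat > file.txt <<'EOF'
2.22 3.45
1.56 2.31
pattern 1
4.67 7.91
3.34 2.15
5.32 3.88
pattern 2
EOF

# Set the flag after "pattern 1", clear it at "pattern 2",
# and print every line seen while the flag is set.
awk '/pattern 1/{copy=1;next} /pattern 2/{copy=0} copy' file.txt > file2.txt

# Show the extracted block; it should hold only the three data
# lines that sit between the two separators.
cat file2.txt
```

The resulting file2.txt contains only numeric rows, so it should be readable with read.table("file2.txt") without any skip argument.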
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907