[Rd] More on scan: extra field at end of line

Prof Brian Ripley ripley@stats.ox.ac.uk
Tue, 26 Dec 2000 17:34:30 +0000 (GMT)


On Tue, 26 Dec 2000, Yves Gauvreau wrote:

> Hi,
> 
> I see that Prof Ripley proposes pre-processing the file using sed, and that
> he used "pipe" to do so. I looked for it on my system (see below) and the
> function doesn't seem to be available. Since I have sed from cygwin32, I
> wonder whether there is a way to use it in a similar fashion to that proposed here?

1) Yes, in 1.2.0.  I would encourage people to at least try 1.2.0,
not least as 1.2.1 is due out pretty soon and we would like to get the 
maximal number of bugs zapped.  (The PATH problem in rwinst.exe has been
solved in the version now up on CRAN.)

2) On Windows, you will need to do it in rterm: pipe does not work in
Rgui.  That's an OS deficiency that I hope to be able to work around in
time for 1.2.1, but I knew Peter Kleiweg was on HP-UX/Linux.

I suppose in part I was pointing out how neatly some of the pieces we now
have fit together.
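
Where pipe is not available, much the same clean-up can be done within R.
The following is only a sketch, assuming a version of R recent enough to
have readLines() and the text= argument of scan(); read_trailing_comma is a
made-up helper name, and the temporary file merely mimics the data2 example
quoted below.

```r
# Hypothetical helper: drop one trailing comma (and any white space after it)
# from each line, then hand the cleaned lines to scan().
read_trailing_comma <- function(file) {
  lines <- readLines(file)
  cleaned <- sub(",[[:space:]]*$", "", lines)
  scan(text = cleaned, sep = ",", quiet = TRUE)
}

# Self-contained demonstration on a temporary file laid out like data2.
tmp <- tempfile()
writeLines(c("450, 390, 467, 654,  30, 542, 334, 432, 421,",
             "357, 497, 493, 550, 549, 467, 575, 578, 342,",
             "446, 547, 534, 495, 979, 479"), tmp)
x <- read_trailing_comma(tmp)
print(length(x))
unlink(tmp)
```

A genuinely empty field in the middle of a line still comes back as NA, so
only the comma-EOL combination is affected.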

> 
> Thanks
> 
> YG
> 
> platform Windows
> arch     x86
> os       Win32
> system   x86, Win32
> status
> major    1
> minor    1.1
> year     2000
> month    August
> day      15
> language R
> 
> > -----Original Message-----
> > From: owner-r-devel@stat.math.ethz.ch
> > [mailto:owner-r-devel@stat.math.ethz.ch] On Behalf Of Prof Brian Ripley
> > Sent: Tuesday, December 26, 2000 9:54 AM
> > To: Peter Kleiweg
> > Cc: r-devel@stat.math.ethz.ch
> > Subject: Re: [Rd] More on scan: extra field at end of line
> >
> >
> > On Tue, 26 Dec 2000, Peter Kleiweg wrote:
> >
> > >
> > > Suppose, I have a file "data1" containing:
> > >
> > >     450   390   467   654    30   542   334   432   421
> > >     357   497   493   550   549   467   575   578   342
> > >     446   547   534   495   979   479
> > >
> > > I can read this file with:
> > >
> > >     scan("data1")
> > >     Read 24 items
> > >      [1] 450 390 467 654  30 542 334 432 421 357 497 493 550 549 467 575 578 342 446
> > >     [20] 547 534 495 979 479
> > >
> > > But now, suppose I have a file "data2" containing:
> > >
> > >     450, 390, 467, 654,  30, 542, 334, 432, 421,
> > >     357, 497, 493, 550, 549, 467, 575, 578, 342,
> > >     446, 547, 534, 495, 979, 479
> > >
> > > When I try to read this with sep="," I get:
> > >
> > >     scan("data2", sep=",")
> > >     Read 26 items
> > >      [1] 450 390 467 654  30 542 334 432 421  NA 357 497 493 550 549 467 575 578 342
> > >     [20]  NA 446 547 534 495 979 479
> > >
> > > I get two extra fields, both NA. Not what I'd want. And I can't
> > > drop the NA's, because there could be other NA's, not resulting
> > > from this comma-EOL combination.
> >
> > You can easily remove the trailing commas, though, as in
> >
> > scan(pipe("sed -e s/,$// data2"), sep=",")
> > Read 24 items
> >  [1] 450 390 467 654  30 542 334 432 421 357 497 493 550 549 467 575 578 342 446
> > [20] 547 534 495 979 479
> >
> >
> > > I suggest, the proper action for scan would be to treat the
> > > combination sep plus newline as a single separator.
> >
> > However, that's not compatible with S, with earlier versions of R, or with
> > the documentation:
> >
> >      sep: by default, scan expects to read white-space delimited input
> >           fields.  Alternatively, `sep' can be used to specify a
> >           character which delimits fields.  A field is always delimited
> >           by a newline unless it is quoted.
> >
> > I suggest the proper action is to act as documented!
> >
> > --
> > Brian D. Ripley,                  ripley@stats.ox.ac.uk
> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford,             Tel:  +44 1865 272861 (self)
> > 1 South Parks Road,                     +44 1865 272860 (secr)
> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >
> > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > Send "info", "help", or "[un]subscribe"
> > (in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
> > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> >
> 
> 
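
The documented behaviour quoted above can be checked on a small input. This
is a sketch assuming a version of R in which scan() accepts text=; the two
inputs are invented for illustration.

```r
# Trailing separator: the newline ends an empty third field on line 1.
with_trailing <- scan(text = c("1,2,", "3,4"), sep = ",", quiet = TRUE)
# No trailing separator: every field has content.
without_trailing <- scan(text = c("1,2", "3,4"), sep = ",", quiet = TRUE)

print(with_trailing)
print(without_trailing)
```

The first call returns five items, the third being NA; the second returns
four.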

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
