[Rd] More on scan: extra field at end of line

Prof Brian Ripley ripley@stats.ox.ac.uk
Tue, 26 Dec 2000 14:54:29 +0000 (GMT)


On Tue, 26 Dec 2000, Peter Kleiweg wrote:

> 
> Suppose, I have a file "data1" containing:
> 
>     450   390   467   654    30   542   334   432   421
>     357   497   493   550   549   467   575   578   342
>     446   547   534   495   979   479
> 
> I can read this file with:
> 
>     scan("data1")
>     Read 24 items
>      [1] 450 390 467 654  30 542 334 432 421 357 497 493 550 549 467 575 578 342 446
>      [20] 547 534 495 979 479    
> 
> But now, suppose I have a file "data2" containing:
> 
>     450, 390, 467, 654,  30, 542, 334, 432, 421,
>     357, 497, 493, 550, 549, 467, 575, 578, 342,
>     446, 547, 534, 495, 979, 479
> 
> When I try to read this with sep="," I get:
> 
>     scan("data2", sep=",")
>     Read 26 items
>      [1] 450 390 467 654  30 542 334 432 421  NA 357 497 493 550 549 467 575 578 342
>      [20]  NA 446 547 534 495 979 479 
> 
> I get two extra fields, both NA. Not what I'd want. And I can't
> drop the NA's, because there could be other NA's, not resulting
> from this comma-EOL combination.

You can easily remove the trailing commas, though, as in

scan(pipe("sed -e s/,$// data2"), sep=",")
Read 24 items
 [1] 450 390 467 654  30 542 334 432 421 357 497 493 550 549 467 575 578 342 446
[20] 547 534 495 979 479

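Where sed is not available, the same cleanup can probably be done inside R.
The following is only a sketch: the temporary file stands in for "data2", the
variable names are illustrative, and it assumes an R version in which scan()
accepts a `text' argument.

```r
# Recreate the contents of "data2" in a temporary file (illustrative only).
data2 <- tempfile()
writeLines(c("450, 390, 467, 654,  30, 542, 334, 432, 421,",
             "357, 497, 493, 550, 549, 467, 575, 578, 342,",
             "446, 547, 534, 495, 979, 479"), data2)

# Read the raw lines and drop a single trailing comma (and any blanks
# after it) from the end of each line, as the sed command does.
lines   <- readLines(data2)
cleaned <- sub(",[[:space:]]*$", "", lines)

# Scan the cleaned lines; each element of `cleaned' is treated as one line.
x <- scan(text = cleaned, sep = ",", quiet = TRUE)
length(x)
```

Because only the comma at the end of a line is removed, an explicit NA field
elsewhere in the data should still be read as NA.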

> I suggest, the proper action for scan would be to treat the
> combination sep plus newline as a single separator.

However, that's not compatible with S, with earlier versions of R, or with
the documentation:

     sep: by default, scan expects to read white-space delimited input
          fields.  Alternatively, `sep' can be used to specify a
          character which delimits fields.  A field is always delimited
          by a newline unless it is quoted.

I suggest the proper action is to act as documented!
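
A small illustration of the documented behaviour, as a sketch using made-up
input and assuming an R version in which scan() accepts a `text' argument:
the comma at the end of the first line is followed by a newline, so it
delimits one more (empty) field, which is read as NA.

```r
# Two lines of comma-separated input; the first ends with a trailing comma.
input <- c("1,2,", "3,4")

# With sep = ",", the newline also ends a field, so the empty field
# between the trailing comma and the newline is read as NA.
y <- scan(text = input, sep = ",", quiet = TRUE)
y

# Without sep, fields are white-space delimited and the newline is
# just more white space, so no extra field appears.
z <- scan(text = c("1 2", "3 4"), quiet = TRUE)
z
```

This matches the two extra NA items in the quoted output for "data2".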

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
