[Rd] Problem in scan() (PR#4128)

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Sep 11 22:41:35 MEST 2003


Quotes are only interpreted in character columns (scan.c line 240), and
NULL is not character.  So this was intentional.

If you would like this changed, please supply a patch (which looks to be  
a good exercise).
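
In the meantime, one possible workaround is to declare the unwanted column as character, so that its quotes are interpreted, and drop it after reading. The following is only a minimal sketch: it recreates the test file from the report below in a temporary file (the variable name `path` is just for illustration), and it assumes the memory cost of reading the extra column is acceptable.

```r
# Recreate the three-line test file from the report in a temporary file.
path <- tempfile()
writeLines(c(
  '"col.A","col.B","col.C","col.D"',
  '1,"quoted string","again, again again",123',
  '2,"nice quotes, isnt it","you got it",456'
), path)

# Read column b as character ("") rather than NULL, so that the quoted
# commas inside it are not treated as separators.
tst <- scan(path, what = list(a = 0, b = "", c = "", d = 0),
            sep = ",", skip = 1, quiet = TRUE)

# Discard the unwanted column once the records have been read.
tst$b <- NULL
```

The cost is that the skipped column is held in memory until it is dropped, which may matter for files of the size mentioned in the report.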

On Thu, 11 Sep 2003 Paul.Bayer at gleichsam.de wrote:

> Full_Name: Paul Bayer
> Version: 1.7.1
> OS: Windows + Linux
> Submission from: (NULL) (217.235.105.54)
> 
> 
> I tried to read some large CSV files (30 - 100 MB) into R with scan(),
> skipping the columns I do not need by giving NULL elements in "what".
> 
> When these skipped elements are quoted strings with commas inside,
> R interprets each such quoted comma as an element separator,
> which leads to wrong records in the rest of the line.
> 
> A little test will show what I mean. I have the following "test.csv":
> 
> "col.A","col.B","col.C","col.D"
> 1,"quoted string","again, again again",123
> 2,"nice quotes, isnt it","you got it",456
> 
> First I read all elements:
> 
> > tst <- scan("test.csv", what=list(a=0,b="",c="",d=0), sep=",", skip=1)
> Read 2 records
> > tst
> $a
> [1] 1 2
> 
> $b
> [1] "quoted string"        "nice quotes, isnt it"
> 
> $c
> [1] "again, again again" "you got it"
> 
> $d
> [1] 123 456
> 
> Everything is fine. Then I try to skip the 2nd column by giving b=NULL:
> 
> > tst <- scan("test.csv", what=list(a=0,b=NULL,c="",d=0), sep=",", skip=1)
> Read 2 records
> Warning message:
> number of items read is not a multiple of the number of columns
> > tst
> $a
> [1] 1 2
> 
> $b
> NULL
> 
> $c
> [1] "again, again again"            " isnt it,you got it,456\n\n\n"
> 
> $d
> [1] 123  NA
> 
> >
> 
> I got garbage.
> 
> 
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


