[R] colClasses does not cause read.table to coerce to numeric; anymore?

Uwe Ligges ligges at statistik.tu-dortmund.de
Sat Dec 14 16:53:01 CET 2013


David,

how should R interpret "110+"? It cannot be numeric; perhaps you did
not notice the "+" there?
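
One possible workaround, offered only as a sketch and not as anything the help page promises: read all columns as character so that scan() never has to parse "110+" as a real, then convert explicitly. The object names raw, as_na and stripped are made up for illustration, and the sample is a three-line excerpt of the data quoted below.

```r
# Three lines from the quoted data; one age value carries a trailing "+".
BE <- c("   1841      109            0.00         0.00         0.00 ",
        "   1841      110+           0.00         0.00         0.00 ",
        "   1842        0        60290.60     62238.19    122528.79 ")

# Read every column as character: no numeric parsing happens in scan().
raw <- read.table(text = BE, header = FALSE, colClasses = "character")

# Option 1: coerce explicitly; as.numeric() turns "110+" into NA
# (the coercion warning is suppressed here on purpose).
as_na <- raw
as_na[] <- lapply(raw, function(x) suppressWarnings(as.numeric(x)))

# Option 2: strip a trailing "+" first, so the open-ended age stays as 110.
stripped <- raw
stripped[] <- lapply(raw, function(x) as.numeric(sub("\\+$", "", x)))
```

Whether NA or 110 is the right value for the open-ended age group depends on the analysis; the explicit conversion just makes that choice visible.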

Uwe




On 14.12.2013 01:35, David Winsemius wrote:
>
> I thought that setting colClasses to numeric would coerce errant data to NA. Instead read.table is throwing
> errors. This is not what I remember from prior experience with read.table and it is not how I read the help page as promising:
>
> BE<-
> c("   1841       96           42.26        31.50        73.75 ",
> "   1841       97           29.56        20.78        50.34 ",
> "   1841       98           18.71        10.59        29.30 ",
> "   1841       99           10.48         6.23        16.71 ",
> "   1841      100            6.14         4.23        10.37 ",
> "   1841      101            3.31         2.06         5.38 ",
> "   1841      102            1.50         0.83         2.34 ",
> "   1841      103            0.33         0.05         0.38 ",
> "   1841      104            0.00         0.00         0.00 ",
> "   1841      105            0.00         0.00         0.00 ",
> "   1841      106            0.00         0.00         0.00 ",
> "   1841      107            0.00         0.00         0.00 ",
> "   1841      108            0.00         0.00         0.00 ",
> "   1841      109            0.00         0.00         0.00 ",
> "   1841      110+           0.00         0.00         0.00 ",
> "   1842        0        60290.60     62238.19    122528.79 ",
> "   1842        1        54893.31     55849.06    110742.37 ",
> "   1842        2        51991.87     53033.62    105025.49 ",
> "   1842        3        49697.90     50789.01    100486.91 ",
> "   1842        4        47598.24     48414.78     96013.02 ",
> "   1842        5        46202.38     47106.34     93308.72"
> )
> #-----------
>   BELe<-read.table(text=BE,
>                    header=FALSE, colClasses="numeric", as.is=TRUE)
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>    scan() expected 'a real', got '110+'
>
> I originally got this when reading from a file, but the error is from scan(). Was this an unfortunate side-effect of adding the `text` argument to read.table? It still persists when the character vector is passed through textConnection to the file argument:
>
> BELe<-read.table(file=textConnection(BE),
>                   header=FALSE, colClasses="numeric")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>    scan() expected 'a real', got '110+'
>
> My memory was that such coercion was effective in past years.
>


