[R] disabling NA token as na.string in read.table
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Thu Dec 19 23:17:03 CET 2002
Vadim Ogranovich <vograno at arbitrade.com> writes:
> Dear R-Users,
>
> I have a csv file that has NA tokens, and these tokens are perfectly good
> values that should not be converted to NA by read.table(). I tried to
> prevent the conversion by specifying the na.strings arg., but this seems
> only to add to the list of NA strings, not replace it.
>
> > system("cat foo")
> 1 foo
> 2 NA
> > read.table("foo", na.strings="foo")
> V1 V2
> 1 1 NA
> 2 2 NA
>
>
> This is R1.6.0 on Linux.
>
> What did I do wrong?
Hmm, this looks like a bit of a bug. read.table() ends up calling
type.convert() with its default na.strings="NA" rather than the
na.strings you supplied. Now, if "NA" had been in the na.strings for
read.table(), scan() would already have turned it into <NA> at that
point, so I suspect what you want is for type.convert() to be called
with na.strings=character(0), but that has the side effect of turning
the real NA into a factor level:
> x <- c(NA,"NA","foo")
> type.convert(x)
[1] <NA> <NA> foo
Levels: foo
> type.convert(x,na.strings=character(0))
[1] <NA> NA foo
Levels: NA foo NA
> dput(type.convert(x,na.strings=character(0)))
structure(c(3, 1, 2), .Label = c("NA", "foo", NA), class = "factor")
I.e. it looks like the internals of type.convert() need some fixing up.
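
Until that is fixed, one possible workaround is to keep type.convert()
out of the picture altogether. The following is only a sketch, and it
assumes a current R version in which read.table() has the colClasses
argument and does not run type.convert() on columns declared as
"character". The temporary file stands in for the file "foo" from the
question, and the by-hand conversion of V1 is just for illustration.

```r
# Recreate the two-line example file from the question in a temporary file.
path <- tempfile()
writeLines(c("1 foo", "2 NA"), path)

# Declaring every column as "character" keeps read.table() from passing
# the columns through type.convert(), and na.strings = character(0)
# tells scan() that no token at all stands for a missing value.
d <- read.table(path, colClasses = "character", na.strings = character(0))

# Convert the columns that really are numeric by hand.
d$V1 <- as.integer(d$V1)

print(d)
print(is.na(d$V2))  # shows whether any entry of V2 became a real NA

unlink(path)
```

The price is that you must convert every non-character column yourself,
so this only pays off when a literal "NA" really is a legitimate value
in the data.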
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907