[Rd] A couple of issues with colClasses/setAs
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Wed Sep 8 00:34:23 CEST 2004
Consider this:
$ cat test.dat
1 a
2 b
Now we want to read the second column as a factor and ignore the first
(since it's just a sequential ID). We can't simply put "factor" among
the colClasses (that would have been nice), so let's try this instead:
> setAs("character","factor",as.factor)
Arguments in definition changed from (x) to (from)
> read.table("test.dat",colClasses=c("numeric","factor"))
Error in inherits(x, "factor") : Object "x" not found
which is a bit peculiar: why does setAs rename the argument from x to
from when the body of the function still refers to x, so that the
result cannot work? You do need to spell it out:
> setAs("character","factor",function(from)as.factor(from))
And now we get somewhere:
> read.table("test.dat",colClasses=c("numeric","factor"))
V1 V2
1 1 a
2 2 b
but suppose we want to get rid of the first column:
> read.table("test.dat",colClasses=c("NULL","factor"))
Error in data[[i]] : subscript out of bounds
which looks like a pretty clear bug. In contrast, this works fine:
> read.table("test.dat",colClasses=c("NULL","character"))
V2
1 a
2 b
so the issue only arises when a nontrivial coercion is combined with a
skipped column. Presumably, the code that applies the coercions
miscalculates the column indices by forgetting the columns that were
skipped.
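In the meantime, one way around it (a minimal sketch, not a fix for the
underlying problem) is to read the column as character, which works
with a skipped column as shown above, and to convert it to a factor
afterwards. The temporary file here merely stands in for test.dat:

```r
# Recreate the two-line test file from this message in a temporary location.
path <- tempfile(fileext = ".dat")
writeLines(c("1 a", "2 b"), path)

# Skip column 1 ("NULL") and read column 2 as plain character.
d <- read.table(path, colClasses = c("NULL", "character"))

# Convert to a factor after reading, rather than via a setAs() coercion.
d$V2 <- factor(d$V2)
str(d)
```

This avoids the coercion step inside read.table altogether, at the cost
of one extra line per factor column.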
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907