[Rd] A couple of issues with colClasses/setAs
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Sep 8 11:31:53 CEST 2004
From ?read.table (this is about read.table, despite the subject line, I
believe?)
colClasses: character. A vector of classes to be assumed for the columns.
"NULL" is not a class in my book (and certainly not one a column can
have). So it is no wonder that it does not work, and failing in an
undocumented case is not a bug.
We can look into making it work, but once you start skipping columns I
think you should be using scan(). (I also suspect scan did not accept
NULL when this was implemented.)
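As a rough illustration of the scan() route, here is a minimal sketch. It assumes an R version in which scan() accepts NULL components in 'what' to mean "skip this field"; the temporary file simply recreates the two-line test.dat quoted below.

```r
# Recreate the two-line test.dat from the quoted message.
tmp <- tempfile(fileext = ".dat")
writeLines(c("1 a", "2 b"), tmp)

# A NULL component in 'what' asks scan() to skip that field;
# "" asks for the second field to be read as character.
fields <- scan(tmp, what = list(NULL, ""), quiet = TRUE)

# Only the second field is kept; convert it to a factor by hand.
v2 <- factor(fields[[2]])
print(v2)
```

The coercion to factor is done explicitly after reading, so no colClasses handling is involved at all.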
On 8 Sep 2004, Peter Dalgaard wrote:
> Consider this:
>
> $ cat test.dat
> 1 a
> 2 b
>
> Now, we want to read the 2nd column as a factor and ignore the first
> (since it's just a sequential ID).
Well, you have to have row names, so that's not actually an advantage.
> We can't just put "factor" among
> the colClasses (would have been nice), so let's try this instead
>
> > setAs("character","factor",as.factor)
> Arguments in definition changed from (x) to (from)
> > read.table("test.dat",colClasses=c("numeric","factor"))
> Error in inherits(x, "factor") : Object "x" not found
>
> which is a bit peculiar: Why does it change the argument when that's
> going to create a function that doesn't work?? You do need to spell it
> out:
>
> > setAs("character","factor",function(from)as.factor(from))
> And now we get somewhere
>
> > read.table("test.dat",colClasses=c("numeric","factor"))
> V1 V2
> 1 1 a
> 2 2 b
Might be a good idea to teach colClasses about "factor".
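The argument-name problem in the quoted setAs() call can be sketched as follows. setAs() wants a one-argument function whose formal is named 'from', while as.factor() names its formal 'x', hence the explicit wrapper. The class name "idFactor" is made up for this illustration (it is not part of R), so that the sketch does not touch any built-in coercion.

```r
library(methods)

tmp <- tempfile(fileext = ".dat")
writeLines(c("1 a", "2 b"), tmp)

# as.factor() names its argument 'x', not 'from'.
print(names(formals(as.factor)))

# "idFactor" is a hypothetical class, used only as a colClasses label.
setClass("idFactor")

# Spell out a wrapper whose argument is called 'from'.
setAs("character", "idFactor", function(from) as.factor(from))

# read.table() reads the column as character and then coerces it
# through the method registered above.
d <- read.table(tmp, colClasses = c("numeric", "idFactor"))
str(d)
```

Registering the coercion under a private class name keeps the experiment from interfering with other code that relies on as(x, "factor").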
>
> but suppose we want to get rid of col.1:
>
> > read.table("test.dat",colClasses=c("NULL","factor"))
> Error in data[[i]] : subscript out of bounds
>
> which looks like a pretty clear bug. In contrast, this works fine
>
> > read.table("test.dat",colClasses=c("NULL","character"))
> V2
> 1 a
> 2 b
>
> so the issue only arises when you have nontrivial coercions.
>
> Presumably, the issue is that the colClasses in those cases
> miscalculate indices by forgetting the columns that were skipped.
>
>
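Since the quoted message reports that "NULL" together with "character" works, one possible way around the failing case is to skip the column, read the other one as character, and coerce afterwards. This is only a sketch of that workaround, not a fix for read.table itself.

```r
tmp <- tempfile(fileext = ".dat")
writeLines(c("1 a", "2 b"), tmp)

# Skip column 1 and read column 2 as plain character, the combination
# reported to work in the quoted message.
d <- read.table(tmp, colClasses = c("NULL", "character"))

# Apply the nontrivial coercion after reading, outside read.table().
d$V2 <- factor(d$V2)
str(d)
```
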
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595