[Rd] A couple of issues with colClasses/setAs

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Sep 8 11:31:53 CEST 2004

From ?read.table (this is about read.table, despite the subject line, I 

colClasses: character.  A vector of classes to be assumed for the columns.

"NULL" is not a class in my book (and certainly not one a column can
have).  So no wonder it does not work, and it is not a bug not to work in
undocumented cases.

We can look into making it work, but once you start skipping columns I 
think you should be using scan().  (I also suspect scan did not accept 
NULL when this was implemented.)
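
For the two-line test.dat quoted below, the scan() route might look like
the following sketch.  It assumes a version of scan() that skips fields
whose entry in 'what' is NULL; the temporary file and the names tf, cols
and grp are illustrative only.

```r
# Recreate the two-line example file in a temporary location (illustrative).
tf <- tempfile(fileext = ".dat")
writeLines(c("1 a", "2 b"), tf)

# A NULL entry in 'what' asks scan() to skip that field;
# "" asks for the second field to be read as character.
cols <- scan(tf, what = list(NULL, ""), quiet = TRUE)

# Convert the retained column to a factor by hand.
grp <- factor(cols[[2]])
unlink(tf)
```

This gives a bare factor rather than a data frame, which is usually the
point once columns are being thrown away.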

On 8 Sep 2004, Peter Dalgaard wrote:

> Consider this:
> $ cat test.dat
> 1 a
> 2 b
> Now, we want to read the 2nd column as a factor and ignore the first
> (since it's just a sequential ID). 

Well, you have to have row names, so that's not actually an advantage.

> We can't just put "factor" among
> the colClasses (would have been nice), so let's try this instead
> > setAs("character","factor",as.factor)
> Arguments in definition changed from (x) to (from)
> > read.table("test.dat",colClasses=c("numeric","factor"))
> Error in inherits(x, "factor") : Object "x" not found
> which is a bit peculiar: Why does it change the argument when that's
> going to create a function that doesn't work?? You do need to spell it
> out:
> > setAs("character","factor",function(from)as.factor(from))

> And now we get somewhere
> > read.table("test.dat",colClasses=c("numeric","factor"))
>   V1 V2
> 1  1  a
> 2  2  b

Might be a good idea to teach colClasses about "factor".
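
Until then, a sketch of the setAs() route that avoids redefining the
coercion to "factor" itself: register a separate label class and name
that in colClasses.  The class name "asFactorCol" is hypothetical, made
up for this illustration, and the temporary file stands in for test.dat.

```r
library(methods)

# Recreate the two-line example file in a temporary location (illustrative).
tf <- tempfile(fileext = ".dat")
writeLines(c("1 a", "2 b"), tf)

# Hypothetical virtual class, used only as a label in colClasses.
setClass("asFactorCol")

# The coercion function is written with an argument named 'from',
# which is what setAs() expects.
setAs("character", "asFactorCol", function(from) factor(from))

# Column 1 is read as numeric; column 2 is read as character and then
# passed through the coercion method defined above.
d <- read.table(tf, colClasses = c("numeric", "asFactorCol"))
unlink(tf)
```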

> but suppose we want to get rid of col.1:
> > read.table("test.dat",colClasses=c("NULL","factor"))
> Error in data[[i]] : subscript out of bounds
> which looks like a pretty clear bug. In contrast, this works fine
> > read.table("test.dat",colClasses=c("NULL","character"))
>   V2
> 1  a
> 2  b
> so the issue only arises when you have nontrivial coercions.
> Presumably, the issue is that the colClasses in those cases
> miscalculate indices by forgetting the columns that were skipped.
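
A workaround that relies only on the combination reported above as
working: drop the column with "NULL", read the other as "character", and
convert afterwards.  This is a sketch; the temporary file stands in for
test.dat and the names tf and d are illustrative.

```r
# Recreate the two-line example file in a temporary location (illustrative).
tf <- tempfile(fileext = ".dat")
writeLines(c("1 a", "2 b"), tf)

# "NULL" drops the first column; "character" needs no coercion method.
d <- read.table(tf, colClasses = c("NULL", "character"))

# Convert after reading, so the coercion step inside read.table
# is never exercised on a table with skipped columns.
d$V2 <- factor(d$V2)
unlink(tf)
```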

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
