[R] read.table(..., header == FALSE, colClasses = <vector with names attribute>)

Martin Maechler maechler at stat.math.ethz.ch
Tue Oct 24 14:55:19 CEST 2017

>>>>> Benjamin Tyner <btyner at gmail.com>
>>>>>     on Tue, 24 Oct 2017 07:21:33 -0400 writes:

    > Jeff,
    > Thank you for your reply. The intent was to construct a minimum 
    > reproducible example. The same warning occurs when the 'file' argument 
    > points to a file on disk with a million lines. But you are correct, my 
    > example was slightly malformed and in fact gives an error under R 
    > version 3.2.2. Please allow me to try again; in older versions of R,

    >    > read.table(file = textConnection("a\t3.14"), header = FALSE,
    >          colClasses = c(x = "character", y = "numeric"), sep="\t")
    >      V1   V2
    >    1  a 3.14

    > (with no warning). As of version 3.3.0,

    >    > read.table(file = textConnection("a\t3.14"), header = FALSE,
    >          colClasses = c(x = "character", y = "numeric"), sep="\t")
    >      V1   V2
    >    1  a 3.14
    >    Warning message:
    >    In read.table(file = textConnection("a\t3.14"), header = FALSE,  :
    >      not all columns named in 'colClasses' exist

    > My intent was not to complain but rather to learn more about best 
    > practices regarding the names attribute.

which is a nice attitude, thank you.

An even shorter MRE (as header=FALSE is default, and the default
sep="" works, too):

> tt <- read.table(textConnection("a 3.14"), colClasses = c(x="character", y="numeric"))
Warning message:
In read.table(file = textConnection("a 3.14"), colClasses = c(x = "character",  :
  not all columns named in 'colClasses' exist

If you read in the help page (you did read that before posting, didn't you?)
how 'colClasses' should be specified,

    colClasses: character.  A vector of classes to be assumed for the
              columns.  If unnamed, recycled as necessary.  If named, names
              are matched with unspecified values being taken to be ‘NA’.

              Possible values are ..................

you will see that the 'x' and 'y' names you used are matched against the
column names, which in your case are "V1" and "V2" (the defaults when
there is no header and no 'col.names'). Nothing matches, and so you
provoke a warning.
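
To make the name matching concrete, here is a minimal sketch that captures
the warning rather than printing it, and then inspects the result. It uses
the 'text' argument of read.table() in place of textConnection(); the
variable names 'msgs' and 'tt' are only for illustration. The input asks
for "character" in both columns so that one can see whether the unmatched
entries had any effect; assuming R >= 3.3.0, they appear to be dropped and
the columns are type-converted as if colClasses were NA.

```r
# Collect warning messages instead of letting them print.
msgs <- character()

tt <- withCallingHandlers(
    read.table(text = "a 3.14",
               colClasses = c(x = "character", y = "character")),
    warning = function(w) {
        msgs <<- c(msgs, conditionMessage(w))
        invokeRestart("muffleWarning")
    })

names(tt)          # default column names when there is no header
msgs               # the warning(s) raised by the unmatched names
sapply(tt, class)  # classes actually used for each column
```

If the second column comes back as numeric rather than character, that
confirms the unmatched 'x' and 'y' entries were not applied.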

Once you have read (and understood) the above part of the help
page, it becomes easy, no?

> tt <- read.table(textConnection("a 3.14"), colClasses = c("character","numeric"))
> t2 <- read.table(textConnection("a 3.14"), colClasses=c(x="character",y="numeric"), col.names=c("x","y"))
> t2
  x    y
1 a 3.14

i.e., no warning in either of these two cases.
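
A further variant, sketched here under the assumption that the default
names "V1", "V2" are acceptable: name the colClasses entries after those
defaults. This also illustrates the "unspecified values being taken to be
NA" part of the help text, since only the named columns need to be listed.
The input "007 3.14" and the names 't3', 't4' are made up for illustration.

```r
# Names match the default column names, so the classes are applied.
t3 <- read.table(text = "a 3.14",
                 colClasses = c(V1 = "character", V2 = "numeric"))

# Partial specification: only V1 is named; V2 is left as NA,
# i.e. its type is determined automatically.
t4 <- read.table(text = "007 3.14", colClasses = c(V1 = "character"))

t4$V1              # kept as the string "007", leading zeros intact
sapply(t4, class)
```

Naming only the columns that need a specific class can be handy for wide
files, where spelling out every class would be tedious.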

So please, please, PLEASE: at least non-beginners like you *should*
make the effort to read the help pages (and report if these seem
incomplete or otherwise improvable)...

Martin Maechler
ETH Zurich
