[Rd] Behaviour of read.table with empty columns

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed May 9 18:04:58 CEST 2007


On Wed, 9 May 2007, John Fox wrote:

> Dear r-devel list members,
>
> I stumbled across the following behaviour of read.table() recently: Suppose
> that I have the data
>
> a  " " ""
> "" ""  ""
>
> in a file or copied to the clipboard, and issue the command
>
>> DF <- read.table("clipboard")
>> DF
>  V1 V2 V3
> 1  a NA NA
> 2    NA NA
>
>> is.na(DF)
>        V1   V2   V3
> [1,] FALSE TRUE TRUE
> [2,] FALSE TRUE TRUE
>
> I was surprised by the NAs. Note that they occur only when a column consists
> entirely of empty strings or strings composed of blanks.
>
> On the other hand
>
>> data.frame(A=c("", "", ""))
>  A
> 1
> 2
> 3
>
> works as I would have expected.

How did you expect R to know that "" meant a character column?  You are 
allowed to quote any type of column, so as far as read.table is concerned 
the column is entirely empty and so its type is unknown.  It defaults to 
the simplest possible type, logical.

The answer, I think, is to use colClasses="character".
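
As a minimal sketch of that suggestion (the text= argument is used here only 
so the example is self-contained, which is an assumption; the original 
message read from the clipboard), something along these lines should keep 
the blank columns as character:

```r
# Same two-row data as in the quoted message, supplied inline via `text=`
# rather than the clipboard (an assumption made for reproducibility).
txt <- 'a  " " ""\n"" ""  ""'

# Default call: the all-blank columns V2 and V3 come back as logical NA.
DF_default <- read.table(text = txt)

# Forcing every column to character keeps the blank strings as strings.
DF_char <- read.table(text = txt, colClasses = "character")

sapply(DF_default, class)
sapply(DF_char, class)
is.na(DF_char)
```

colClasses can also be given per column (one class name for each column) if 
only some columns need to be forced to character.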

It is probably slightly more accurate to say that if colClasses is not 
given, all columns are read as character columns, and then converted to 
the simplest possible type.  In earlier versions of R you could get NULL 
columns (if there were no rows at all), but now the simplest is logical.
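
The conversion step can be tried directly with type.convert(), which is 
what read.table uses for this.  The three vectors below are hypothetical 
stand-ins for columns as they might look after the initial character read, 
not data from the original report:

```r
# Hypothetical character columns, as they might look before conversion.
blank <- c(" ", "")    # only blank or empty strings
ints  <- c("1", "2")   # strings that all parse as integers
mixed <- c("a", "")    # at least one genuine non-blank string

# as.is = TRUE keeps strings as character instead of making factors.
blank_conv <- type.convert(blank, as.is = TRUE)
ints_conv  <- type.convert(ints,  as.is = TRUE)
mixed_conv <- type.convert(mixed, as.is = TRUE)

class(blank_conv)  # logical: nothing in the column rules out the simplest type
class(ints_conv)   # integer
class(mixed_conv)  # character
```

Only the all-blank vector ends up as logical NA; a single non-blank string 
is enough to keep a column as character.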

Brian

> A work-around for me was
>
>> DF[is.na(DF)] <- ""
>> DF
>  V1 V2 V3
> 1  a
> 2
>
> But, as I said, I found the behaviour of read.table() puzzling.
>
> All this is with R 2.5.0 on a Windows XP Pro SP 2 system.
>
> Comments?
>
> Thanks,
> John
>
> --------------------------------
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 905-525-9140x23604
> http://socserv.mcmaster.ca/jfox
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


