[R] rbind data.frames with character vectors?

Prof. Brian Ripley ripley at stats.ox.ac.uk
Mon Mar 31 19:23:58 CEST 2003


That is not how you are intended to put character strings into data 
frames in S.  Rather, wrap them in I():

A <- data.frame(a=1, b=I("A"))
B <- data.frame(a=2, b=I("B"))
AB <- rbind(A,B)

and so on, which works (at least in R-devel).
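
As a rough sketch of how one might check this against the case in the 
original post, the lines below add a third row and inspect the result.  
The names C3 and ABC are illustrative only, and the behaviour is assumed 
for a reasonably current R rather than verified for any particular 
release.

```r
# Build each piece with I() so the string column is protected from
# conversion to a factor.
A  <- data.frame(a = 1, b = I("A"))
B  <- data.frame(a = 2, b = I("B"))
C3 <- data.frame(a = 3, b = I("C"))  # hypothetical third row, mirroring "C." in the post

AB  <- rbind(A, B)
ABC <- rbind(AB, C3)

# Inspect the column classes and confirm that no NAs were introduced.
print(sapply(ABC, data.class))
print(any(is.na(ABC$b)))
print(as.character(ABC$b))
```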

Using $ on a data frame is underhand, and avoids some of the consistency 
checks.
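
A small comparison may make the distinction concrete.  This is only a 
sketch: the names viaDollar and viaI are made up for illustration, and 
what data.frame() does with an unwrapped string differs between R 
versions, so only the two cases below are shown.

```r
# Column added with `$<-`: the character vector is stored as given,
# without passing through the column handling in data.frame().
viaDollar <- data.frame(a = 1)
viaDollar$b <- "A"

# Column supplied to data.frame() wrapped in I(): it is marked "AsIs"
# and left unconverted.
viaI <- data.frame(a = 1, b = I("A"))

print(class(viaDollar$b))
print(class(viaI$b))
```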

We are planning to use I("foo") to put the column in as a character 
column, and to do so consistently, but for 1.8.0, not 1.7.x.


On Mon, 31 Mar 2003, Spencer Graves wrote:

> "rbind(A, B)" converts character columns of A and B to factors.  This 
> means that "A <- rbind(A, B)" generates NAs unless the character strings 
> in B are already levels of the corresponding columns of A.
> 
> I've got a work-around, but I'm not happy with it.  What do you suggest?
> 
> Example:
> 
>  > A <- data.frame(a=1)
>  > A$b <- "A"
>  > B <- data.frame(a=2)
>  > B$b <- "B"
>  > sapply(A, data.class)
>            a           b
>    "numeric" "character"
>  > AB <- rbind(A,B)
>  > sapply(AB, data.class)
>          a         b
> "numeric"  "factor"
>  > C. <- data.frame(a=3)
>  > C.$b <- "C"
>  > rbind(AB, C.)
>      a    b
> 1   1    A
> 11  2    B
> 111 3 <NA>
> Warning message:
> invalid factor level, NAs generated in: "[<-.factor"(*tmp*, ri, value = 
> "C")
>  > sapply(rbind(AB, C.), data.class)
>          a         b
> "numeric"  "factor"
> Warning message:
> invalid factor level, NAs generated in: "[<-.factor"(*tmp*, ri, value = 
> "C")
> 
> Thanks,
> Spencer Graves
> p.s.  This example produces the desired result in S-Plus 2000 and 6.1 
> Professional for Windows 2000.

I am not at all sure that is intentional, though.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


