[R] rbind data.frames with character vectors?

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Apr 2 09:12:50 CEST 2003


I have put some bug fixes in R-devel (to be 1.7.0) which make this work
(and sorted out quite a few other anomalies, not all of which work
correctly in current S-PLUS).

My advice remains to use I(), as I think there are other functions (e.g.  
merge, possibly) which may not expect `raw' character columns in data
frames.
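
A minimal sketch of the I() approach, extended to a third row as in the quoted example below. It assumes a version of R in which rbind() keeps I() columns as character data (R-devel/1.7.0 or later, per the above); the object name ABC is illustrative and not from the original thread.

```r
# Sketch only: build each data frame with the character column protected by I()
A  <- data.frame(a = 1, b = I("A"))
B  <- data.frame(a = 2, b = I("B"))
C. <- data.frame(a = 3, b = I("C"))

# Combine in two steps, as in the quoted example
AB  <- rbind(A, B)
ABC <- rbind(AB, C.)   # ABC is a hypothetical name for the combined result

# Checks: the b column should not be a factor and should contain no NAs
is.factor(ABC$b)
as.character(ABC$b)
anyNA(ABC$b)
```

The point of the checks is that a string absent from the earlier rows ("C" here) should survive the second rbind() rather than turn into NA.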

On Mon, 31 Mar 2003, Prof. Brian Ripley wrote:

> That is not how you are intended to put character strings in data frames 
> in S.  Rather, use I():
> 
> A <- data.frame(a=1, b=I("A"))
> B <- data.frame(a=2, b=I("B"))
> AB <- rbind(A,B)
> 
> etc. works (at least in R-devel).
> 
> Using $ on a data frame is underhand, and avoids some of the consistency 
> checks.
> 
> We are planning to use I("foo") to put the column in as a character 
> column and use it consistently, but for 1.8.0 not 1.7.x
> 
> 
> On Mon, 31 Mar 2003, Spencer Graves wrote:
> 
> > "rbind(A, B)" converts character columns of A and B to factors.  This 
> > means that "A <- rbind(A, B)" generates NAs unless the character strings 
> > in B are already levels of the corresponding columns of A.
> > 
> > I've got a work-around, but I'm not happy with it.  What do you suggest?
> > 
> > Example:
> > 
> >  > A <- data.frame(a=1)
> >  > A$b <- "A"
> >  > B <- data.frame(a=2)
> >  > B$b <- "B"
> >  > sapply(A, data.class)
> >            a           b
> >    "numeric" "character"
> >  > AB <- rbind(A,B)
> >  > sapply(AB, data.class)
> >          a         b
> > "numeric"  "factor"
> >  > C. <- data.frame(a=3)
> >  > C.$b <- "C"
> >  > rbind(AB, C.)
> >      a    b
> > 1   1    A
> > 11  2    B
> > 111 3 <NA>
> > Warning message:
> > invalid factor level, NAs generated in: "[<-.factor"(*tmp*, ri, value = 
> > "C")
> >  > sapply(rbind(AB, C.), data.class)
> >          a         b
> > "numeric"  "factor"
> > Warning message:
> > invalid factor level, NAs generated in: "[<-.factor"(*tmp*, ri, value = 
> > "C")
> > 
> > Thanks,
> > Spencer Graves
> > p.s.  This example produces the desired result in S-Plus 2000 and 6.1 
> > Professional for Windows 2000.
> 
> I am not at all sure that is intentional, though.
> 
> 
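
For reference, the NAs in the quoted output arise because a string that is not among a factor's levels cannot be assigned into that factor. The sketch below illustrates that mechanism in isolation, together with one possible way round it (going through character before combining). The names f, g and h are illustrative, and this is not the work-around mentioned in the original question, which was not posted.

```r
# Sketch only: a factor with levels "A" and "B"
f <- factor(c("A", "B"))

# Assigning a value outside levels(f) yields NA, with an
# "invalid factor level" warning (suppressed here)
g <- f
suppressWarnings(g[3] <- "C")
is.na(g[3])

# One way round it: convert to character, append, then re-create the factor
h <- factor(c(as.character(f), "C"))
levels(h)
```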

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


