[Rd] unexpected error stemming from print.data.frame

Fellows, Ian ifellows at ucsd.edu
Tue Oct 13 06:40:46 CEST 2009

Hello All,

At the suggestion of commenters on a discussion at Stack Overflow ( http://stackoverflow.com/questions/1535021/whats-the-biggest-r-gotcha-youve-run-across/1535433#1535433 ), I'm forwarding the following behavior report to this list.

R Session:
> a<-data.frame(c(1,2,3,4),c(4,3,2,1))
> a<-a[-3,]
> a
  c.1..2..3..4. c.4..3..2..1.
1             1             4
2             2             3
4             4             1
> a[4,1]<-1
> a
Error in data.frame(c.1..2..3..4. = c("1", "2", "4", "1"), c.4..3..2..1. = c(" 4",  : 
  duplicate row.names: 4

What's going on:
    1. A four row data.frame is created, so the rownames are c(1,2,3,4) 
    2. The third row is deleted, so the rownames are c(1,2,4) 
    3. A fourth row is added, and R automatically sets its row name to the index, i.e. 4, so the row names are c(1,2,4,4).
    4. print.data.frame throws an error because it requires unique row names
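
A possible workaround, as a minimal sketch: reset the row names to the automatic 1..n sequence before adding the row by index, so the new row's name cannot collide with an existing one. The column names x and y are assumptions added for readability and are not part of the session above.

```r
# Same data as in the session above, with explicit (hypothetical) column names.
a <- data.frame(x = c(1, 2, 3, 4), y = c(4, 3, 2, 1))
a <- a[-3, ]

# After dropping the third row, the remaining rows keep their old names.
rownames(a)

# Reset to automatic row names so the names match the row positions again.
rownames(a) <- NULL

# Adding a row by index now yields the name "4", which is not yet in use.
a[4, 1] <- 1

# anyDuplicated() returns 0 when there are no duplicated row names.
anyDuplicated(rownames(a))
print(a)
```

Note that the second column of the new row is NA, since only the first column was assigned.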

It seems to me that either R should automatically generate unique row names, or print.data.frame should accept duplicates. Looking at section 2.3.2 of the manual, it is unclear whether row names are required to be unique, but the help page for data.frame states: "A data frame is a list of variables of the same number of rows with unique row names,..." This implies that a[4,1]<-1 creates an invalid data.frame object.

Ian Fellows
