[Rd] Data frames and row names

Henrik Bengtsson hb at stat.berkeley.edu
Tue Aug 15 03:16:04 CEST 2006


In R-devel v2.4.0 NEWS:

    o	The 'row.names' of a data frame may be stored internally as an
	integer or character vector.  This can result in considerably
	more compact storage (and more logical row names from rbind)
	when the row.names are 1:nrow(x).  However, such data frames
	are not compatible with earlier versions of R: this can be
	ensured by supplying a character vector as 'row.names'.

This is great.

If 'row.names' were simply NULL whenever the row names are 1:nrow(x),
the storage would be even more compact.  I noticed that
dim.data.frame() infers the number of rows from the row names:

> dim.data.frame
function (x)
c(length(attr(x, "row.names")), length(x))
<environment: namespace:base>

but couldn't the number of rows be inferred from the first column, if
there are no row names?  I realize that this would break the case with
zero-column data frames, e.g.

> df <- data.frame(a=1:10)
> df[,-1]
NULL data frame with 10 rows.

...but maybe there is a way around that too.
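
A minimal sketch of the idea, for illustration only.  The helper
nrow_sketch() is hypothetical and not part of base R: it uses the row
names when they exist, falls back to the first column otherwise, and
shows why a zero-column data frame without row names has no
recoverable row count.

```r
# Hypothetical helper (not in base R): infer the number of rows of a
# data-frame-like list, preferring row names but falling back to the
# first column when the "row.names" attribute is absent.
nrow_sketch <- function(x) {
  rn <- attr(x, "row.names")
  if (!is.null(rn)) return(length(rn))

  # Drop the class so that [[ does not dispatch to a data frame method.
  cols <- unclass(x)
  if (length(cols) > 0L) return(NROW(cols[[1L]]))

  # No row names and no columns: nothing left to infer from.
  0L
}

df <- data.frame(a = 1:10)

# Same columns, but with the row names stripped (assumed layout of a
# data frame that stores no row names at all).
stripped <- unclass(df)
attr(stripped, "row.names") <- NULL

n_with_rownames <- nrow_sketch(df)        # uses the row names
n_from_column   <- nrow_sketch(stripped)  # uses the first column
n_zero_columns  <- nrow_sketch(df[, -1])  # works only via row names
n_nothing       <- nrow_sketch(list())    # row count is lost
```

The last two calls show the problem case: df[, -1] still reports its
rows only because the row names are kept, so a NULL 'row.names' would
need some other place to record the row count.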

Cheers

/H


