[R] suggestion for make.names
Tom Minka
minka at stat.cmu.edu
Wed Jun 18 02:09:37 CEST 2003
I would like to suggest a modification to the make.names() function.
The current implementation has two problems:
1. It doesn't check if a name matches an R keyword (like "function").
2. The uniqueness algorithm is not invariant to concatenation.
In other words,
  make.names(c("a","a","a"),unique=T) !=
  make.names(c(make.names(c("a","a"),unique=T),"a"),unique=T)
The first problem means that you can construct a data frame for which
there is no valid formula:
  lm(if~then, data.frame("if"=3,"then"=4))
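One way to picture the missing check is a small post-processing step that appends a dot to any name that collides with a reserved word. The sketch below is only an illustration: escape_reserved and reserved_words are hypothetical names, not part of base R, and the word list leaves out the typed NA constants.

```r
# Hypothetical list of reserved words (assumption: abbreviated, omits the
# typed NA constants such as NA_integer_).
reserved_words <- c("if", "else", "repeat", "while", "function", "for",
                    "in", "next", "break", "TRUE", "FALSE", "NULL",
                    "Inf", "NaN", "NA")

# Hypothetical helper: append "." to every element that is a reserved word,
# and leave all other elements untouched.
escape_reserved <- function(names) {
  hit <- names %in% reserved_words
  names[hit] <- paste(names[hit], ".", sep = "")
  names
}

escaped <- escape_reserved(c("if", "then", "x"))
print(escaped)  # "if." "then" "x"
```

Note that "then" is not a reserved word in R, so only "if" is changed.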
The second problem means that you get funny row names when you build
up a data frame by concatenation:
  rbind(data.frame(x=1),data.frame(x=2),data.frame(x=3)) !=
  rbind(rbind(data.frame(x=1),data.frame(x=2)),data.frame(x=3))
I'm providing a new implementation (and documentation) of make.names which
fixes these problems. The uniqueness part is handled by make.unique, a
useful function in its own right. For example, the following code in
rbind.data.frame:
  while(any(xj <- duplicated(rlabs)))
    rlabs[xj] <- paste(rlabs[xj], 1:sum(xj), sep = "")
could be replaced by:
  rlabs = make.unique(rlabs)
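To see why the loop quoted above is not invariant to concatenation, it can be wrapped in a function and applied both in one pass and in two steps. This is a rough check, not the proposed implementation: old_unique is a hypothetical name, and the comparison assumes an R installation in which make.unique is available in base.

```r
# Wrap the loop quoted from rbind.data.frame in a function
# (old_unique is a hypothetical name used only for this comparison).
old_unique <- function(rlabs) {
  while (any(xj <- duplicated(rlabs)))
    rlabs[xj] <- paste(rlabs[xj], 1:sum(xj), sep = "")
  rlabs
}

# One pass versus building the vector up by concatenation.
old_all  <- old_unique(c("a", "a", "a"))
old_step <- old_unique(c(old_unique(c("a", "a")), "a"))
print(old_all)   # "a" "a1" "a2"
print(old_step)  # "a" "a1" "a11"

# The same comparison with make.unique (assumed to be present in base R).
new_all  <- make.unique(c("a", "a", "a"))
new_step <- make.unique(c(make.unique(c("a", "a")), "a"))
print(identical(new_all, new_step))  # TRUE
```

In the two-step case the loop first produces "a1" for the second element, then collides with it when the third "a" is renamed, which is where the "a11" label comes from.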
Another way to fix the first problem, by the way, would be to allow
quoting in formulas, e.g. lm("if" ~ then, ...). Currently this gives
an error. Allowing it would eliminate the need for make.names in many
cases, since names like "New Jersey" or "log(x)" could be quoted and
used as they are.
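For comparison, current R releases accept backtick-quoted names in formulas. That is a different mechanism from the double-quote syntax suggested here, so the following is only a sketch of the general idea, using made-up toy data.

```r
# Toy data (invented for illustration) with a reserved word as a column name.
# check.names = FALSE stops data.frame from altering the name.
d <- data.frame(`if` = c(1, 2, 3, 4), then = c(2, 1, 4, 3),
                check.names = FALSE)

# Backticks let the reserved word appear as a variable in the formula.
fit <- lm(`if` ~ then, data = d)
print(names(d))
print(names(coef(fit)))
```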
Tom Minka
www.stat.cmu.edu/~minka/