[R] Factor names & levels

Gabor Grothendieck ggrothendieck at myway.com
Mon Dec 22 03:27:48 CET 2003



Based on Peter's response, I think I may have misinterpreted 
Damon's query.  The methods I displayed in my last post in 
this thread were intended to make names a synonym for levels.  If
it is desired that names act on factors in the same way that names 
act on vectors and lists, then the methods I provided would not 
be correct and, as Peter points out, the other factor methods 
would also have to be examined to ensure that they all 
work properly with names. 
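
To make the two readings concrete, here is a minimal sketch of the
difference between levels and per-element names on a factor.  The data
are made up for illustration, and the sketch says nothing about whether
names survive other factor operations, which is the open question here.

```r
# Illustrative data only: a factor with four elements and three levels
f <- factor(c("A", "B", "A", "C"))

levels(f)                  # "A" "B" "C": one entry per distinct value
names(f)                   # NULL: no per-element names yet

# Per-element names, as on an ordinary vector: one entry per element
names(f) <- letters[1:4]
names(f)                   # "a" "b" "c" "d"

# The two have different lengths, so one cannot simply stand in for the other
length(levels(f))          # 3
length(names(f))           # 4
```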

I do have one other idea in terms of a workaround.  You could
represent your factor as a one column data frame.  The data
frame could then have row names which could be interpreted as
names of the factor.

For example,

f <- data.frame(f = factor(c("A","B","A","C")))
row.names(f) <- letters[1:4]

You can now refer to the factor as f$f and to the names as row.names(f).
In a session this looks like:

> f <- data.frame(f = factor(c("A","B","A","C")))
> row.names(f) <- letters[1:4]
> f
  f
a A
b B
c A
d C
> row.names(f)
[1] "a" "b" "c" "d"
> f$f
[1] A B A C
Levels: A B C

This uses only standard, supported R features, so it should not get
you into trouble, although your program has to interpret the row
names as the names of the factor.
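
As a rough sketch of what "interpreting it accordingly" might look like,
the lines below look elements up by name and subset by value while
keeping names and values together.  The object name g is only for
illustration.

```r
# One-column data frame standing in for a "named factor"
f <- data.frame(f = factor(c("A", "B", "A", "C")))
row.names(f) <- letters[1:4]

# Look up elements by name; selecting the single column returns the factor
picked <- f[c("b", "d"), "f"]
picked                       # B C (levels A B C are kept)

# Subset by value; drop = FALSE keeps the data frame, and so the row names
g <- f[f$f == "A", , drop = FALSE]
row.names(g)                 # "a" "c"
```

The drop = FALSE is needed in the second case because a one-column data
frame otherwise collapses to a plain factor and the row names are lost.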

--- 
Date: 22 Dec 2003 02:30:52 +0100 
From: Peter Dalgaard <p.dalgaard at biostat.ku.dk>
To: <ggrothendieck at myway.com> 
Cc: <djw1005 at cam.ac.uk>, <R-help at stat.math.ethz.ch> 
Subject: Re: [R] Factor names & levels 

 
 
"Gabor Grothendieck" <ggrothendieck at myway.com> writes:

> For it to be well defined, there would need to be a names
> method and a names<- method for the factor class or else 
> the default methods would have to be able to handle factors.

Not only that but other methods for factors need to know about the
names and be able to modify them accordingly, e.g.

> getS3method("levels<-","factor")
function (x, value)
{
xlevs <- levels(x)
if (is.list(value)) #something
...
else {
...
nlevs <- xlevs <- as.character(value)
}
factor(xlevs[x], levels = unique(nlevs))
}

Here, xlevs[x] will not have the same names as x (it gets names from
xlevs if anything) so you'd have to have extra code for setting the
names on the result. 
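
The indexing step can be tried in isolation.  This is only a sketch of
that one expression, with made-up replacement levels in xlevs; it does
not reproduce the full method, and whether levels<- itself keeps names
may depend on the version of R in use.

```r
# A factor with per-element names (factor() keeps them)
x <- factor(c(a = "A", b = "B", c = "A"))
names(x)                     # "a" "b" "c"

# Hypothetical replacement levels, unnamed
xlevs <- c("one", "two")

# Indexing by a factor uses its integer codes; names of the index are dropped
y <- xlevs[x]
names(y)                     # NULL

# The extra code referred to above: put the names back on the result
z <- factor(y, levels = unique(xlevs))
names(z) <- names(x)
names(z)                     # "a" "b" "c"
```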

(Rather interestingly, the factor() function does explicitly retain
names, so there are not quite as many places where they will be lost
as I would have expected.)

-- 
O__ ---- Peter Dalgaard Blegdamsvej 3 
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N 
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907



