[R] Opinion: Why I find factors convenient to use

PIKAL Petr petr.pikal at precheza.cz
Fri Aug 17 19:53:06 CEST 2012


I second Bert's opinion: factors can be confusing, but they have some quite nice features that cannot be easily mimicked by plain character vectors. I find the ability to manipulate their levels extremely useful.

> fac<-factor(sample(letters[1:5], 20, replace=TRUE))
> fac
 [1] e e d d e e c e a e a e b b d e c c d b
Levels: a b c d e
> levels(fac)[2:4]<- "new.level"
> fac
 [1] e         e         new.level new.level e         e         new.level
 [8] e         a         e         a         e         new.level new.level
[15] new.level e         new.level new.level new.level new.level
Levels: a new.level e
>
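
The same kind of merge can also be written by level name rather than by position, using the list form of the `levels<-` replacement. This is only a sketch: the small vector below is made-up, fixed data (not the random sample above), so the result is reproducible.

```r
# Hypothetical fixed data, chosen so that every level a..e occurs.
fac <- factor(c("e", "d", "c", "a", "b", "e"))

# List form of levels<-: each list name is a new level, and each
# element lists the old levels that are merged into it.
levels(fac) <- list(a = "a", new.level = c("b", "c", "d"), e = "e")

levels(fac)
table(fac)
```

Naming the old levels explicitly avoids depending on their alphabetical positions, which can shift when a level is missing from the data.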

Regards
Petr


________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of Bert Gunter [gunter.berton at gene.com]
Sent: 17 August 2012 19:32
To: r-help at r-project.org
Subject: [R] Opinion: Why I find factors convenient to use

Folks:

Over the years, many people -- including some whom I would consider
real expeRts -- have criticized factors and advocated the use
(sometimes exclusively) of character vectors instead. I would just
like to point out that, for me, factors provide one feature that I
find to be very convenient: ordering of levels. **

As an example, suppose one has a character vector of labels "small",
"medium", and "large". Then most R functions (e.g. tapply()) will
display results involving this vector in alphabetical order, which I
think most would view as undesirable. By converting to a factor with
levels in the logical order, displays will automatically be "logical."
For example:

> x <- sample(c("small","medium","large"),12,replace=TRUE)
> table(x)
x
 large medium  small
     2      3      7
> y <- factor(x,levels=c("small","medium","large")) ## ordered() would also do, but is not necessary for this
> table(y)
y
 small medium  large
     7      3      2
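
The same effect shows up with tapply(), mentioned above. A minimal sketch, assuming a made-up numeric vector `value` measured alongside the size labels (both vectors are hypothetical and not part of the example above):

```r
# Hypothetical data: size labels and an invented numeric measurement.
size  <- c("small", "large", "medium", "small", "large", "medium")
value <- c(1, 10, 5, 3, 30, 7)

# Grouping by the character vector: groups appear in sorted
# (alphabetical) order.
res_chr <- tapply(value, size, mean)
names(res_chr)

# Grouping by a factor with explicit levels: groups follow the
# level order given in factor().
size_f  <- factor(size, levels = c("small", "medium", "large"))
res_fac <- tapply(value, size_f, mean)
names(res_fac)
```

Only the order of the groups changes; the per-group means are the same in both results.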

Naturally, this is just my opinion, and I understand why lots of smart
people find factors irritating (at least!). So contrary opinions
cheerily welcomed. But perhaps these comments might be helpful to
those who have been "bitten" by factors or just wonder what all the
fuss is about.

** Another advantage is reduced storage space, I believe. Please
correct if wrong.
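
One rough way to check the storage claim is to compare object sizes directly. This is a sketch under stated assumptions: a 64-bit build of R, where a factor stores 4-byte integer codes and a character vector stores 8-byte pointers to cached strings. The vector length and seed are arbitrary, and the exact byte counts vary by platform, so no output is shown.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# A long vector with few distinct labels, the case where a factor
# should help most.
x_chr <- sample(c("small", "medium", "large"), 1e5, replace = TRUE)
x_fac <- factor(x_chr, levels = c("small", "medium", "large"))

# object.size() returns an estimate of the memory used, in bytes.
size_chr <- as.numeric(utils::object.size(x_chr))
size_fac <- as.numeric(utils::object.size(x_fac))

c(character = size_chr, factor = size_fac)
```

Under those assumptions the factor should come out smaller, but by roughly a factor of two rather than by the ratio of string length to integer size, because R stores each distinct string only once.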

Cheers,
Bert

--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
