[R] Opinion: Why I find factors convenient to use

Bert Gunter gunter.berton at gene.com
Fri Aug 17 19:32:40 CEST 2012


Folks:

Over the years, many people -- including some whom I would consider
real expeRts -- have criticized factors and advocated the use
(sometimes exclusively) of character vectors instead. I would just
like to point out that, for me, factors provide one feature that I
find to be very convenient: ordering of levels. **

As an example, suppose one has a character vector of labels "small",
"medium", and "large". Then most R functions (e.g. tapply()) will
display results involving this vector in alphabetical order, which I
think most would view as undesirable. Converting to a factor with
levels in the logical order makes displays automatically "logical."
For example:

> x <- sample(c("small","medium","large"),12,replace=TRUE)
> table(x)
x
 large medium  small
     2      3      7
> y <- factor(x,levels=c("small","medium","large")) ##ordered() also would do, but is not necessary for this
> table(y)
y
 small medium  large
     7      3      2
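The same ordering carries over to tapply(), mentioned above. Here is a
minimal sketch; the vectors size and value are made-up data used only
for illustration:

```r
# Hypothetical data: size labels and some invented measurements.
size  <- c("small", "large", "medium", "small", "large", "medium")
value <- c(1, 9, 5, 3, 11, 7)

# Grouping by the character vector: groups come out alphabetically.
by_chr <- tapply(value, size, mean)

# Grouping by a factor with explicit levels: groups follow level order.
size_f <- factor(size, levels = c("small", "medium", "large"))
by_fac <- tapply(value, size_f, mean)

print(by_chr)
print(by_fac)
```

The group means are identical in both results; only the order in which
the groups are displayed differs.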

Naturally, this is just my opinion, and I understand why lots of smart
people find factors irritating (at least!). So contrary opinions are
cheerfully welcomed. But perhaps these comments might be helpful to
those who have been "bitten" by factors or just wonder what all the
fuss is about.

** Another advantage is reduced storage space, I believe. Please
correct me if I am wrong.
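One rough way to check the storage point is object.size(). The sketch
below assumes a long vector with only three distinct labels (the names
s and f are illustrative). A factor holds integer codes plus a single
copy of the levels, so on a typical 64-bit build it tends to come out
smaller than the character vector, though the exact figures depend on
the platform and R version:

```r
# Hypothetical long vector with only three distinct labels.
set.seed(1)
s <- sample(c("small", "medium", "large"), 1e5, replace = TRUE)
f <- factor(s, levels = c("small", "medium", "large"))

# A factor is an integer vector of codes plus a "levels" attribute.
print(typeof(f))
print(levels(f))

# Estimated sizes; numbers vary by platform and R version.
print(object.size(s), units = "Kb")
print(object.size(f), units = "Kb")
```

With many distinct labels, or on a build where pointers and integers
are the same size, the saving may be small or absent.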

Cheers,
Bert

-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


