[R] Opinion: Why I find factors convenient to use
Jeff Newmiller
jdnewmil at dcn.davis.CA.us
Fri Aug 17 19:58:02 CEST 2012
I don't know if my recent post on this prompted your post, but I don't see much to argue with in your discussion. I find factors to be useful for managing display and some kinds of analysis.
However, I find them mostly a handicap when importing, merging, and quality-checking data. Therefore I delay conversion until late in the game, though in most cases I do eventually convert.
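A minimal sketch of that "convert late" workflow, in case it helps. The data frames, column names, and prices here are hypothetical, invented only for illustration. The idea is to keep labels as character strings through import, cleanup, and merging, and to call factor() only once the values are clean:

```r
# Hypothetical raw data: labels arrive as character strings, some of them messy.
raw <- data.frame(
  id   = 1:6,
  size = c("small", "Medium", "large", " small", "medium", "large"),
  stringsAsFactors = FALSE  # keep labels as character during import
)

# QC while still character: trim stray whitespace and normalise case.
raw$size <- tolower(trimws(raw$size))

# Merge on a character key (hypothetical lookup table).
lookup <- data.frame(
  size  = c("small", "medium", "large"),
  price = c(1.0, 2.5, 4.0),
  stringsAsFactors = FALSE
)
merged <- merge(raw, lookup, by = "size")

# Convert late, once the values are clean, with levels in the logical order.
merged$size <- factor(merged$size, levels = c("small", "medium", "large"))
table(merged$size)
```

Doing the cleanup on character data avoids having to manage levels at each step; the factor is created once, at the point where level order starts to matter for display.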
---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Bert Gunter <gunter.berton at gene.com> wrote:
>Folks:
>
>Over the years, many people -- including some who I would consider
>real expeRts -- have criticized factors and advocated the use
>(sometimes exclusively) of character vectors instead. I would just
>like to point out that, for me, factors provide one feature that I
>find to be very convenient: ordering of levels. **
>
>As an example, suppose one has a character vector of labels "small",
>"medium", and "large". Then most R functions (e.g. tapply()) will
>display results involving this vector in alphabetical order, which I
>think most would view as undesirable. By converting to a factor with
>levels in the logical order, displays will automatically be "logical."
>For example:
>
>> x <- sample(c("small","medium","large"), 12, replace=TRUE)
>> table(x)
>x
> large medium  small
>     2      3      7
>> ## ordered() would also do, but is not necessary for this
>> y <- factor(x, levels=c("small","medium","large"))
>> table(y)
>y
> small medium  large
>     7      3      2
>
>Naturally, this is just my opinion, and I understand why lots of smart
>people find factors irritating (at least!). So contrary opinions
>cheerily welcomed. But perhaps these comments might be helpful to
>those who have been "bitten" by factors or just wonder what all the
>fuss is about.
>
>** Another advantage is reduced storage space, I believe. Please
>correct if wrong.
>
>Cheers,
>Bert
>
>--
>
>Bert Gunter
>Genentech Nonclinical Biostatistics
>
>Internal Contact Info:
>Phone: 467-7374
>Website:
>http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>
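On the storage footnote in the quoted message, here is a rough way to check it for yourself. This is only a sketch: the vector length and seed are arbitrary choices of mine, and the sizes that object.size() reports depend on the platform and R build, so I have left the output out rather than guess at it.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
x <- sample(c("small", "medium", "large"), 1e5, replace = TRUE)
y <- factor(x, levels = c("small", "medium", "large"))

object.size(x)  # character vector
object.size(y)  # factor: integer codes plus a levels attribute

# A factor stores one integer code per element; the labels are stored once,
# in the levels attribute.
typeof(y)
levels(y)
```

Bear in mind that R also caches identical strings internally, so a character vector with a few repeated labels is less wasteful than it might look. I would expect any saving from the factor to come from the integer codes being smaller than the per-element string references, which is an assumption worth checking on your own machine.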