[R] Thought I understood factors but??
Liaw, Andy
andy_liaw at merck.com
Mon Mar 1 20:23:04 CET 2010
From: David Winsemius
>
> On Mar 1, 2010, at 12:07 PM, Nicholas Lewin-Koh wrote:
>
> > Hi,
> > consider the following
> >> a<-gl(3,3,9)
> >> a
> > [1] 1 1 1 2 2 2 3 3 3
> > Levels: 1 2 3
> >> levels(a)<-3:1
>
> That may look like the same factor re-ordered, but you instead merely
> re-labeled each level; the internal integer codes that represent the
> factor values stayed the same.
>
> >> a
> > [1] 3 3 3 2 2 2 1 1 1
Indeed this is one of the (few, I believe) traps of R, because:
R> a
[1] 3 3 3 2 2 2 1 1 1
Levels: 3 2 1
R> as.numeric(a)
[1] 1 1 1 2 2 2 3 3 3
R> as.numeric(as.character(a))
[1] 3 3 3 2 2 2 1 1 1
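A minimal sketch of why the two conversions above differ, using only base R: a factor stores integer codes that index into its levels, and assigning to levels() swaps the labels while leaving the codes alone. The variable names codes_before and codes_after are just for illustration.

```r
a <- gl(3, 3, 9)
codes_before <- as.integer(a)    # internal codes: 1 1 1 2 2 2 3 3 3

levels(a) <- 3:1                 # relabel only; the codes are not touched
codes_after <- as.integer(a)

# The codes are unchanged by the relabeling
same_codes <- identical(codes_before, codes_after)

# Printing a factor amounts to looking the labels up through the codes
shown <- levels(a)[codes_after]
```

Here same_codes is TRUE and shown matches what print(a) displays, which is why as.numeric(a) and as.numeric(as.character(a)) disagree.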
Andy
> > Levels: 3 2 1
> >> a<-gl(3,3,9)
> >> factor(a,levels=3:1)
>
> That is the right way IMO to safely change the ordering of the levels
> without changing the "semantics" or the "meaning" of the factor level
> assignments.
>
> Try:
>
> levels(a) <- letters[4:6]
> a
>
> [1] d d d e e e f f f
> Levels: d e f
> > a <- factor(a, levels=letters[1:3])
> > a
> [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
> Levels: a b c
>
> Using the second form sets any factor values that do not appear in the
> new level vector to NA, in this case all of them. It is better in my
> mind to get assignments to NA than to get assignments to incorrect
> levels.
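One possible way to catch the NA case before it happens is to compare the values actually present with the proposed level vector first. This is only a sketch; new_levels and missing_vals are hypothetical names, and whether to stop or merely warn is a matter of taste.

```r
a <- factor(c("d", "d", "e", "f"))
new_levels <- c("f", "e", "d")

# Values present in 'a' that the new level vector would turn into NA
missing_vals <- setdiff(unique(as.character(a)), new_levels)

if (length(missing_vals) > 0) {
  stop("values not in new levels: ", paste(missing_vals, collapse = ", "))
}

# Safe to reorder: every existing value keeps its label
a2 <- factor(a, levels = new_levels)
```

With the check in place, factor(a, levels = new_levels) changes only the ordering of the levels, and a typo in new_levels is reported instead of silently producing NA.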
>
> > b <-factor(c(0,0,0,0, 1, 1))
> > b
> [1] 0 0 0 0 1 1
> Levels: 0 1
> > levels(b) <-c(1,0)
> > b
> [1] 1 1 1 1 0 0 # No longer the same "meaning"
> Levels: 1 0
> > b <-factor(c(0,0,0,0, 1, 1))
> > b<- factor(b, levels=c(1,0))
> > b
> [1] 0 0 0 0 1 1
> Levels: 1 0 # Only the ordering has changed but the meaning is the same
>
>
> Avoiding silent re-labeling matters especially when working with
> factors as components of data.frames.
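To illustrate the data.frame point, here is a small sketch with made-up data (the column names g and y and the values are assumptions for the example, not from the thread). Relabeling the levels attaches each group summary to the wrong label, while rebuilding the factor with factor(..., levels=) only changes the order in which the groups are reported.

```r
df <- data.frame(g = factor(c(0, 0, 0, 0, 1, 1)),
                 y = c(1, 2, 3, 4, 10, 20))

relabelled <- df
levels(relabelled$g) <- c(1, 0)                        # swaps the labels

reordered <- df
reordered$g <- factor(reordered$g, levels = c(1, 0))   # swaps the order only

m_orig  <- tapply(df$y, df$g, mean)
m_relab <- tapply(relabelled$y, relabelled$g, mean)
m_reord <- tapply(reordered$y, reordered$g, mean)
```

In m_orig and m_reord the group labelled "0" has mean 2.5 and the group labelled "1" has mean 15; in m_relab those two means are attached to the opposite labels.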
>
>
> --
> David.
>
>
>
> > [1] 1 1 1 2 2 2 3 3 3
> > Levels: 3 2 1
> > It is probably something obvious I missed, but reading the
> > documentation
> > of factor, and levels I would have thought
> > that both should produce the same output as
> > factor(a,levels=3:1)
> > [1] 1 1 1 2 2 2 3 3 3
> > Levels: 3 2 1
> > The closest I could find in a quick search was this
> > http://tolstoy.newcastle.edu.au/R/e5/help/08/09/2503.html
> >
> > Thanks
> > Nicholas
> >
> > sessionInfo()
> > R version 2.10.1 Patched (2009-12-20 r50794)
> > x86_64-unknown-linux-gnu
> >
> > locale:
> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> > [9] LC_ADDRESS=C LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> > attached base packages:
> > [1] splines tcltk stats graphics grDevices utils
> > datasets
> > [8] methods base
> >
> > other attached packages:
> > [1] mvtnorm_0.9-9 latticeExtra_0.6-9 RColorBrewer_1.0-2
> > lattice_0.18-3
> > [5] nlme_3.1-96 XML_2.6-0 gsubfn_0.5-0
> > proto_0.3-8
> >
> > loaded via a namespace (and not attached):
> > [1] grid_2.10.1 tools_2.10.1
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>