[R] Factor to numeric conversion - as.numeric(levels(f))[f] - Language definition seems to say to not use this.
Peter Ehlers
ehlers at ucalgary.ca
Mon Apr 1 23:29:55 CEST 2013
On 2013-04-01 13:08, Matthew Lundberg wrote:
> Note the edited subject line! I don't know why I typed it as it was before.
>
> This says that as.numeric(as.character(f)) will work regardless of the
> implementation, and I agree.
>
> It's the recommendation to use as.numeric(levels(f))[f] that has me
> wondering about section 2.3.1 of the language definition. I expect that
> this idiom is in widespread use, and perhaps the language definition
> should be changed.
I think that I may be getting an inkling of what your complaint is:
section 2.3.1 talks about
"an integer array to specify the _actual_ levels" [emphasis added]
and
"a second array of _names_ that are mapped to the integers". [ditto]
When you object to the use of "as.numeric(levels(f))[f]", are you
assuming that "levels(f)" is the set of _integers_ or the set of
_names_?
Anyway, it's indeed the set of names, as returned by the levels()
function.
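
To make the names/integers distinction concrete, here is a minimal sketch (the factor and the variable names via_levels and via_character are made up for illustration; this is not taken from the help page or the language definition):

```r
# A small factor whose labels are numbers stored as text (illustrative data).
f <- factor(c("10", "20", "10", "30"))

levels(f)       # the names: a character vector, one entry per distinct label
as.integer(f)   # the integer codes: positions into levels(f)

# Both idioms discussed in this thread recover the numbers held in the labels.
via_levels    <- as.numeric(levels(f))[f]      # convert the names, index by f
via_character <- as.numeric(as.character(f))   # go through the labels directly
identical(via_levels, via_character)

# By contrast, as.numeric(f) alone returns the codes, not the labels.
as.numeric(f)
```

The first idiom converts only the distinct names and then subsets by f, which is presumably why the help page calls it slightly more efficient; the second never indexes by the factor in user code.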
Peter Ehlers
>
>
> On Mon, Apr 1, 2013 at 2:58 PM, Bert Gunter <gunter.berton at gene.com> wrote:
>
> Yup. Note also:
>
> > as.character.factor
> function (x, ...)
> levels(x)[x]
>
> But of course this is OK, since this can change if the implementation
> does. Which is the whole point, of course.
>
> -- Bert
>
>
>
> On Mon, Apr 1, 2013 at 12:16 PM, Matthew Lundberg
> <matthew.k.lundberg at gmail.com> wrote:
> >
> > When used as an index, the factor is implicitly converted to integer. In
> > the expression as.numeric(levels(f))[f], the vector as.numeric(levels(f))
> > is indexed by as.integer(f).
> >
> > This appears to rely on the current implementation, as mentioned in section
> > 2.3.1 of the language definition.
> >
> >
> > On Mon, Apr 1, 2013 at 1:49 PM, Peter Ehlers <ehlers at ucalgary.ca> wrote:
> >
> > > On 2013-04-01 10:48, Matthew Lundberg wrote:
> > >
> > >> These two seem to be at odds. Is this the case?
> > >>
> > >> From help(factor) - section Warning:
> > >>>
> > >>
> > >> To transform a factor f to approximately its original numeric values,
> > >> as.numeric(levels(f))[f] is recommended and slightly more efficient than
> > >> as.numeric(as.character(f)).
> > >>
> > >> From the language definition - section 2.3.1:
> > >>>
> > >>
> > >> Factors are currently implemented using an integer array to specify the
> > >> actual levels and a second array of names that are mapped to the integers.
> > >> Rather unfortunately users often make use of the implementation in order to
> > >> make some calculations easier. This, however, is an implementation issue and
> > >> is not guaranteed to hold in all implementations of R.
> > >>
> > >
> > > Hint:
> > >
> > > f <- factor(sample(5, 10, TRUE))
> > > as.numeric(levels(f))[f]
> > >
> > > g <- factor(sample(letters[1:5], 10, TRUE))
> > > as.numeric(levels(g))[g]
> > >
> > > Peter Ehlers
> > >
> > >
> > >
> > >> [[alternative HTML version deleted]]
> > >>
> > >> ______________________________________________
> > >> R-help at r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > >>
> > >
> >
>
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>
>