[R] Convert factor to numeric vector of labels
mckellercran at gmail.com
Tue Aug 14 21:47:45 CEST 2007
If we, the R community, are endeavoring to make R user friendly
(gasp!), I think that one of the first places to start would be in
setting stringsAsFactors = FALSE. Several times I've run into
instances of folks decrying R's "ridiculous usage of memory" in
reading data, only to find out that these folks were unknowingly
importing certain columns as factors. The fix is easy once you know
it, but it isn't obvious to new users, and I'd bet that it turns some
percentage of people off of the program. In my opinion, factors are
not used often enough to justify this default behavior. When factors
are used, the user knows to treat the variable as a factor, and so the
conversion can be done on a case-by-case (or should I say
variable-by-variable?) basis.
Is this a default that should be changed?
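
For anyone who has not seen the behavior, here is a minimal sketch of
the difference. The inline CSV and its column names are made up for
illustration, and stringsAsFactors is passed explicitly so the result
does not depend on the session default.

```r
# Hypothetical inline CSV standing in for an imported data file.
csv_text <- "id,group,score
1,a,0.71
2,b,1.34
3,a,2.61"

# With stringsAsFactors = TRUE, the character column becomes a factor.
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# With stringsAsFactors = FALSE, it stays a character vector.
as_char <- read.csv(text = csv_text, stringsAsFactors = FALSE)

str(as_factor)
str(as_char)

# Variable-by-variable: convert only the column meant to be categorical.
converted <- as_char
converted$group <- factor(converted$group)
```

Comparing the two str() listings shows which columns changed type.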
On 8/13/07, John Kane <jrkrideau at yahoo.ca> wrote:
> This is one of R's rather _endearing_ little
> idiosyncrasies. I ran into it a while ago.
> For some reason, possibly historical, the option
> "stringsAsFactors" is set to TRUE by default.
> As Prof Ripley says, FAQ 7.10 will tell you:
> as.numeric(as.character(f)) # for a one-off conversion
> From Gabor Grothendieck: a one-off solution for a
> complete data.frame:
> DF <- data.frame(let = letters[1:3], num = 1:3,
> stringsAsFactors = FALSE)
> str(DF) # to see what has happened.
> You can reset the option globally, see below. However
> you might want to read Gabor Grothendieck's comment
> about this in the thread referenced above since it
> could cause problems if you transfer files a lot.
> Personally I went with the global option since I don't
> tend to transfer programs to other people and I was
> getting tired of tracking down errors in my programs
> caused by numeric and character variables suddenly
> deciding to become factors.
> From Steven Tucker:
> You can also set this option globally with
> options(stringsAsFactors = FALSE) # in
> --- Falk Lieder <falk.lieder at googlemail.com> wrote:
> > Hi,
> > I have imported a data file to R. Unfortunately R
> > has interpreted some
> > numeric variables as factors. Therefore I want to
> > reconvert these to numeric
> > vectors whose values are the factor levels' labels.
> > I tried
> > as.numeric(<factor>),
> > but it returns a vector of factor levels (i.e.
> > 1,2,3,...) instead of labels
> > (i.e. 0.71, 1.34, 2.61,…).
> > What can I do instead?
> > Best wishes, Falk
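
A small sketch of the problem described in the original question and
the one-off conversion quoted above. The factor f is a made-up example;
the third form, as.numeric(levels(f))[f], is the variant FAQ 7.10
describes as slightly more efficient.

```r
# Hypothetical factor holding numbers as labels, as after a bad import.
f <- factor(c("0.71", "1.34", "2.61", "1.34"))

# as.numeric() on a factor returns the internal integer codes.
codes <- as.numeric(f)

# Going through character first recovers the labels as numbers.
values <- as.numeric(as.character(f))

# Same result, but each distinct level is converted only once.
values2 <- as.numeric(levels(f))[f]

print(codes)
print(values)
```

The codes index into levels(f), which is why the third form works.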
Matthew C Keller
Virginia Institute for Psychiatric and Behavioral Genetics