[R] How do I coerce numeric factor columns of data frame to vector?

Thomas W Blackwell tblackw at umich.edu
Tue Sep 9 01:22:04 CEST 2003


Michael  -

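Because these columns are factors to begin with, using  as.numeric()
alone will have unexpected results:  it returns the internal integer
codes of the factor, not the numbers that the level labels spell out.
See the section "Warning:" in help("factor").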
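
To make the pitfall concrete, here is a minimal sketch.  The factor  f
and the data frame  d  are made-up examples, not Murray's data.  The
conversion through as.character() (or the equivalent indexing of the
levels) is the approach suggested in help("factor").

```r
# A factor whose level labels look numeric but do not match the codes.
# (Illustrative data only.)
f <- factor(c("10", "20", "5", "20"))

codes  <- as.numeric(f)                # internal integer codes
values <- as.numeric(as.character(f))  # numbers spelled out by the labels
same   <- as.numeric(levels(f))[f]     # equivalent; converts each level once

print(codes)
print(values)

# Apply the same conversion to every factor column of a data frame,
# leaving non-factor columns untouched.  'd' is a hypothetical data frame.
d <- data.frame(a = factor(c("1.5", "3", "10")), b = c(7, 8, 9))
d[] <- lapply(d, function(col) {
  if (is.factor(col)) as.numeric(as.character(col)) else col
})
str(d)
```

The two results differ here because the levels are sorted as character
strings, so "5" comes after "10" and "20".  Mike's example below happens
to work only because the labels 1 to 4 coincide with the codes.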

However, it is worth Murray asking himself WHY these columns are
factors to start with, rather than the expected numeric values.
One frequent source of this is using  read.table()  on a file
which contains column headers without setting  header=TRUE.  Then,
the character string in the first row of each column prevents
numeric conversion of all of the other rows.  Another possible
difficulty is an unusual missing value code, or commas in place
of decimal points, or anything else, somewhere in the file that
does not convert automatically to numeric.  Maybe it's worth
editing the original data file before Murray reads it in.
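
As a rough illustration of the causes listed above, the sketch below
reads the same small table twice.  The file contents in  txt  are
invented, and the use of "." as the missing value code is only an
assumption for the example.  The  text=  argument needs a reasonably
recent R, and  stringsAsFactors=TRUE  is set explicitly to reproduce
the factor columns, since R 4.0.0 and later no longer create them by
default.

```r
# Invented file contents: a header row, "." as the missing value code,
# and commas in place of decimal points.
txt <- "x y
1,5 2
. 3
2,25 4"

# Read without header = TRUE: the header strings become data, so every
# column fails numeric conversion and ends up as a factor.
bad <- read.table(text = txt, stringsAsFactors = TRUE)

# Tell read.table() about the header, the missing value code and the
# decimal mark, and the columns convert to numeric on input.
good <- read.table(text = txt, header = TRUE, na.strings = ".", dec = ",")

str(bad)
str(good)
```

Setting these arguments is an alternative to editing the data file by
hand when the odd codes are used consistently throughout the file.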

Hmmm.  I think I ought to have offered these many cents worth
with my earlier reply.

-  tom blackwell  -  u michigan medical school  -  ann arbor  -

On Mon, 8 Sep 2003, Michael A. Miller wrote:

> >>>>> "Murray" == Murray Jorgensen <maj at stats.waikato.ac.nz> writes:
>
>     > I have just noticed that quite a few columns of a data
>     > frame that I am working on are numeric factors. For
>     > summary() purposes I want them to be vectors.
>
> Do you want them to be vectors or do you want numeric values?  If
> the latter, try as.numeric instead of as.vector:
>
> > as.vector(factor(rep(seq(4),3)))
>  [1] "1" "2" "3" "4" "1" "2" "3" "4" "1" "2" "3" "4"
> > as.numeric(factor(rep(seq(4),3)))
>  [1] 1 2 3 4 1 2 3 4 1 2 3 4
> > summary(as.vector(factor(rep(seq(4),3))))
>    Length     Class      Mode
>        12 character character
> > summary(as.numeric(factor(rep(seq(4),3))))
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>    1.00    1.75    2.50    2.50    3.25    4.00
>
> Mike



