[R] How do I coerce numeric factor columns of data frame to vector?
Thomas W Blackwell
tblackw at umich.edu
Tue Sep 9 01:22:04 CEST 2003
Michael -
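Because these columns are factors to begin with, using as.numeric()
alone returns the internal integer level codes, not the original
values. It happens to work in your example only because the levels
there are 1 to 4. See the section "Warning:" in help("factor").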
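A minimal sketch of the pitfall and of the conversion that help("factor")
suggests. The factor f and the data frame df are made up for
illustration and are not Murray's data:

```r
# Hypothetical factor whose levels are not simply 1, 2, 3, ...
f <- factor(c("10", "20", "10", "35"))

# as.numeric() alone gives the integer level codes
codes <- as.numeric(f)                 # 1 2 1 3

# Going through the level labels recovers the original numbers
vals  <- as.numeric(as.character(f))   # 10 20 10 35
vals2 <- as.numeric(levels(f))[f]      # same result, as in help("factor")

# Applying the same idea to every factor column of a (made-up) data frame
df <- data.frame(a = factor(c("3.5", "7")), b = c(1, 2))
is.fac <- sapply(df, is.factor)
df[is.fac] <- lapply(df[is.fac], function(col) as.numeric(as.character(col)))
```

Note that a factor column whose labels are not numbers would become NA
(with a warning) under this conversion, so check which columns are
selected first.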
However, it is worth Murray asking himself WHY these columns are
factors to start with, rather than the expected numeric values.
One frequent source of this is using read.table() on a file
which contains column headers without setting header=TRUE. Then,
the character string in the first row of each column prevents
numeric conversion of all of the other rows. Another possible
difficulty is an unusual missing value code (see the na.strings
argument), or commas in place of decimal points (see the dec
argument), or anything else, somewhere in the file, that does not
convert automatically to numeric. Maybe it's worth editing the
original data file before Murray reads it in.
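As a rough illustration of these causes, here is a sketch with invented
file contents (a header row, "n/a" as the missing value code, and decimal
commas). The text= argument and stringsAsFactors=TRUE are used only to
make the example self-contained and to reproduce the factor behaviour in
current versions of R:

```r
# Hypothetical file contents, supplied inline instead of from a file
txt <- "x;y
1,5;10
n/a;20
2,5;30"

# Read without header, na.strings or dec: the header text ends up in
# the data, so neither column converts to numeric
bad <- read.table(text = txt, sep = ";", stringsAsFactors = TRUE)
sapply(bad, class)    # both columns are factors

# Tell read.table() about the header, the missing code and the decimal mark
good <- read.table(text = txt, sep = ";", header = TRUE,
                   na.strings = "n/a", dec = ",")
sapply(good, class)   # x is numeric, y is integer
```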
Hmmm. I think I ought to have offered these many cents worth
with my earlier reply.
- tom blackwell - u michigan medical school - ann arbor -
On Mon, 8 Sep 2003, Michael A. Miller wrote:
> >>>>> "Murray" == Murray Jorgensen <maj at stats.waikato.ac.nz> writes:
>
> > I have just noticed that quite a few columns of a data
> > frame that I am working on are numeric factors. For
> > summary() purposes I want them to be vectors.
>
> Do you want them to be vectors or do you want numeric values? If
> the latter, try as.numeric instead of as.vector:
>
> > as.vector(factor(rep(seq(4),3)))
> [1] "1" "2" "3" "4" "1" "2" "3" "4" "1" "2" "3" "4"
> > as.numeric(factor(rep(seq(4),3)))
> [1] 1 2 3 4 1 2 3 4 1 2 3 4
> > summary(as.vector(factor(rep(seq(4),3))))
>    Length     Class      Mode
>        12 character character
> > summary(as.numeric(factor(rep(seq(4),3))))
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>    1.00    1.75    2.50    2.50    3.25    4.00
>
> Mike