[R] change character to factor in data frame

Petr PIKAL petr.pikal at precheza.cz
Wed Sep 9 13:06:52 CEST 2009


Dear all

I have a simple problem that I thought would be easy to solve, but what I 
tried did not work. I want to change character variables to factors in a 
data frame. Going from factor to character is easy, but I am stuck on how 
to do the reverse conversion.

Here is an example

irisf<-iris
irisf[,2]<-factor(irisf[,2]) # create second factor

str(irisf)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : Factor w/ 23 levels "2","2.2","2.3",..: 15 10 12 11 16 19 14 14 9 11 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
index<-sapply(irisf, is.factor)
irisf[,index]<-sapply(irisf[,index], as.character)

str(irisf)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : chr  "3.5" "3" "3.2" "3.1" ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : chr  "setosa" "setosa" "setosa" "setosa" ...

I hoped that the reverse conversion would be straightforward, but...

irisf[,index]<-sapply(irisf[,index], as.factor)
str(irisf)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : chr  "3.5" "3" "3.2" "3.1" ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : chr  "setosa" "setosa" "setosa" "setosa" ...

I want both chr columns converted to factor, but sapply returns a 
character matrix (as documented, it simplifies its result to a vector or 
matrix), so the columns stay character.
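One possible way around this, offered as a sketch rather than a definitive answer, is to use lapply instead of sapply. lapply does no simplification and returns a list with one element per column, so each element can keep its factor class. The list-style indexing irisf[index] and the helper name m are choices made for this illustration; they are not in the code above.

```r
# Rebuild the example from the post
irisf <- iris
irisf[, 2] <- factor(irisf[, 2])

# Logical vector marking the factor columns (columns 2 and 5)
index <- sapply(irisf, is.factor)

# Factor -> character, this time with lapply()
irisf[index] <- lapply(irisf[index], as.character)

# For comparison: sapply() simplifies its result to a character matrix,
# so the factor class is lost before the assignment happens
m <- sapply(irisf[index], as.factor)
is.character(m)

# lapply() returns a list of factors; assigning that list back
# column by column keeps the class of each element
irisf[index] <- lapply(irisf[index], as.factor)
str(irisf)
```

Note that index is computed once, while the columns are still factors, and then reused for both conversions. Recomputing it with is.factor after the columns have become character would select nothing.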

> R.Version()
$platform
[1] "i386-pc-mingw32"

$arch
[1] "i386"

$os
[1] "mingw32"

$system
[1] "i386, mingw32"

$status
[1] "Under development (unstable)"

$major
[1] "2"

$minor
[1] "10.0"

$year
[1] "2009"

$month
[1] "07"

$day
[1] "15"

$`svn rev`
[1] "48932"

$language
[1] "R"

$version.string
[1] "R version 2.10.0 Under development (unstable) (2009-07-15 r48932)"

Regards
Petr



