[R] how to change data type in data frame?

Marc Schwartz MSchwartz at MedAnalytics.com
Mon Oct 4 23:09:29 CEST 2004


On Mon, 2004-10-04 at 14:27, Auston_Wei at mdanderson.org wrote:
> Hi, list,
> 
> suppose i have such a data frame:
> 
> trash <- 
> data.frame(cbind(seq(1:5),c('a','a','b','a','b'),c('b','a','b','b','a')))
> names(trash) <- c('age','typeI','typeII')
> 
> and I want to change all 'a's to be 0 and 'b's to be 1. 
> 
> temp <- as.matrix(trash)
> temp[temp=='a'] <- 0
> temp[temp=='b'] <- 1
> temp <- data.frame(temp)
> 
> the problem was that temp$typeI and temp$typeII were still factors, 
> whereas I want numeric type. How can I make it?
> 
> Thanks,
> Auston

First, you need to be careful about how you create the data frame.

'trash', as you have created it, is a data frame of all factors:

> str(trash)
`data.frame':	5 obs. of  3 variables:
 $ age   : Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5
 $ typeI : Factor w/ 2 levels "a","b": 1 1 2 1 2
 $ typeII: Factor w/ 2 levels "a","b": 2 1 2 2 1


This is because you used cbind(), which first builds a matrix. A matrix
can hold only one data type, so everything is coerced to character:

> cbind(seq(1:5), c('a','a','b','a','b'), c('b','a','b','b','a'))
     [,1] [,2] [,3]
[1,] "1"  "a"  "b" 
[2,] "2"  "a"  "a" 
[3,] "3"  "b"  "b" 
[4,] "4"  "a"  "b" 
[5,] "5"  "b"  "a" 

and then this matrix is converted into a data frame. In the process of
converting the character matrix into a data frame, the characters are
converted into factors.
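
The two steps described above can be sketched as follows. This is only
an illustration: the names 'm' and 'df' are made up here, and
stringsAsFactors = TRUE is passed explicitly on the assumption that you
may be running R 4.0.0 or later, where character columns are no longer
converted to factors by default.

```r
# Step 1: cbind() builds a matrix, and a matrix holds a single type,
# so the integers are coerced to character alongside the letters.
m <- cbind(1:5, c('a','a','b','a','b'), c('b','a','b','b','a'))
typeof(m)

# Step 2: converting the character matrix to a data frame turns each
# column into a factor (the conversion is requested explicitly here).
df <- data.frame(m, stringsAsFactors = TRUE)
sapply(df, class)
```

Even the numeric-looking first column ends up as a factor, because it
was already character by the time data.frame() saw it.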

Thus, if you want to keep the mixed data types, which is what a data
frame is for, pass the columns to data.frame() directly. Note that you
can name the columns in the same step:

trash <- data.frame(age = 1:5,
                    typeI = I(c('a','a','b','a','b')),
                    typeII = I(c('b','a','b','b','a')))

In the above, note the use of "I(...)", which preserves the character
nature of typeI and typeII:

> str(trash)
`data.frame':	5 obs. of  3 variables:
 $ age   : int  1 2 3 4 5
 $ typeI :Class 'AsIs'  chr [1:5] "a" "a" "b" "a" ...
 $ typeII:Class 'AsIs'  chr [1:5] "b" "a" "b" "b" ...
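
As a possible alternative to I(), current versions of R accept a
stringsAsFactors argument in data.frame(). The sketch below assumes
such a version; 'trash2' is just an illustrative name so that 'trash'
is left untouched.

```r
# Keep the letter columns as plain character vectors without I().
trash2 <- data.frame(age = 1:5,
                     typeI = c('a','a','b','a','b'),
                     typeII = c('b','a','b','b','a'),
                     stringsAsFactors = FALSE)

# age stays integer; typeI and typeII are character, with no
# 'AsIs' class attached.
sapply(trash2, class)
```

The practical difference is that I() marks each column with the 'AsIs'
class, while stringsAsFactors = FALSE leaves ordinary character columns.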

Once you have the data frame in this format, you can do your
replacements. You could replace one value at a time, as you did above,
or do everything in one step:

trash[, 2:3] <- ifelse(trash[, 2:3] == 'a', 0, 1)

> trash
  age typeI typeII
1   1     0      1
2   2     0      0
3   3     1      1
4   4     0      1
5   5     1      0

> str(trash)
`data.frame':	5 obs. of  3 variables:
 $ age   : int  1 2 3 4 5
 $ typeI : num  0 0 1 0 1
 $ typeII: num  1 0 1 1 0
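
If the columns are already factors, as in your original 'trash', a
column-wise version of the same idea might look like the sketch below.
The name 'trash3' and the anonymous helper function are illustrative
only, and the comparison with 'b' assumes that 'a' and 'b' are the only
values present.

```r
# Hypothetical starting point: the letter columns are factors.
trash3 <- data.frame(age = 1:5,
                     typeI = factor(c('a','a','b','a','b')),
                     typeII = factor(c('b','a','b','b','a')))

# Compare each column with 'b' and turn TRUE/FALSE into 1/0.
trash3[2:3] <- lapply(trash3[2:3], function(x) as.numeric(x == 'b'))
str(trash3)

# For a factor whose levels are themselves numbers, go through
# as.character() first; as.numeric() alone returns the level codes.
f <- factor(c('0','0','1','0','1'))
as.numeric(as.character(f))
as.numeric(f)
```

The last two lines are relevant to your 'temp' data frame: its factor
levels were "0" and "1", and converting via as.character() recovers
those values rather than the internal codes 1 and 2.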

Note, however, that depending on what you intend to do with the data in
typeI and typeII, you may want to keep them as factors, since many
functions (e.g. modeling functions) use the factor data type
specifically.
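
As a small illustration of that point, model.matrix() does the 0/1
coding for you when given a factor. This sketch assumes the default
treatment contrasts; the standalone 'typeI' vector and the name 'mm'
are used here only for the example.

```r
# A factor with levels 'a' and 'b', as in the typeI column.
typeI <- factor(c('a','a','b','a','b'))

# The design matrix has an intercept plus an indicator column
# that is 1 where typeI is 'b' and 0 where it is 'a'.
mm <- model.matrix(~ typeI)
mm
```

So for modeling purposes the manual recoding is often unnecessary.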

See ?data.frame and ?I for more information.

HTH,

Marc Schwartz



