[R] how to apply the function cut( ) to many columns in a data.frame?

Chuck Cleland ccleland at optonline.net
Thu Mar 1 10:40:29 CET 2007


ahimsa campos-arceiz wrote:
> Dear useRs,
> 
> In a data.frame (df) I have several columns (x1, x2, x3....xn) containing
> data as a continuous numerical response:
> 
> df
>    var   x1   x2   x3
>      1  143  147  137
>      2   93   93  117
>      3  164   39  101
>      4  123  118   97
>      5   63  125   97
>      6  129   83  124
>      7  123   93  136
>      8  123   80   79
>      9   89  107  150
>     10   78   95  121
> 
> I want to classify the values in the columns x1, x2, etc., into bins of fixed
> width (0-5, 5-10, ...). For one vector I can do it easily with the
> function cut:
> 
>> df$x1 <- cut(df$x1, br=5*(0:40), labels=5*(1:40))
>> df$x1
>  [1] 145 95  165 125 65  130 125 125 90  80
> 40 Levels: 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 ...
> 200
> 
> However if I try to use a subset of my data.frame:
> 
> df[,3:4] <- cut(df[,3:4], br=5*(0:40), labels=5*(1:40))
> 
> Error in cut.default(df[, 3:4], br = 5 * (0:40), labels = 5 * (1:40)) :
>         'x' must be numeric
> 
> 
> How can I make this work with data frames in which I want to apply the
> function cut( ) to many columns in a data.frame?

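  You have an answer within your question: use one of the various
"apply" functions.  cut() expects a numeric vector, but df[,3:4] is a
data frame (a list of columns), hence the error.  lapply() passes the
columns to cut() one at a time, and assigning the result back replaces
them in df:

df[,3:4] <- lapply(df[,3:4], function(x){cut(x, breaks=5*(0:40), labels=5*(1:40))})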

?lapply
?sapply
?apply
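
For a version that can be pasted into a fresh session, here is a minimal
sketch built on the example data from the question.  The set.seed() call
and the helper name bin5 are my own illustrative additions, not part of
the original post:

```r
# Example data as in the question; set.seed() is added only for repeatability
set.seed(1)
df <- data.frame(var = 1:10,
                 x1  = rnorm(10, mean = 100, sd = 25),
                 x2  = rnorm(10, mean = 100, sd = 25),
                 x3  = rnorm(10, mean = 100, sd = 25))

# Hypothetical helper: width-5 bins from 0 to 200, labelled by the upper edge
bin5 <- function(x) cut(x, breaks = 5 * (0:40), labels = 5 * (1:40))

# lapply() hands bin5() one numeric column at a time; assigning the
# resulting list back overwrites columns 3 and 4 of df
df[, 3:4] <- lapply(df[, 3:4], bin5)

str(df)  # x2 and x3 are now factors, x1 is still numeric
```

Values outside the range 0-200 would come back as NA, so the breaks may
need widening for other data.  If the bin labels are needed as numbers
again, as.numeric(as.character(df$x2)) converts them.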

> I guess that I might have to use something like for ( ) (which I'm not
> familiar with), but maybe you know a straight forward method to use with
> data.frames.
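
  A for() loop works too, if you prefer it.  A rough sketch, assuming the
columns to bin are named x1, x2 and x3 as in your example (the vector
name cols and the set.seed() call are only illustrative):

```r
# Example data as in the question; set.seed() is added only for repeatability
set.seed(2)
df <- data.frame(var = 1:10,
                 x1  = rnorm(10, mean = 100, sd = 25),
                 x2  = rnorm(10, mean = 100, sd = 25),
                 x3  = rnorm(10, mean = 100, sd = 25))

# Names of the columns to bin (illustrative)
cols <- c("x1", "x2", "x3")

# Replace each listed column by its binned version, one column per pass
for (nm in cols) {
  df[[nm]] <- cut(df[[nm]], breaks = 5 * (0:40), labels = 5 * (1:40))
}

sapply(df[cols], class)  # class of each binned column
```

The loop and the lapply() call do the same job; lapply() is simply the
more compact way to write it.
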
> 
> 
> Thanks a lot!
> 
> Ahimsa
> 
> *********************************************
> 
> # data
> var <- 1:10
> x1 <- rnorm(10, mean=100, sd=25)
> x2 <- rnorm(10, mean=100, sd=25)
> x3 <- rnorm(10, mean=100, sd=25)
> df <- data.frame(var,x1,x2,x3)
> df
> 
> # classifying the values of the vector df$x1 into bins of width 5
> df$x1 <- cut(df$x1, br=5*(0:40), labels=5*(1:40))
> df$x1
> 
> # trying it on a subset of the data.frame
> df[,3:4] <- cut(df[,3:4], br=5*(0:40), labels=5*(1:40))
> df[,3:4] 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


