[R] how to apply the function cut( ) to many columns in a data.frame?
Liaw, Andy
andy_liaw at merck.com
Thu Mar 1 16:30:29 CET 2007
From: Chuck Cleland
>
> ahimsa campos-arceiz wrote:
> > Dear useRs,
> >
> > In a data.frame (df) I have several columns (x1, x2, x3....xn)
> > containing data as a continuous numerical response:
> >
> > df
> > var x1 x2 x3
> > 1 143 147 137
> > 2 93 93 117
> > 3 164 39 101
> > 4 123 118 97
> > 5 63 125 97
> > 6 129 83 124
> > 7 123 93 136
> > 8 123 80 79
> > 9 89 107 150
> > 10 78 95 121
> >
> > I want to classify the values in the columns x1, x2, etc., into bins
> > of fixed margins (0-5, 5-10, ....). For one vector I can do it easily
> > with the function cut:
> >
> >> df$x1 <- cut(df$x1, br=5*(0:40), labels=5*(1:40))
> >> df$x1
> > [1] 145 95 165 125 65 130 125 125 90 80
> > 40 Levels: 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 ... 200
> >
> > However if I try to use a subset of my data.frame:
> >
> > df[,3:4] <- cut(df[,3:4], br=5*(0:40), labels=5*(1:40))
> >
> > Error in cut.default(df[, 3:4], br = 5 * (0:40), labels = 5 * (1:40)) :
> >   'x' must be numeric
> >
> >
> > How can I make this work, i.e. apply the function cut( ) to many
> > columns in a data.frame at once?
>
> You have an answer within your question - use one of the
> various "apply" functions. For example:
>
> lapply(df[,3:4], function(x){cut(x, br=5*(0:40), labels=5*(1:40))})
Or perhaps a bit more simply:
lapply(df[, 3:4], cut, br=5*(0:40), labels=5*(1:40))
and if a data frame is desired as output, wrap the above in
as.data.frame().
(Just keep in mind that a data frame is a list of columns, so lapply()
works on it one column at a time.)
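For what it's worth, here is a minimal sketch of the whole round trip, using toy data shaped like the original post. The seed and the names breaks, labs, binned and binned_df are only illustrative and not from the thread, and breaks= is spelled out where the thread used the abbreviated br=.

```r
# Toy data shaped like the original post (the seed is an assumption,
# added only so the example is reproducible).
set.seed(1)
df <- data.frame(var = 1:10,
                 x1 = rnorm(10, mean = 100, sd = 25),
                 x2 = rnorm(10, mean = 100, sd = 25),
                 x3 = rnorm(10, mean = 100, sd = 25))

breaks <- 5 * (0:40)   # bin edges 0, 5, ..., 200
labs   <- 5 * (1:40)   # label each bin by its upper edge

# lapply() visits each selected column in turn and returns a list of factors.
binned <- lapply(df[, 3:4], cut, breaks = breaks, labels = labs)

# Option 1: a new data frame holding only the binned columns.
binned_df <- as.data.frame(binned)

# Option 2: overwrite the original columns; the list is assigned
# back column by column.
df[, 3:4] <- binned

str(df)
```

Values outside the range covered by the breaks (here 0 to 200) come back as NA, so the breaks may need widening for other data.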
Andy
> ?lapply
> ?sapply
> ?apply
>
> > I guess that I might have to use something like for ( ) (which I'm
> > not familiar with), but maybe you know a straightforward method to
> > use with data.frames.
> >
> >
> > Thanks a lot!
> >
> > Ahimsa
> >
> > *********************************************
> >
> > # data
> > var <- 1:10
> > x1 <- rnorm(10, mean=100, sd=25)
> > x2 <- rnorm(10, mean=100, sd=25)
> > x3 <- rnorm(10, mean=100, sd=25)
> > df <- data.frame(var,x1,x2,x3)
> > df
> >
> > # classifying the values of the vector df$x1 into bins of width 5
> > df$x1 <- cut(df$x1, br=5*(0:40), labels=5*(1:40))
> > df$x1
> >
> > # trying it on a subset of the data.frame
> > df[,3:4] <- cut(df[,3:4], br=5*(0:40), labels=5*(1:40))
> > df[,3:4]
>
> --
> Chuck Cleland, Ph.D.
> NDRI, Inc.
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
>