[R] Create a categorical variable using the deciles of data

Rui Barradas
Tue Jun 14 15:00:14 CEST 2022


Hello,

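I have recreated the data.frame, giving the column a name.
Here are two ways, both based on ?pretty. Note that pretty() returns
equally spaced breaks over the range of X; they split the data into ten
equal groups here only because X is the evenly spaced sequence 1:50000.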


data_catigocal <- data.frame(X = 1:50000)
pretty(data_catigocal$X, n = 10)
#>  [1]     0  5000 10000 15000 20000 25000 30000 35000 40000 45000 50000
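
For data that are not evenly spaced, equally spaced breaks will generally not give equal-sized groups. In that case the breaks could instead come from ?quantile. The following is only a minimal sketch under that assumption: the vector x is hypothetical (simulated, not from the original post), and quantile() is swapped in for pretty().

```r
# Hypothetical skewed data, simulated for illustration only
set.seed(1)
x <- rexp(1000)

# Decile break points: the 0%, 10%, ..., 100% sample quantiles
breaks <- quantile(x, probs = seq(0, 1, by = 0.1))

# Right-closed intervals; include.lowest = TRUE keeps the minimum in group 1.
# labels = FALSE returns integer group codes 1..10 instead of a factor.
decile <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Count how many observations fall in each group
table(decile)
```

With tied values the quantiles can coincide, and cut() then stops because the breaks are not unique, so this sketch assumes data without heavy ties.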


1. Use ?cut to create a factor with 10 levels then assign the labels.


group_vector <- c('0-10', '11-20', '21-30', '31-40', '41-50',
                  '51-60', '61-70', '71-80', '81-90', '91-100')

data_catigocal$decile <- with(data_catigocal,
  cut(X, breaks = pretty(X, n = 10), include.lowest = TRUE))
data_catigocal$decile <- factor(data_catigocal$decile, labels = group_vector)

head(data_catigocal)
#>   X decile
#> 1 1   0-10
#> 2 2   0-10
#> 3 3   0-10
#> 4 4   0-10
#> 5 5   0-10
#> 6 6   0-10
tail(data_catigocal)
#>           X decile
#> 49995 49995 91-100
#> 49996 49996 91-100
#> 49997 49997 91-100
#> 49998 49998 91-100
#> 49999 49999 91-100
#> 50000 50000 91-100
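
A quick way to confirm that the ten groups are equal in size is ?table. The sketch below rebuilds the data so that it runs on its own. It passes the labels directly to cut() through its labels argument, which should be equivalent to the two-step version above.

```r
# Rebuild the example data and the labels
data_catigocal <- data.frame(X = 1:50000)
group_vector <- c('0-10', '11-20', '21-30', '31-40', '41-50',
                  '51-60', '61-70', '71-80', '81-90', '91-100')

# Bin X and label the levels in a single call to cut()
data_catigocal$decile <- cut(data_catigocal$X,
                             breaks = pretty(data_catigocal$X, n = 10),
                             labels = group_vector,
                             include.lowest = TRUE)

# Each group should hold 50000 / 10 = 5000 rows
group_sizes <- table(data_catigocal$decile)
group_sizes
```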


2. Use ?findInterval to bin the data and coerce to factor with the 
appropriate levels.



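data_catigocal$decile <- findInterval(data_catigocal$X,
  pretty(data_catigocal$X, n = 10),
  left.open = TRUE, rightmost.closed = TRUE)
data_catigocal$decile <- factor(data_catigocal$decile, labels = group_vector)


By default findInterval uses intervals closed on the left, so values equal
to a break (5000, 10000, ...) would land one group higher than with cut.
With left.open = TRUE the intervals are closed on the right, as in cut,
and the results are the same.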
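
One detail worth checking is what happens at the break points themselves: cut() uses right-closed intervals by default, while findInterval() uses left-closed ones unless left.open = TRUE is given. A small sketch on a few boundary values, with the breaks written out by hand (the vector x is chosen only for illustration):

```r
# Same break values as pretty(1:50000, n = 10)
breaks <- seq(0, 50000, by = 5000)

# A few values at and around the break points
x <- c(1, 5000, 5001, 45000, 50000)

# cut(): intervals (a, b], with the lowest break included
cut_bins <- as.integer(cut(x, breaks = breaks, include.lowest = TRUE))
cut_bins
#> [1]  1  1  2  9 10

# findInterval() default: intervals [a, b), last interval closed
fi_default <- findInterval(x, breaks, rightmost.closed = TRUE)
fi_default
#> [1]  1  2  2 10 10

# findInterval() with left.open = TRUE: intervals (a, b], matching cut()
fi_left_open <- findInterval(x, breaks, left.open = TRUE,
                             rightmost.closed = TRUE)
fi_left_open
#> [1]  1  1  2  9 10
```

Only the values that sit exactly on a break (5000 and 45000 here) change group, which is why the difference is easy to miss with head() and tail().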

Hope this helps,

Rui Barradas



At 12:28 on 14/06/2022, anteneh asmare wrote:
> I want to create a categorical variable using the deciles of the
> following data frame, to divide the individuals into 10 equal groups.
> I tried the following code:
> data_catigocal<-data.frame(c(1:50000))
> # create categorical vector using deciles
> group_vector <-
> c('0-10','11-20','21-30','31-40','41-50','51-60','61-70','71-80','81-90','91-100')
> # Add categorical variable to the data_catigocal
> data_catigocal$decile <- factor(group_vector)
> # print data frame
> data_catigocal
> 
> Can anyone help me with the R code?
> Kind regards,
> Hana
> 