[R] Create a categorical variable using the deciles of data
Rui Barradas
Tue Jun 14 15:00:14 CEST 2022
Hello,
I have recreated the data.frame giving the column a name.
Here are two ways, both based on ?pretty:
data_catigocal <- data.frame(X = 1:50000)
pretty(data_catigocal$X, n = 10)
#> [1] 0 5000 10000 15000 20000 25000 30000 35000 40000 45000 50000
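One caveat, in case the real data are not as evenly spaced as 1:50000: pretty() returns equal-width breaks, and n is only a target for the number of intervals. The breaks coincide with the deciles here only because X is uniform. Below is a minimal sketch of decile breaks via ?quantile, using made-up skewed data; the names x_skewed, decile_breaks and decile_group are hypothetical and not part of the original question.

```r
set.seed(1)                       # reproducible made-up data
x_skewed <- rexp(1000)            # hypothetical skewed values, not the poster's data

# 11 break points at the 0%, 10%, ..., 100% quantiles
decile_breaks <- quantile(x_skewed, probs = seq(0, 1, by = 0.1))

# cut into 10 groups; include.lowest = TRUE keeps the minimum in group 1,
# labels = FALSE returns the group number instead of a factor
decile_group <- cut(x_skewed, breaks = decile_breaks,
                    include.lowest = TRUE, labels = FALSE)

# group sizes; these should be (close to) equal
table(decile_group)
```

With many tied values the quantiles can repeat, in which case cut() will complain about non-unique breaks, so this sketch assumes mostly distinct values.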
1. Use ?cut to create a factor with 10 levels then assign the labels.
group_vector <- c('0-10','11-20','21-30','31-40','41-50',
'51-60','61-70','71-80','81-90','91-100')
data_catigocal$decile <- with(data_catigocal,
cut(X, breaks = pretty(X, n = 10), include.lowest = TRUE))
data_catigocal$decile <- factor(data_catigocal$decile, labels = group_vector)
head(data_catigocal)
#> X decile
#> 1 1 0-10
#> 2 2 0-10
#> 3 3 0-10
#> 4 4 0-10
#> 5 5 0-10
#> 6 6 0-10
tail(data_catigocal)
#> X decile
#> 49995 49995 91-100
#> 49996 49996 91-100
#> 49997 49997 91-100
#> 49998 49998 91-100
#> 49999 49999 91-100
#> 50000 50000 91-100
2. Use ?findInterval to bin the data and coerce to factor with the
appropriate levels.
data_catigocal$decile <- findInterval(data_catigocal$X,
pretty(data_catigocal$X, n = 10), rightmost.closed = TRUE)
data_catigocal$decile <- factor(data_catigocal$decile, labels = group_vector)
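The results are the same, except for values that fall exactly on an interior break (5000, 10000, ..., 45000): cut() closes its intervals on the right by default, findInterval() on the left. Passing left.open = TRUE to findInterval() makes the two agree.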
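A quick way to check this, as a sketch: compare the bin indices from the two approaches, paying attention to values that fall exactly on a break. The names brks, bin_cut, bin_fi and bin_fi_open are introduced here for illustration only; left.open is an argument of findInterval() in reasonably recent versions of R.

```r
X <- 1:50000
brks <- pretty(X, n = 10)

# approach 1: cut() uses intervals (a, b]; include.lowest closes the first one
bin_cut <- as.integer(cut(X, breaks = brks, include.lowest = TRUE))

# approach 2: findInterval() uses intervals [a, b); rightmost.closed closes the last one
bin_fi <- findInterval(X, brks, rightmost.closed = TRUE)

# values that the two approaches put in different bins
# (only values lying exactly on an interior break can differ)
X[bin_cut != bin_fi]

# left.open = TRUE switches findInterval() to (a, b] intervals, as in cut()
bin_fi_open <- findInterval(X, brks, rightmost.closed = TRUE, left.open = TRUE)
identical(bin_cut, bin_fi_open)
```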
Hope this helps,
Rui Barradas
At 12:28 on 14/06/2022, anteneh asmare wrote:
> I want to create a categorical variable using the deciles of the
> following data frame, to divide the individuals into 10 equal groups.
> I tried the following code:
> data_catigocal<-data.frame(c(1:50000))
> # create categorical vector using deciles
> group_vector <-
> c('0-10','11-20','21-30','31-40','41-50','51-60','61-70','71-80','81-90','91-100')
> # Add categorical variable to the data_catigocal
> data_catigocal$decile <- factor(group_vector)
> # print data frame
> data_catigocal
>
> Can anyone help me with the R code?
> Kind regards,
> Hana