[R] How to get Quartiles when data contains both numeric variables and factors

R. Michael Weylandt michael.weylandt at gmail.com
Mon Oct 31 16:53:31 CET 2011


Let me caution against this approach: factors are generally stored
in R as integer codes, and those codes are not particularly meaningful
for most applications. as.numeric() will silently use those internal
codes, the quantiles of which are probably not helpful for what you
are looking for. Furthermore, even if the factor labels look numeric,
this might not give what you think it will.

E.g.

x <- factor(6:10, labels = 6:10)
quantile(as.numeric(x), 0.8)
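
To make the point of the example above concrete, here is a small sketch comparing the internal codes with the labels. The names `codes` and `values` are mine and are only for illustration.

```r
x <- factor(6:10, labels = 6:10)

# as.numeric() on a factor returns the internal integer codes
codes <- as.numeric(x)                  # 1 2 3 4 5

# going through as.character() first recovers the labels as numbers
values <- as.numeric(as.character(x))   # 6 7 8 9 10

q_codes  <- quantile(codes, 0.8)    # 4.2, computed on the codes
q_values <- quantile(values, 0.8)   # 9.2, computed on the labels
```

So the first call answers a question about 1:5, not about 6:10.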

Something like this *may* work but please take a moment to make
certain it's doing what you want. (I still would prefer my original
solution of something that treats factors entirely differently):

sapply(x, function(col) quantile(as.numeric(as.character(col)), c(0.01, 0.99)))
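
For comparison, here is one possible way to treat factors entirely differently, by computing quantiles only for the numeric columns. This is a minimal sketch and not necessarily the earlier solution referred to above; the data frame `d`, the seed, and the names `num` and `qs` are assumptions made for illustration.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
n <- 100
d <- data.frame(
  x1 = runif(n),
  x2 = runif(n),
  x5 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# logical vector marking which columns are numeric (x5 is FALSE)
num <- sapply(d, is.numeric)

# quantiles for the numeric columns only; the factor column is skipped
qs <- sapply(d[num], quantile, probs = c(0.01, 0.99))
```

Note that for a factor with non-numeric labels such as x5 here, as.numeric(as.character(...)) yields NA with a warning, and quantile() refuses NA values unless na.rm = TRUE, so skipping such columns avoids that problem altogether.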

Michael

On Mon, Oct 31, 2011 at 11:39 AM, andrija djurovic <djandrija at gmail.com> wrote:
> Hi,
> you are almost there:
>
>>sapply(x, function(x) quantile(as.numeric(x), c(0.01, 0.99)))
>          x1          x2       x3        x4 x5 x6
> 1%  0.0351777 0.007628441 0.225533 0.4459064  1  1
> 99% 0.9938919 0.964901423 1.826894 3.6226944  3  2
>
> Andrija
>
> On Mon, Oct 31, 2011 at 2:09 PM, aajit75 <aajit75 at yahoo.co.in> wrote:
>
>> When data contains both factor and numeric variables, how to get quartiles
>> for all numeric variables?
>> n <- 100
>> x1 <- runif(n)
>> x2 <- runif(n)
>> x3 <- x1 + x2 + runif(n)/10
>> x4 <- x1 + x2 + x3 + runif(n)/10
>> x5 <- factor(sample(c('a','b','c'),n,replace=TRUE))
>> x6 <- factor(1*(x5=='a' | x5=='c'))
>> data1 <- cbind(x1,x2,x3,x4,x5,x6)
>> data <- data.frame(data1)
>>
>> data <- within(data,{x5 <- factor(x5)})
>> x <- data
>>
>> qs <- sapply(x, function(x) quantile(x, c(0.01, 0.99)))
>>
>> I get an error: Error in quantile.default(x, c(min_pct, max_pct)) : factors
>> are not allowed
>>
>> Thanks for the help.
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/How-to-get-Quartiles-when-data-contains-both-numeric-variables-and-factors-tp3955750p3955750.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>


