# [R] Interquartile Range

Bert Gunter bgunter.4567 at gmail.com
Wed Apr 20 05:53:26 CEST 2016

IQR returns a single number.

> IQR(rnorm(10))
[1] 1.090168

"I could have used average, min, max, they all would have returned the
same thing."

I can only respond: huh?? Are all your values identical?

You really need to provide a small reproducible example as requested
by the posting guide -- I certainly don't get it, and I'm done
guessing. Maybe others will see what I am missing and say something
useful. I clearly can't.
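
For reference, here is a minimal sketch of the distinction being argued about, written in base R so that it runs without plyr. The toy data frame mirrors the one quoted further down the thread. The helper name `iqr_label` is hypothetical and is not taken from any message here. It shows that `IQR()` yields one number, that `quantile()` yields the two quartiles, and that passing the whole column (as in `data$col`) gives the same label for every group, which is one plausible reading of the "same value repeating" symptom.

```r
# Toy data (assumed for illustration): five groups of sizes 1 to 5
data <- data.frame(groupColumn = rep(1:5, 1:5), col1 = 2^(0:14))

# IQR() returns a single number: the distance between the two quartiles
iqr_all <- IQR(data$col1)

# quantile() returns the quartiles themselves, here a vector of length 2
q_all <- quantile(data$col1, c(0.25, 0.75))

# Hypothetical helper: format the rounded quartiles as a "low-high" string
iqr_label <- function(x) {
  q <- round(quantile(x, c(0.25, 0.75), names = FALSE))
  paste(q[1], q[2], sep = "-")
}

# Per-group labels: each call sees only the values of one group
per_group <- tapply(data$col1, data$groupColumn, iqr_label)

# Whole-column label: identical no matter which group is being summarised
whole_column <- iqr_label(data$col1)

print(per_group)
print(whole_column)
```

Inside `ddply(..., summarise, ...)` the same rule applies: referring to a bare column name uses the current group's values, while `data$col1` always refers to the full data frame.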

On Tue, Apr 19, 2016 at 5:29 PM, Michael Artz wrote:
> Again, IQR returns both a .25 and a .75 value and it failed, which is
> why I didn't use it before. Also, the first function just returns the same
> value repeating. Since they are the same, before the second call, using the
> mode function is just a way to grab one value. I could have used average,
> min, max, they all would have returned the same thing.
On Tue, Apr 19, 2016 at 7:24 PM, Marc Schwartz wrote:
>> Jumping into this thread mainly on the point of the mode of the
>> distribution, while also supporting Bert's comments below on theory.
>>
>> If the vector 'x' that is being passed to this function is an integer
>> vector, then a tabulation of the integers can yield a 'mode', presuming of
>> course that there is only one unique mode. You may have to decide how you
>> want to handle a multi-modal discrete distribution.
>>
>> If the vector 'x' is continuous (e.g. contains floating point values),
>> then a tabulation is going to be problematic for a variety of reasons.
>>
>> In that case, prior discussions on this point have yielded the following
>> estimate of the mode of a continuous distribution:
>>
>> Mode <- function(x) {
>>   D <- density(x)
>>   D$x[which.max(D$y)]
>> }
>>
>> where the second line of the function gets you the value of 'x' at the
>> maximum of the density estimate. Of course, there is still the possibility
>> of a multi-modal distribution and the nuances of which kernel is used, etc.,
>> etc.
>>
>> Food for thought.
>>
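
The two cases described above (tabulation for discrete data, a density estimate for continuous data) can be sketched side by side. The function names `mode_discrete` and `mode_continuous` are hypothetical and chosen only for this illustration. The discrete version returns every tied value, which is just one of several ways to handle a multi-modal sample.

```r
# Hypothetical helper: mode(s) of a discrete vector via tabulation.
# Returns all values tied for the highest count, as character labels.
mode_discrete <- function(x) {
  tab <- table(x)
  names(tab)[as.vector(tab) == max(tab)]
}

# Hypothetical helper: mode of a continuous sample, taken as the location
# of the maximum of a kernel density estimate. The answer depends on the
# kernel and bandwidth, which can be passed through '...' to density().
mode_continuous <- function(x, ...) {
  d <- density(x, ...)
  d$x[which.max(d$y)]
}

discrete_modes <- mode_discrete(c(1, 2, 2, 3, 3))  # both 2 and 3 are tied

set.seed(1)
continuous_mode <- mode_continuous(rnorm(1000, mean = 5))  # roughly 5

print(discrete_modes)
print(continuous_mode)
```

Because `table()` stores the values as names, the discrete result is a character vector; convert it with `as.numeric()` if numbers are needed.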
On Apr 19, 2016, at 7:07 PM, Bert Gunter wrote:
>> >
>> >
>> > Mode <- function(x) {
>> >     tabx <- table(x)
>> >     tabx[which.max(tabx)]
>> > }
>> >
>> > and use R's IQR function instead of yours.
>> >
>> > ... so I still don't get why you want to return a character string
>> > instead of a value for the IQR;
>> > and the mode of a sample defined as above is generally a bad estimator
>> > of the mode of the distribution. To say more than that would take me
>> > too far afield. Post on stats.stackexchange.com if you want to know
>> > why (if it's even relevant).
>> >
>> >
On Tue, Apr 19, 2016 at 4:25 PM, Michael Artz wrote:
>> >> Hi,
>> >>  Here is what I am doing
>> >>
>> >> notGroupedAll <- ddply(data
>> >>                 ,~groupColumn
>> >>                 ,summarise
>> >>                 ,col1_mean=mean(col1)
>> >>                 ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
>> >>                 ,col3_Range=myIqr(col3)
>> >>                 )
>> >>
>> >> groupedAll <- ddply(data
>> >>                 ,~groupColumn
>> >>                 ,summarise
>> >>                 ,col1_mean=mean(col1)
>> >>                 ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
>> >>                 ,col3_Range=Mode(col3)
>> >>                 )
>> >>
>> >> #custom Mode function
>> >> Mode <- function(x) {
>> >>  ux <- unique(x)
>> >>  ux[which.max(tabulate(match(x, ux)))]
>> >> }
>> >>
>> >> #the range function
>> >> myIqr <- function(x) {
>> >>  paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>> >> }
>> >>
On Tue, Apr 19, 2016 at 2:57 PM, William Dunlap wrote:
On Tue, Apr 19, 2016 at 12:31 PM, Bert Gunter wrote:
On Tue, Apr 19, 2016 at 10:30 AM, Bert Gunter wrote:
>> >>>>>> NO NO  -- I am wrong! The paste() expression is of course evaluated.
>> >>>>>> It's just that a character string is returned of the form
>> >>>>>> "something - something".
>> >>>>>>
>> >>>>>> I apologize for the confusion.
>> >>>>>>
>> >>>>>>
On Tue, Apr 19, 2016 at 10:25 AM, Bert Gunter wrote:
>> >>>>>>> To be precise:
>> >>>>>>>
>> >>>>>>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>> >>>>>>>
>> >>>>>>> is an expression that evaluates to a character string:
>> >>>>>>> "round(quantile(x,.25),0) - round(quantile(x,0.75),0)"
>> >>>>>>>
>> >>>>>>> no matter what the argument of your function, x. Hence
>> >>>>>>>
>> >>>>>>> return(paste(...)) will return this exact character string and never
>> >>>>>>> evaluates x.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
On Tue, Apr 19, 2016 at 8:34 AM, William Dunlap via R-help wrote:
>> >>>>>>>>> That didn't work Jim!
>> >>>>>>>>
>> >>>>>>>> It always helps to say how the suggestion did not work.  Jim's
>> >>>>>>>> function had a typo in it - was that the problem?  Or did you not
>> >>>>>>>> change the call to ddply to use that function.  Here is something
>> >>>>>>>> that might "work" for you:
>> >>>>>>>>
>> >>>>>>>> library(plyr)
>> >>>>>>>>
>> >>>>>>>> data <- data.frame(groupColumn=rep(1:5,1:5), col1=2^(0:14))
>> >>>>>>>> myIqr <- function(x) {
>> >>>>>>>>   paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>> >>>>>>>> }
>> >>>>>>>> ddply(data, ~groupColumn, summarise, col1_myIqr=myIqr(col1),
>> >>>>>>>> col1_IQR=stats::IQR(col1))
>> >>>>>>>> #  groupColumn col1_myIqr col1_IQR
>> >>>>>>>> #1           1        1-1        0
>> >>>>>>>> #2           2        2-4        1
>> >>>>>>>> #3           3      12-24       12
>> >>>>>>>> #4           4    112-320      208
>> >>>>>>>> #5           5  2048-8192     6144
>> >>>>>>>>
>> >>>>>>>> The important point is that
>> >>>>>>>>
>> >>>>>>>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>> >>>>>>>>
>> >>>>>>>> is not a function, it is an expression.   ddply wants functions.
>> >>>>>>>>
On Tue, Apr 19, 2016 at 7:56 AM, Michael Artz wrote:
>> >>>>>>>>
>> >>>>>>>>> That didn't work Jim!
>> >>>>>>>>>
>> >>>>>>>>>
On Mon, Apr 18, 2016 at 9:02 PM, Jim Lemon wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hi Michael,
>> >>>>>>>>>> At a guess, try this:
>> >>>>>>>>>>
>> >>>>>>>>>> iqr<-function(x) {
>> >>>>>>>>>>  return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>> >>>>>>>>>> }
On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz wrote:
>> >>>>>>>>>>> Hi,
>> >>>>>>>>>>>  I am trying to show an interquartile range while grouping
>> >>>>>>>>>>> values using the function ddply().  So my function call now is like
>> >>>>>>>>>>>
>> >>>>>>>>>>> groupedAll <- ddply(data
>> >>>>>>>>>>>                 ,~groupColumn
>> >>>>>>>>>>>                 ,summarise
>> >>>>>>>>>>>                 ,col1_mean=mean(col1)
>> >>>>>>>>>>>                 ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
>> >>>>>>>>>>>                 ,col3_Range=paste(as.character(round(quantile(datat$tenure,c(.25)))),
>> >>>>>>>>>>>                                   as.character(round(quantile(data$tenure,c(.75)))), sep = "-")
>> >>>>>>>>>>>                 )
>> >>>>>>>>>>>
>> >>>>>>>>>>> #custom Mode function
>> >>>>>>>>>>> Mode <- function(x) {
>> >>>>>>>>>>>  ux <- unique(x)
>> >>>>>>>>>>>  ux[which.max(tabulate(match(x, ux)))]
>> >>>>>>>>>>> }
>> >>>>>>>>>>>
>> >>>>>>>>>>> I am not sure what is going wrong with my interquartile range
>> >>>>>>>>>>> function; it works on its own outside of ddply()
>> >>>>>>>>>>>
