[R] Interquartile Range

Michael Artz michaeleartz at gmail.com
Wed Apr 20 01:25:22 CEST 2016


Hi,
  Here is what I am doing

notGroupedAll <- ddply(data
                 ,~groupColumn
                 ,summarise
                 ,col1_mean=mean(col1)
                 ,col2_mode=Mode(col2) #Function I wrote for getting the mode, shown below
                 ,col3_Range=myIqr(col3)
                 )

groupedAll <- ddply(data
                 ,~groupColumn
                 ,summarise
                 ,col1_mean=mean(col1)
                 ,col2_mode=Mode(col2) #Function I wrote for getting the mode, shown below
                 ,col3_Range=Mode(col3)
                 )

#custom Mode function
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

#the range function
myIqr <- function(x) {
  paste(round(quantile(x,0.375),0),round(quantile(x,0.625),0),sep="-")
}
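For anyone who wants to run this, here is a minimal self-contained sketch of the first call. The sample data frame and its values are made up for illustration (they are not my real data), and it assumes the plyr package is installed. Mode() and myIqr() are defined before the ddply() call so that they exist when it runs.

```r
library(plyr)

# Hypothetical sample data, invented purely for illustration
data <- data.frame(
  groupColumn = c("a", "a", "a", "b", "b", "b"),
  col1 = c(1, 2, 3, 10, 20, 30),
  col2 = c("x", "x", "y", "z", "y", "z"),
  col3 = c(1, 2, 3, 10, 20, 30),
  stringsAsFactors = FALSE
)

# Most frequent value of x (first one wins on ties)
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Rounded 37.5% and 62.5% quantiles pasted into one "low-high" string
myIqr <- function(x) {
  q <- round(quantile(x, c(0.375, 0.625)), 0)
  paste(q[1], q[2], sep = "-")
}

# One row per group; bare column names refer to the current group's rows
notGroupedAll <- ddply(data, ~groupColumn, summarise,
                       col1_mean = mean(col1),
                       col2_mode = Mode(col2),
                       col3_Range = myIqr(col3))
print(notGroupedAll)
```

Note that col3_Range comes back as a character column, so it is only good for display, not for further arithmetic.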


Here is what I am doing!! :)
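
My best guess at why my very first attempt (quoted at the bottom) misbehaved: inside summarise I wrote data$tenure instead of the bare column name. As far as I understand it, a bare column name is looked up in the current group's rows, while data$... goes back to the whole data frame in my workspace, so every group gets the same answer. The sketch below uses made-up data and hypothetical column names to show the difference; it assumes plyr is installed and that the code is run at top level so that `data` is found in the global environment.

```r
library(plyr)

# Hypothetical data, invented for illustration
data <- data.frame(groupColumn = rep(c("a", "b"), each = 4),
                   col3 = c(1, 2, 3, 4, 100, 200, 300, 400))

res <- ddply(data, ~groupColumn, summarise,
             whole_median = median(data$col3),  # looks up the full data frame
             group_median = median(col3))       # uses only the current group's rows
print(res)
```

The whole_median column should be identical in every row, while group_median changes from group to group.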



On Tue, Apr 19, 2016 at 2:57 PM, William Dunlap <wdunlap at tibco.com> wrote:

> If you show us, not just tell us about, a self-contained example
> someone might show you a non-hacky way of getting the job done.
> (I don't see an argument to plyr::ddply called 'transform'.)
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Tue, Apr 19, 2016 at 12:18 PM, Michael Artz <michaeleartz at gmail.com>
> wrote:
>
>> Oh thanks for that clarification Bert!  Hope you enjoyed your coffee!  I
>> ended up just using the transform argument in the ddply function.  It
>> worked and it repeated, then I called a mode function in another call to
>> ddply that summarised.  Kinda hacky but oh well!
>>
>> On Tue, Apr 19, 2016 at 12:31 PM, Bert Gunter <bgunter.4567 at gmail.com>
>> wrote:
>>
>>> ... and I'm getting another cup of coffee...
>>>
>>> -- Bert
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>>
>>> On Tue, Apr 19, 2016 at 10:30 AM, Bert Gunter <bgunter.4567 at gmail.com>
>>> wrote:
>>> > NO NO  -- I am wrong! The paste() expression is of course evaluated.
>>> > It's just that a character string is returned of the form "something -
>>> > something".
>>> >
>>> > I apologize for the confusion.
>>> >
>>> > -- Bert
>>> >
>>> >
>>> > On Tue, Apr 19, 2016 at 10:25 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>> >> To be precise:
>>> >>
>>> >> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>>> >>
>>> >> is an expression that evaluates to a character string:
>>> >> "round(quantile(x,.25),0) - round(quantile(x,0.75),0)"
>>> >>
>>> >> no matter what the argument of your function, x. Hence
>>> >>
>>> >> return(paste(...)) will return this exact character string and never
>>> >> evaluates x.
>>> >>
>>> >>
>>> >> Cheers,
>>> >> Bert
>>> >>
>>> >>
>>> >> On Tue, Apr 19, 2016 at 8:34 AM, William Dunlap via R-help
>>> >> <r-help at r-project.org> wrote:
>>> >>>> That didn't work Jim!
>>> >>>
>>> >>> It always helps to say how the suggestion did not work.  Jim's
>>> >>> function had a typo in it - was that the problem?  Or did you not
>>> >>> change the call to ddply to use that function.  Here is something
>>> >>> that might "work" for you:
>>> >>>
>>> >>>  library(plyr)
>>> >>>
>>> >>>  data <- data.frame(groupColumn=rep(1:5,1:5), col1=2^(0:14))
>>> >>>  myIqr <- function(x) {
>>> >>>    paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>>> >>>  }
>>> >>>  ddply(data, ~groupColumn, summarise, col1_myIqr=myIqr(col1),
>>> >>>        col1_IQR=stats::IQR(col1))
>>> >>>  #  groupColumn col1_myIqr col1_IQR
>>> >>>  #1           1        1-1        0
>>> >>>  #2           2        2-4        1
>>> >>>  #3           3      12-24       12
>>> >>>  #4           4    112-320      208
>>> >>>  #5           5  2048-8192     6144
>>> >>>
>>> >>> The important point is that
>>> >>>
>>> >>>  paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>>> >>> is not a function, it is an expression.   ddply wants functions.
>>> >>>
>>> >>>
>>> >>> Bill Dunlap
>>> >>> TIBCO Software
>>> >>> wdunlap tibco.com
>>> >>>
>>> >>> On Tue, Apr 19, 2016 at 7:56 AM, Michael Artz <michaeleartz at gmail.com>
>>> >>> wrote:
>>> >>>
>>> >>>> That didn't work Jim!
>>> >>>>
>>> >>>> Thanks anyway
>>> >>>>
>>> >>>> On Mon, Apr 18, 2016 at 9:02 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
>>> >>>>
>>> >>>> > Hi Michael,
>>> >>>> > At a guess, try this:
>>> >>>> >
>>> >>>> > iqr<-function(x) {
>>> >>>> >  return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
>>> >>>> > }
>>> >>>> >
>>> >>>> > .col3_Range=iqr(datat$tenure)
>>> >>>> >
>>> >>>> > Jim
>>> >>>> >
>>> >>>> >
>>> >>>> >
>>> >>>> > On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz <michaeleartz at gmail.com>
>>> >>>> > wrote:
>>> >>>> > > Hi,
>>> >>>> > >   I am trying to show an interquartile range while grouping values using
>>> >>>> > > the function ddply().  So my function call now is like
>>> >>>> > >
>>> >>>> > > groupedAll <- ddply(data
>>> >>>> > >                  ,~groupColumn
>>> >>>> > >                  ,summarise
>>> >>>> > >                  ,col1_mean=mean(col1)
>>> >>>> > >                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
>>> >>>> > >                  ,col3_Range=paste(as.character(round(quantile(datat$tenure,c(.25)))),
>>> >>>> > > as.character(round(quantile(data$tenure,c(.75)))), sep = "-")
>>> >>>> > >                  )
>>> >>>> > >
>>> >>>> > > #custom Mode function
>>> >>>> > > Mode <- function(x) {
>>> >>>> > >   ux <- unique(x)
>>> >>>> > >   ux[which.max(tabulate(match(x, ux)))]
>>> >>>> > > }
>>> >>>> > >
>>> >>>> > > I am not sure what is going wrong in my interquartile range function, it
>>> >>>> > > works on its own outside of ddply()
>>> >>>> > >
>>> >>>> > >         [[alternative HTML version deleted]]
>>> >>>> > >
>>> >>>> > > ______________________________________________
>>> >>>> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> >>>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>>> >>>> > > PLEASE do read the posting guide
>>> >>>> > > http://www.R-project.org/posting-guide.html
>>> >>>> > > and provide commented, minimal, self-contained, reproducible code.
>>> >>>> >
>>> >>>>
>>> >>>>
>>> >>>
>>>
>>
>>
>
