Michael Artz
michaeleartz at gmail.com
Wed Apr 20 06:39:53 CEST 2016
I already found a solution. You suggested I try to find a non-hacky
solution, which was not really my priority. I should have declined
politely, which I will do now. Or, if you just want me to post reproducible
code because you are bored or because you like solving problems, then let me
know and I will accommodate. You have been helpful and I wouldn't mind in
that case. Also, IQR was not a help from the beginning. If it supplies one
value, then it's not even a candidate to be helpful for my problem. I
already talked about the format I was looking for. I don't think I violated
any posting guideline. I asked for help, people pointed me in a
direction, and it helped me. Thanks again, I appreciate it.
On Apr 19, 2016 10:53 PM, "Bert Gunter" <bgunter.4567 at gmail.com> wrote:
> ???
>
> IQR returns a single number.
>
> > IQR(rnorm(10))
> [1] 1.090168
>
> To your 2nd response:
> "I could have used average, min, max, they all would have returned the
> same thing., "
>
> I can only respond: huh?? Are all your values identical?
>
> You really need to provide a small reproducible example as requested
> by the posting guide -- I certainly don't get it, and I'm done
> guessing. Maybe others will see what I am missing and say something
> useful. I clearly can't.
>
> Cheers,
> Bert
>
>
>
>
>
On Tue, Apr 19, 2016 at 5:29 PM, Michael Artz <michaeleartz at gmail.com>
wrote:
> > Again, IQR returns both a .25 and a .75 value and it failed, which is
> > why I didn't use it before. Also, the first function just returns the
> > same value repeating. Since they are the same, before the second call,
> > using the mode function is just a way to grab one value. I could have
> > used average, min, max, they all would have returned the same thing.
> >
> > Mike
> >
On Tue, Apr 19, 2016 at 7:24 PM, Marc Schwartz <marc_schwartz at me.com>
wrote:
> >>
> >> Hi,
> >>
> >> Jumping into this thread mainly on the point of the mode of the
> >> distribution, while also supporting Bert's comments below on theory.
> >>
> >> If the vector 'x' that is being passed to this function is an integer
> >> vector, then a tabulation of the integers can yield a 'mode', presuming of
> >> course that there is only one unique mode. You may have to decide how you
> >> want to handle a multi-modal discrete distribution.
> >>
> >> If the vector 'x' is continuous (e.g. contains floating point values),
> >> then a tabulation is going to be problematic for a variety of reasons.
> >>
> >> In that case, prior discussions on this point have yielded the following
> >> estimation of the mode of a continuous distribution by using:
> >>
> >> Mode <- function(x) {
> >>   D <- density(x)
> >>   D$x[which.max(D$y)]
> >> }
> >>
> >> where the second line of the function gets you the value of 'x' at the
> >> maximum of the density estimate. Of course, there is still the possibility
> >> of a multi-modal distribution and the nuances of which kernel is used,
> >> etc., etc.
> >>
> >> Food for thought.
> >>
> On Apr 19, 2016, at 7:07 PM, Bert Gunter <bgunter.4567 at gmail.com>
wrote:
> >> >
> >> > Well, instead of your functions try:
> >> >
> >> > Mode <- function(x) {
> >> >   tabx <- table(x)
> >> >   tabx[which.max(tabx)]
> >> > }
> >> >
> >> > and use R's IQR function instead of yours.
> >> >
> >> > ... so I still don't get why you want to return a character string
> >> > instead of a value for the IQR;
> >> > and the mode of a sample defined as above is generally a bad estimator
> >> > of the mode of the distribution. To say more than that would take me
> >> > too far afield. Post on stats.stackexchange.com if you want to know
> >> > why (if it's even relevant).
> >> >
> >> > wrote:
> >> >> Hi,
> >> >> Here is what I am doing
> >> >>
> >> >> notGroupedAll <- ddply(data
> >> >>                  ,~groupColumn
> >> >>                  ,summarise
> >> >>                  ,col1_mean=mean(col1)
> >> >>                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode, shown below
> >> >>                  ,col3_Range=myIqr(col3)
> >> >>                  )
> >> >>
> >> >> groupedAll <- ddply(data
> >> >>                  ,~groupColumn
> >> >>                  ,summarise
> >> >>                  ,col1_mean=mean(col1)
> >> >>                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode, shown below
> >> >>                  ,col3_Range=Mode(col3)
> >> >>                  )
> >> >>
> >> >> #custom Mode function
> >> >> Mode <- function(x) {
> >> >>   ux <- unique(x)
> >> >>   ux[which.max(tabulate(match(x, ux)))]
> >> >> }
> >> >>
> >> >> #the range function
> >> >> myIqr <- function(x) {
> >> >>   paste(round(quantile(x,0.375),0),round(quantile(x,0.625),0),sep="-")
> >> >> }
> >> >>
> >> >>
> >> >> Here is what I am doing!! :)
> >> >>
> >> >>
> >> >>
On Tue, Apr 19, 2016 at 2:57 PM, William Dunlap <wdunlap at tibco.com>
wrote:
> >> >>>
> >> >>> If you show us, not just tell us about, a self-contained example
> >> >>> someone might show you a non-hacky way of getting the job done.
> >> >>> (I don't see an argument to plyr::ddply called 'transform'.)
> >> >>>
> >> >>> Bill Dunlap
> >> >>> TIBCO Software
> >> >>> wdunlap tibco.com
> >> >>>
On Tue, Apr 19, 2016 at 12:18 PM, Michael Artz
<michaeleartz at gmail.com>
wrote:
> >> >>>>
> >> >>>> Oh thanks for that clarification Bert! Hope you enjoyed your coffee!
> >> >>>> I ended up just using the transform argument in the ddply function.
> >> >>>> It worked and it repeated, then I called a mode function in another
> >> >>>> call to ddply that summarised. Kinda hacky but oh well!
> >> >>>>
On Tue, Apr 19, 2016 at 12:31 PM, Bert Gunter
<bgunter.4567 at gmail.com>
wrote:
> >> >>>>>
On Tue, Apr 19, 2016 at 10:30 AM, Bert Gunter
<bgunter.4567 at gmail.com>
wrote:
> >> >>>>>> NO NO -- I am wrong! The paste() expression is of course evaluated.
> >> >>>>>> It's just that a character string is returned of the form
> >> >>>>>> "something - something".
> >> >>>>>>
> >> >>>>>> I apologize for the confusion.
> >> >>>>>>
On Tue, Apr 19, 2016 at 10:25 AM, Bert Gunter
<bgunter.4567 at gmail.com>
wrote:
> >> >>>>>>> To be precise:
> >> >>>>>>>
> >> >>>>>>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >> >>>>>>>
> >> >>>>>>> is an expression that evaluates to a character string:
> >> >>>>>>> "round(quantile(x,.25),0) - round(quantile(x,0.75),0)"
> >> >>>>>>>
> >> >>>>>>> no matter what the argument of your function, x. Hence
> >> >>>>>>>
> >> >>>>>>> return(paste(...)) will return this exact character string and
> >> >>>>>>> never evaluates x.
> >> >>>>>>>
> >> >>>>>>>
On Tue, Apr 19, 2016 at 8:34 AM, William Dunlap via R-help
<r-help at r-project.org> wrote:
> >> >>>>>>>>> That didn't work Jim!
> >> >>>>>>>>
> >> >>>>>>>> It always helps to say how the suggestion did not work. Jim's
> >> >>>>>>>> function had a typo in it - was that the problem? Or did you not
> >> >>>>>>>> change the call to ddply to use that function? Here is something
> >> >>>>>>>> that might "work" for you:
> >> >>>>>>>>
> >> >>>>>>>> library(plyr)
> >> >>>>>>>>
> >> >>>>>>>> data <- data.frame(groupColumn=rep(1:5,1:5), col1=2^(0:14))
> >> >>>>>>>> myIqr <- function(x) {
> >> >>>>>>>>   paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >> >>>>>>>> }
> >> >>>>>>>> ddply(data, ~groupColumn, summarise, col1_myIqr=myIqr(col1),
> >> >>>>>>>> col1_IQR=stats::IQR(col1))
> >> >>>>>>>> #  groupColumn col1_myIqr col1_IQR
> >> >>>>>>>> #1           1        1-1        0
> >> >>>>>>>> #2           2        2-4        1
> >> >>>>>>>> #3           3      12-24       12
> >> >>>>>>>> #4           4    112-320      208
> >> >>>>>>>> #5           5  2048-8192     6144
> >> >>>>>>>>
> >> >>>>>>>> The important point is that
> >> >>>>>>>>
> >> >>>>>>>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >> >>>>>>>>
> >> >>>>>>>> is not a function, it is an expression. ddply wants functions.
> >> >>>>>>>>
> >> >>>>>>>>
On Tue, Apr 19, 2016 at 7:56 AM, Michael Artz
<michaeleartz at gmail.com>
wrote:
> >> >>>>>>>>
> >> >>>>>>>>> That didn't work Jim!
> >> >>>>>>>>>
> >> >>>>>>>>> Thanks anyway
> >> >>>>>>>>>
On Mon, Apr 18, 2016 at 9:02 PM, Jim Lemon
<drjimlemon at gmail.com>
wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> Hi Michael,
> >> >>>>>>>>>> At a guess, try this:
> >> >>>>>>>>>>
> >> >>>>>>>>>> iqr<-function(x) {
> >> >>>>>>>>>>  return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >> >>>>>>>>>> }
> >> >>>>>>>>>>
> >> >>>>>>>>>> .col3_Range=iqr(datat$tenure)
> >> >>>>>>>>>>
> >> >>>>>>>>>>
On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz
<michaeleartz at gmail.com>
wrote:
> >> >>>>>>>>>>> Hi,
> >> >>>>>>>>>>> I am trying to show an interquartile range while grouping values
> >> >>>>>>>>>>> using the function ddply(). So my function call now is like
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> groupedAll <- ddply(data
> >> >>>>>>>>>>>                  ,~groupColumn
> >> >>>>>>>>>>>                  ,summarise
> >> >>>>>>>>>>>                  ,col1_mean=mean(col1)
> >> >>>>>>>>>>>                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode, shown below
> >> >>>>>>>>>>>                  ,col3_Range=paste(as.character(round(quantile(datat$tenure,c(.25)))),
> >> >>>>>>>>>>>                     as.character(round(quantile(data$tenure,c(.75)))), sep = "-")
> >> >>>>>>>>>>>                  )
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> #custom Mode function
> >> >>>>>>>>>>> Mode <- function(x) {
> >> >>>>>>>>>>>   ux <- unique(x)
> >> >>>>>>>>>>>   ux[which.max(tabulate(match(x, ux)))]
> >> >>>>>>>>>>> }
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> I am not sure what is going wrong in my interquartile range
> >> >>>>>>>>>>> function; it works on its own outside of ddply().
> >> >>>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>
> >> >>>>
> >> >>>
> >> >>
> >> >
> >>
> >
>
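
For reference, the pieces discussed in this thread can be combined roughly
as in the sketch below. It is only an illustration under stated
assumptions: it uses base R (split() and lapply()) as a stand-in for
plyr::ddply(), and the data frame dat and the helper names mode_discrete,
mode_continuous, and quartile_label are hypothetical, not code posted by
anyone in the thread.

```r
# Discrete mode: the most frequent value (the first one wins on ties).
mode_discrete <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Continuous mode estimate: location of the peak of a kernel density
# estimate, using density() with its default kernel and bandwidth.
mode_continuous <- function(x) {
  d <- density(x)
  d$x[which.max(d$y)]
}

# "lower-upper" label built from the 25th and 75th percentiles.
quartile_label <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  paste(round(q[1]), round(q[2]), sep = "-")
}

# Hypothetical data, invented for this example.
dat <- data.frame(
  groupColumn = rep(c("a", "b"), each = 6),
  col2 = c(1, 1, 2, 3, 3, 3, 5, 5, 5, 6, 7, 7),
  col3 = c(10, 12, 14, 16, 18, 20, 100, 110, 120, 130, 140, 150)
)

# One summary row per group; each helper sees only that group's values.
summary_by_group <- do.call(rbind, lapply(
  split(dat, dat$groupColumn),
  function(g) {
    data.frame(
      groupColumn = g$groupColumn[1],
      col2_mode = mode_discrete(g$col2),
      col3_range = quartile_label(g$col3),
      stringsAsFactors = FALSE
    )
  }
))
print(summary_by_group)

# The density-based estimate applied to simulated continuous data.
set.seed(1)
peak <- mode_continuous(rnorm(200, mean = 10))
print(peak)
```

The key point is that each helper is a function of the per-group column,
so it is evaluated once per group rather than once on the whole data frame.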
