[R] Testing for Inequality à la "select case"

Mon Mar 16 21:00:44 CET 2009

On Sun, Mar 15, 2009 at 11:46 PM, diegol <diegol81 at gmail.com> wrote:
> ...Steve, ...

Actually Stavros (ΣΤΑΥΡΟΣ), not Stephen/Steve (ΣΤΕΦΑΝΟΣ).  Both Greek,
but different names.

> I still don't understand the analogy. I agree that in this case the R
> approach is vectorized. However, your function just as you first proposed it
> will not work without a loop.

The approach is vectorized over the range parameter, but not over
the x parameter.  If you want to vectorize over x, you can use
findInterval:

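mr <-
 local({
   # Local constants: bracket breakpoints, per-bracket percentage,
   # and per-bracket minimum
   range <- c(0,20,100,250,700,1000,Inf)*1000
   perc  <- c(65,40,30,25,20,0)/100
   min   <- c(0,14,40,75,175,250)*1000

   function(x)
     { # idx[i] is the bracket with range[idx[i]] <= x[i] < range[idx[i]+1]
       idx <- findInterval(x,range)
       # element-wise maximum of the percentage amount and the bracket minimum
       pmax( x*perc[idx], min[idx] )
     }
 })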
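
To see which bracket findInterval picks for each element of x, here is
a small sketch using the same breakpoints as above.  The name brk and
the sample values in x are made up for illustration.

```r
# Same breakpoints as in mr(); `brk` is an illustrative name
brk <- c(0, 20, 100, 250, 700, 1000, Inf) * 1000

# Made-up sample inputs
x <- c(5000, 20000, 150000, 2e6)

# For each x[i], findInterval returns the index j such that
# brk[j] <= x[i] < brk[j + 1]
idx <- findInterval(x, brk)
print(idx)
```

Note that with the default arguments the intervals are closed on the
left, so a value exactly on a breakpoint (20000 here) lands in the
upper bracket.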

And this time, you *do* need pmax.  I did refer to cut/split, but only
to say they were unnecessary.
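
A quick sketch of why max is no longer enough once x is a vector.  The
vectors a and b are made-up stand-ins for x*perc[idx] and min[idx].

```r
# Illustrative stand-ins for the two arguments passed to pmax in mr()
a <- c(3250, 8000, 45000)   # e.g. x * perc[idx]
b <- c(0, 14000, 40000)     # e.g. min[idx]

# max collapses everything to a single number
overall <- max(a, b)

# pmax compares element by element and keeps the input length
pointwise <- pmax(a, b)

print(overall)
print(pointwise)
```

With max, the result has length 1 regardless of the length of x; pmax
returns one value per element.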

          -s

>> max and pmax are equivalent in this case.  I just use pmax as my
>> default because it acts like other arithmetic operators (+, *, etc.)
>> which perform pointwise (element-by-element) operations.
>
> It's true. I changed it because I had applied your original version of mr()
> to the entire vector x, which gave an incorrect result (perhaps "range" was
> recycled in "idx <- which(x<=range)[1]"). If I used max instead of pmax, and
> ever happened to use mr() without a loop, the length of the result would be
> strange enough for me to realise the error. But then again, I added the "if
> (length(x) >1) stop("x must have length 1")" line, so using max or pmax now
> doesn't really make a difference, apart perhaps from run time.
>
>> Using cut/split seems like gross overkill here.  Among other things,
>> you don't need to generate labels for all the different ranges.
>>
>>  which(x<=range)[1]
>> seems straightforward enough to me,
>
> I could edit the mr_2() function a little bit to make it more efficient. I
> left it mostly unchanged so the thread would be easier to follow. For example,
> I could replace the last four lines with just:
>
>    product <- x*percent
>    ifelse(product< minimum, minimum, product)
>
> But I believe you were referring to the cut/split functions instead. I agree
> that "which(x<=range)[1]" is straightforward, but using such an expression
> will require a loop to do the trick, which I don't intend. Am I missing
> something?
>
>
> Regards,
> Diego
>
>
>
> Stavros Macrakis-2 wrote:
>>
>> Using cut/split seems like gross overkill here.  Among other things,
>> you don't need to generate labels for all the different ranges.
>>
>>    which(x<=range)[1]
>>
>> seems straightforward enough to me, but you could also use the
>> built-in function findInterval.
>>
>>               -s
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
> -----
> ~~~~~~~~~~~~~~~~~~~~~~~~~~
> Diego Mazzeo
> Actuarial Science Student
> Facultad de Ciencias Económicas
> Universidad de Buenos Aires
> Buenos Aires, Argentina
> --
> View this message in context: http://www.nabble.com/Testing-for-Inequality-%C3%A0-la-%22select-case%22-tp22527465p22531513.html
> Sent from the R help mailing list archive at Nabble.com.