[R] Integer division

Tue Dec 20 17:57:55 CET 2022

Documentation specifics aside (and I am not convinced documentation is the issue here), programmers are responsible for checking how routines like this behave by testing them on small samples and seeing whether the results match expectations.

Since negative numbers were possible, that would have been part of such tests.

There are many ways to do things, and the method chosen does not strike me as a particularly good way of finding the first digit unless you are guaranteed to have exactly five digits. It may be efficient, but it is likely to fail whenever the data is not as expected, such as values with more than five digits or entries that are not numbers.

So many programmers would first filter the data to check for such conditions. Some might simply convert the numbers to character strings (perhaps using the absolute value of each number) and look at the first character, or handle the value differently if that character is a minus sign.
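As a rough sketch of the character-string approach, something like the following might do. The name first_digit_chr is made up for illustration, and treating every negative value as 'missing' is my assumption based on the -1 code described in the original post:

```r
# Hypothetical helper: take the first digit by converting to character.
# Assumption: negative values (such as the -1 code) mean 'missing'.
first_digit_chr <- function(x) {
  s <- as.character(abs(x))            # drop the sign before converting
  d <- as.integer(substr(s, 1, 1))     # first character as an integer
  d[is.na(x) | x < 0] <- NA            # flag missing or negative input
  d
}

first_digit_chr(c(12345, 98765, -1, NA))
```

Unlike dividing by 10000, this does not depend on the codes having exactly five digits, though it still assumes the input is numeric.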

Many programming languages provide families of functions for tasks that can be done in several ways with somewhat different results. There is absolutely NO reason to assume that any one member of such a family will do what you expect, so you may need to explore similar functions or write your own.
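To illustrate with the example from this thread, here is a small comparison of base R functions that each turn a quotient into a whole number in a different way. The variable name q is just for illustration:

```r
q <- c(-8, 8) / 3    # roughly -2.67 and 2.67

floor(q)             # -3  2  (rounds down; matches c(-8, 8) %/% 3)
trunc(q)             # -2  2  (drops the fraction, i.e. toward zero)
round(q)             # -3  3  (nearest whole number)
c(-8, 8) %/% 3       # -3  2
```

The results differ only for some inputs, which is exactly why a quick test with negative values is worth the trouble.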

Something as simple as this might give you what you want:

first_of_five <- function(numb) abs(numb) %/% 10000
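
A quick check on a small sample, as a sketch only. The vector x is made-up data, and recoding -1 to NA is my assumption about how the 'missing value' code might be handled:

```r
first_of_five <- function(numb) abs(numb) %/% 10000

x <- c(12345, 98765, -1)   # made-up sample containing the -1 code
first_of_five(x)           # 1 9 0  (the -1 code shows up as 0, not -1)

# Alternative: recode the missing-value marker first, then divide.
x[x == -1] <- NA
x %/% 10000                # 1 9 NA
```

Note that with abs() the missing code silently becomes 0, so recoding to NA may be the safer choice if 0 could be mistaken for real data.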

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Göran Broström
Sent: Tuesday, December 20, 2022 1:53 AM
To: Richard O'Keefe <raoknz using gmail.com>
Cc: r-help using r-project.org
Subject: Re: [R] Integer division

Thanks Richard,

the "rounding claim" was my mistake (as I replied to Martin); I should have said "truncates toward zero", as you explain.

However, my point was that these two mathematical functions should be defined in the documentation, as you also say. And I was surprised that there is no consensus regarding the definition of such elementary functions.

Göran

On 2022-12-20 03:01, Richard O'Keefe wrote:
> The Fortran '08 standard says <<
> One operand of type integer may be divided by another operand of type 
> integer. Although the mathematical quotient of two integers is not 
> necessarily an integer, Table 7.2 specifies that an expression 
> involving the division operator with two operands of type integer is 
> interpreted as an expression of type integer. The result of such an 
> operation is the integer closest to the mathematical quotient and 
> between zero and the mathematical quotient inclusively. >> Another way 
> to say this is that integer division in Fortran TRUNCATES towards 
> zero.  It does not round and never has.
> 
> C carefully left the behaviour of integer division (/) unspecified, 
> but introduced the div(,) function with the same effect as Fortran 
> (/).  Later versions of the C standard tightened this up, and the C17 
> standard reads << The result of the / operator is the quotient from 
> the division of the first operand by the second; the result of the % 
> operator is the remainder. In both operations, if the value of the 
> second operand is zero, the behavior is undefined.
> When integers are divided, the result of the / operator is the 
> algebraic quotient with any fractional part discarded. 107) If the 
> quotient a/b is representable, the expression (a/b)*b + a%b shall 
> equal a ; otherwise, the behavior of both a/b and a%b is undefined.>>
> 
> That is, C17 TRUNCATES the result of division towards zero.  I don't 
> know of any C compiler that rounds, certainly gcc does not.
> 
> 
> The Java 15 Language Specification says << Integer division rounds 
> toward 0. >> which also specifies truncating division.
> 
> 
> The help for ?"%/%" does not say what the result is.
> Or if it does, I can't find it.  Either way, this is a defect in the 
> documentation.  It needs to be spelled out very clearly.
> R version 4.2.2 Patched (2022-11-10 r83330) -- "Innocent and Trusting"
>  > c(-8,8) %/% 3
> [1] -3  2
> so we deduce that R *neither* rounds *nor* truncates, but returns the 
> floor of the quotient.
> It is widely argued that flooring division is more generally useful 
> than rounding or truncating division, but it is admittedly surprising.
> 
> On Tue, 20 Dec 2022 at 02:51, Göran Broström <gb using ehar.se> wrote:
> 
>     I have a long vector x with five-digit codes where the first digit of
>     each is of special interest, so I extracted them through
> 
>       > y <- x %/% 10000
> 
>     but to my surprise y contained the value -1 in some places. It turned
>     out that x contains -1 as a symbol for 'missing value' so in effect I
>     found that
> 
>       > -1 %/% 10000 == -1
> 
>     Had to check the help page for "%/%", and the first relevant comment I
>     found was:
> 
>     "Users are sometimes surprised by the value returned".
> 
>     No surprise there. Further down:
> 
>     ‘%%’ indicates ‘x mod y’ (“x modulo y”) and ‘%/%’ indicates
>            integer division.  It is guaranteed that
> 
>            ‘ x == (x %% y) + y * (x %/% y) ’ (up to rounding error)
> 
>     I did expect  (a %/% b) to return round(a / b), like gfortran and gcc,
>     but instead I get floor(a / b) in R. What is the reason for these
>     different definitions? And shouldn't R's definition be documented?
> 
>     Thanks, Göran
> 
>     ______________________________________________
>     R-help using r-project.org <mailto:R-help using r-project.org> mailing list --
>     To UNSUBSCRIBE and more, see
>     https://stat.ethz.ch/mailman/listinfo/r-help
>     <https://stat.ethz.ch/mailman/listinfo/r-help>
>     PLEASE do read the posting guide
>     http://www.R-project.org/posting-guide.html
>     <http://www.R-project.org/posting-guide.html>
>     and provide commented, minimal, self-contained, reproducible code.
>
