[R] odd behaviour of identical

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Sat Nov 1 22:57:38 CET 2008


Patrick Burns wrote:
> Wacek Kusnierczyk wrote:
>> smells bad design.
>>   
>
> Nonsense.

not really, i'm afraid.

> One of the key design features of R is that it
> hides implementation details from users.  They
> are free to think about the substantive issues with
> their data rather than worrying about computational
> trivia.
oops.  i wish you were right; unfortunately, you seem not to be.  this
might have been the idea, and indeed most often one does not need to
think in terms of the underlying representation, but i have just given
examples of how this abstraction barrier is broken, and these are not
the only ones.  for the sake of completeness, i'll try to make the point
clear once again below.

on the side, when i once complained here that is.integer(1) evaluates to
FALSE, the (unnecessarily arrogant, and beside the point) response was
that one has to understand how computations are actually done in terms
of the underlying representations.  irrespective of whether or not i
understand that, demanding that a user does directly contradicts the
'key design feature' you mention above.

if you read jim's response below carefully enough, you'll find another
sign that your 'nonsense' is itself nonsense.
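
for the record, the difference between my original example and jim's fix
can be made visible with typeof().  this is only an illustrative sketch;
the names y_double and y_int are mine and do not appear elsewhere in the
thread.

```r
x <- 1:10

y_double <- 1:10
y_double[11] <- 11         # 11 is a double, so the whole vector is coerced to double
y_double <- y_double[1:10]

y_int <- 1:10
y_int[11] <- 11L           # 11L is an integer, so the vector stays integer
y_int <- y_int[1:10]

typeof(x)                  # "integer"
typeof(y_double)           # "double"
typeof(y_int)              # "integer"
identical(x, y_double)     # FALSE
identical(x, y_int)        # TRUE
```

the two y vectors print identically, which is exactly why the result of
identical() looks odd unless one inspects the storage type.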

to the point:

is.integer(1) # FALSE
is.integer(1:1) # TRUE

is not particularly appealing as a design, though it can be defended
along the lines that : uses 1 as the increment, thus if it starts at
an integer, the vector contains integers (just like it says in
help(":")).  the problem here is that 1:1 is *obviously* an integer
vector, while 1 is *obviously* a numeric vector, but *obviously* not an
integer vector.  run a poll on how obvious that is to r's users; i'd bet
you'd lose.

with an abstraction barrier in place, is.integer should mean 'is the
number an integer?', and not 'is the number stored in an integer type?'
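
a value-based test of the kind described above could look roughly like
this.  it is a minimal sketch, not part of base r as far as this post is
concerned: the helper name is_whole and its tolerance argument are
hypothetical, and NA handling is deliberately left out.

```r
# hypothetical helper: TRUE if every element of a numeric vector is a whole number
is_whole <- function(x, tol = .Machine$double.eps^0.5) {
  is.numeric(x) && all(abs(x - round(x)) < tol)
}

is.integer(1)    # FALSE: 1 is stored as a double
is_whole(1)      # TRUE: the value is a whole number
is_whole(1:3)    # TRUE
is_whole(1.5)    # FALSE
```

the tolerance is there only so that values that are whole up to rounding
error still pass; with tol = 0 the test becomes exact.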

with an abstraction barrier in place, identical(1:2, c(1,2)) would
evaluate to TRUE. 
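
to see what actually happens today, and what has to be done by hand to
get the answer one would expect, consider the following sketch (the
variable names a and b are only for illustration):

```r
a <- 1:2        # integer vector
b <- c(1, 2)    # double vector

identical(a, b)                # FALSE: the storage types differ
isTRUE(all.equal(a, b))        # TRUE: the values agree
all(a == b)                    # TRUE: elementwise comparison of the values
identical(a, as.integer(b))    # TRUE once the types are made to match
identical(as.numeric(a), b)    # TRUE
```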

with consistent typing, both sqrt(as.integer(2)) and sqrt(-2) would be
implicitly cast and return a usable result (as in, e.g., sage), OR both
would result in an error (as in a language without implicit typecasting,
e.g., f#).

etc. etc. 
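
the sqrt point can be checked directly.  this is a small sketch of the
current behaviour as i understand it; the names r1, r2 and r3 are
placeholders, and the warning from sqrt(-2) is suppressed here only to
keep the example quiet.

```r
r1 <- sqrt(as.integer(2))           # the integer argument is silently promoted
typeof(r1)                          # "double"

r2 <- suppressWarnings(sqrt(-2))    # no promotion to complex here
is.nan(r2)                          # TRUE

r3 <- sqrt(as.complex(-2))          # the promotion has to be requested by hand
is.complex(r3)                      # TRUE
```

so the int-to-double cast is implicit, while the double-to-complex cast
is not, which is the inconsistency i mean.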


furthermore,

is.integer(1.0:1.0) # TRUE
is.integer(1.0:1.5) # TRUE

is not particularly appealing as a design, and the second form does not
even agree with what help(":") says (please do read it).  one of the
responses made here to the complaint that is.integer(1) is not TRUE was
that you need to explicitly say you want 1 to be represented as an
integer (note, 'integer' here means the underlying representation, not
what a user assuming an abstraction barrier would normally think
'integer' means).  now, 1.0 would be *explicitly* saying 'not integer',
no?  then 1.0:1.0 (and 1.0:1.1, etc.) perform an explicit downcast (from
double to int), which is a rather bad idea.  and it's not an
implementation detail, it's the design.
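
the storage type that : picks can be inspected with typeof().  the
sketch below reflects the behaviour described above; that other r
versions behave identically is an assumption on my part.

```r
typeof(1.0:1.0)    # "integer"
typeof(1.0:1.5)    # "integer": the result is the single value 1
typeof(1.5:2.5)    # "double": a start that is not integer-valued keeps doubles
typeof(c(1.0))     # "double": the same literal outside of : stays a double
```

in other words, the type of the result depends on the *value* of the
left operand, not on how the user wrote it.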

>
> There may have been some, but I don't recall a
> time when I've ever needed to know if a vector in R
> was integer or double.
>

well, if it doesn't really matter, make is.integer(1) evaluate to TRUE. 
as the previous communication should convince you, there are people who
get confused by the opposite result.  they're right, and the response
should not be that they need to learn the implementation details.

i hope the next time you say 'nonsense', you'll first make an effort to
distinguish dreams from reality.  it is true that "one of the key design
features of R *should be* that it hides implementation details from
users.  they *should be* free to think about the substantive issues with
their data rather than worrying about computational trivia."  but
instead there is a mess which forces the users to learn the details to
understand the odd behaviour of their programs.

vQ



>
> Patrick Burns
> patrick at burns-stat.com
> +44 (0)20 8525 0696
> http://www.burns-stat.com
> (home of S Poetry and "A Guide for the Unwilling S User")
>> jim holtman wrote:
>>  
>>> If you want them to be identical, then you have to explicitly assign
>>> an integer to the vector so that conversion is not done:
>>>
>>> > x = 1:10
>>> > y = 1:10
>>> >
>>> > all.equal(x,y)
>>> [1] TRUE
>>> > identical(x,y)
>>> [1] TRUE
>>> > y[11] = 11L
>>> > y = y[1:10]
>>> >
>>> > all.equal(x,y)
>>> [1] TRUE
>>> > identical(x,y)
>>> [1] TRUE
>>>
>>> On Sun, Oct 26, 2008 at 6:39 PM, Wacek Kusnierczyk
>>> <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>>>      
>>>> given what ?identical says, i find the following odd:
>>>>
>>>> x = 1:10
>>>> y = 1:10
>>>>
>>>> all.equal(x,y)
>>>> [1] TRUE
>>>> identical(x,y)
>>>> [1] TRUE
>>>>
>>>> y[11] = 11
>>>> y = y[1:10]
>>>>
>>>> all.equal(x,y)
>>>> [1] TRUE
>>>> identical(x,y)
>>>> [1] FALSE
>>>>
>>>> y
>>>> [1] 1 2 3 4 5 6 7 8 9 10
>>>> length(y)
>>>> [1] 10
>>>>
>>>>
>>>> looks like a bug.
>>>>
>>>> platform       i686-pc-linux-gnu
>>>> arch           i686
>>>> os             linux-gnu
>>>> system         i686, linux-gnu
>>>> status
>>>> major          2
>>>> minor          7.0
>>>> year           2008
>>>> month          04
>>>> day            22
>>>> svn rev        45424
>>>> language       R
>>>> version.string R version 2.7.0 (2008-04-22)
>>>>
>>>>
>>>> vQ
>>>>
>>>> -- 
>>>> -------------------------------------------------------------------------------
>>>>
>>>> Wacek Kusnierczyk, MD PhD
>>>>
>>>> Email: waku at idi.ntnu.no
>>>> Phone: +47 73591875, +47 72574609
>>>>
>>>> Department of Computer and Information Science (IDI)
>>>> Faculty of Information Technology, Mathematics and Electrical
>>>> Engineering (IME)
>>>> Norwegian University of Science and Technology (NTNU)
>>>> Sem Saelands vei 7, 7491 Trondheim, Norway
>>>> Room itv303
>>>>
>>>> Bioinformatics & Gene Regulation Group
>>>> Department of Cancer Research and Molecular Medicine (IKM)
>>>> Faculty of Medicine (DMF)
>>>> Norwegian University of Science and Technology (NTNU)
>>>> Laboratory Center, Erling Skjalgsons gt. 1, 7030 Trondheim, Norway
>>>> Room 231.05.060
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>           
>>>
>>>       
>>
>>
>>   




