[R] Integer vs numeric

cgenolin at u-paris10.fr cgenolin at u-paris10.fr
Wed Jan 30 15:28:28 CET 2008


Ok, I get your point.

On the other hand, R is not only for high-level programmers. At a low
level, the fact that ":" changes the type is strange. Would it not be
possible to define two operators? A "::" that would be used only for
indexing and would be integer (for efficiency), and a ":" that would be
used as a shortcut for number generation and would be numeric?
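
To make the type change concrete, here is a minimal sketch (the variable
names a, b and x are only illustrative). It shows that ":" with
whole-number endpoints yields an integer vector while c() with numeric
literals yields doubles, and that both forms select the same elements
when used as indices:

```r
# typeof() reports the storage mode of each vector.
a <- 1:3          # ":" with whole-number endpoints
b <- c(1, 2, 3)   # numeric literals are stored as doubles

typeof(a)         # "integer"
typeof(b)         # "double"

# The values compare equal, but the objects are not identical
# because their storage modes differ.
all(a == b)       # TRUE
identical(a, b)   # FALSE

# Both index forms select the same elements of x.
x <- c(10, 20, 30, 40)
identical(x[2:3], x[c(2, 3)])   # TRUE
```

So the difference is invisible for everyday indexing and only shows up
where the storage mode itself is checked.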

Christophe

> On Jan 29, 2008 10:40 PM, Christophe Genolini <cgenolin at u-paris10.fr> wrote:
>> x[c(2,4)] works as well
>
> My point is that at the native-code level subsetting/enumeration
> is done by integer indices, and coercion from double to integer is
> always going to be less efficient than working directly with integers.
> That's likely to be one of the rationales for 1:n being integers (in
> addition to being smaller in size).
>
> Also, the coercion as.integer(xs) where xs is a vector of doubles will
> all in all take up *three* times the memory compared with
> object.size(xs) and just add extra work to the garbage collector.
>
> Finally, working with doubles is not precision safe (there are many
> threads with various flavors on the same topic).  Example:
>
>> xs <- seq(1,1.2,by=0.01);
>> print(xs);
> [1] 1.00 1.01 1.02 1.03 1.04 1.05 1.06
> [8] 1.07 1.08 1.09 1.10 1.11 1.12 1.13
> [15] 1.14 1.15 1.16 1.17 1.18 1.19 1.20
>
>> ys <- as.integer(100*xs);
>> print(ys);
> [1] 100 101 102 103 104 105 106 107 108
> [10] 109 110 111 112 112 114 114 115 117
> [19] 118 119 120 121 122 123 124 125 126
> [28] 127 128 129 130
>
> Pay attention to elements 13:18(!) - subsetting using doubles is not safe.
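
A minimal sketch of why this happens and one common way around it,
assuming the same xs as in the example above (the names scaled,
truncated and rounded are only illustrative). as.integer() truncates
toward zero, so a product that lands a hair below a whole number loses
a full unit; rounding before the conversion avoids that:

```r
xs <- seq(1, 1.2, by = 0.01)
scaled <- 100 * xs

# as.integer() truncates toward zero, so a value just below a whole
# number is mapped to the integer underneath it.
truncated <- as.integer(scaled)

# Rounding first snaps each value to the nearest whole number.
rounded <- as.integer(round(scaled))

# Positions where plain truncation and round-then-convert disagree.
which(truncated != rounded)

# After rounding, the result is exactly the integer sequence 100:120.
identical(rounded, 100:120)   # TRUE
```

The exact positions that disagree depend on the floating-point results
on a given platform, so they are printed here rather than stated.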
>
> /H
>
>>
>> Henrik Bengtsson wrote:
>>
>> > x[1:n]
>> >
>> > /H
>> >
>> > On Jan 29, 2008 5:07 AM,  <cgenolin at u-paris10.fr> wrote:
>> >
>> >> Seems strange to me to define an operator relative to a very
>> >> special case.
>> >> I have to admit that I do not use 1:1e7 every day :-)
>> >>
>> >> Wouldn't it be more appropriate to define a numeric a:b operator (that
>> >> is, one preserving the initial class of a and b) and, in the specific
>> >> cases that need optimization, to change the type?
>> >>
>> >> for (i in as.integer(1:1e7))
>> >>
>> >> That might appear to be a minor point, but when using S4, as far as I
>> >> know, if you define a class that can take either 1:3 or c(1,3,4), one
>> >> is integer, the other numeric, and one of those will not be accepted
>> >> by the class...
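
A minimal sketch of how to check this, using two hypothetical classes
(NumHolder and IntHolder are made-up names, not from any package). As
far as I can tell, in current versions of R a slot declared "numeric"
accepts both integer and double vectors, whereas a slot declared
"integer" rejects doubles, so the outcome depends on how the slot is
declared:

```r
library(methods)

# Hypothetical classes, used only to probe slot validation.
setClass("NumHolder", representation(x = "numeric"))
setClass("IntHolder", representation(x = "integer"))

# A "numeric" slot takes integer and double input alike.
n1 <- new("NumHolder", x = 1:3)
n2 <- new("NumHolder", x = c(1, 3, 4))

# An "integer" slot takes 1:3 ...
i1 <- new("IntHolder", x = 1:3)

# ... but refuses a double vector; the error is caught here so the
# script keeps running.
i2 <- tryCatch(new("IntHolder", x = c(1, 3, 4)),
               error = function(e) "rejected")
i2
```

Under that assumption, declaring the slot as "numeric" is enough for a
class to accept both 1:3 and c(1,3,4).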
>> >>
>> >> Christophe
>> >>
>> >>
>> >>
>> >>
>> >>> On 28-Jan-08 22:40:02, Peter Dalgaard wrote:
>> >>>
>> >>>> [...]
>> >>>> AFAIR, space is/was more of an issue. If you do something like
>> >>>>
>> >>>> for (i in 1:1e7)
>> >>>>     some.silly.simulation()
>> >>>>
>> >>>> then you have 40 MB sitting there doing nothing, and 80 MB if
>> >>>> it had been floating point.
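
The 40 MB versus 80 MB figures follow from 4 bytes per integer and 8
bytes per double. A minimal sketch to confirm the ratio, using a shorter
length and explicitly allocated vectors (the names n, size_int and
size_dbl are only illustrative):

```r
n <- 1e6

# Allocate zero-filled vectors of each type and measure them.
size_int <- object.size(integer(n))   # about 4 bytes per element
size_dbl <- object.size(numeric(n))   # about 8 bytes per element

# The double vector takes roughly twice the space of the integer one;
# the small fixed header keeps the ratio just under 2.
as.numeric(size_dbl) / as.numeric(size_int)
```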
>> >>>>
>> >>> Hmmm ... there's something to be said for good old
>> >>>
>> >>>  for(i=1;i<=1e7;i++){....}
>> >>>
>> >>> As pointed out in ?"for", when you do
>> >>>
>> >>>  for(i in X){...}  #(e.g. X=(1:1e7))
>> >>>
>> >>> the object X is created (or is already there) in full
>> >>> at the start and sits there, as you say doing nothing,
>> >>> until you end the loop. Whereas the C code just keeps
>> >>> track of i and of the condition.
>> >>>
>> >>> At least on a couple of my machines (64MB and 184MB RAM)
>> >>> knocking out 40MB would inflict severe trauma! Let alone 80MB.
>> >>> Mind you, the little one is no longer allowed to play with
>> >>> big boys like R, though the other one is still used for
>> >>> moderate-sized games.
>> >>>
>> >>> Would there be much of a time penalty in implementing
>> >>> a 'for' loop, C-style, as
>> >>>
>> >>>  i<-1
>> >>>  while(i<=1e7){
>> >>>    ...
>> >>>    i<-i+1
>> >>>  }
>> >>>
>> >>> ??
>> >>>
>> >>> It looks as though there might be:
>> >>>
>> >>>  system.time(for(i in (1:1e7)) x<-cos(3) )
>> >>>  #[1] 13.521  0.132 13.355  0.000  0.000
>> >>>  system.time({i<-1;while(i<=1e7){x<-cos(3);i<-i+1}})
>> >>>  #[1] 38.270  0.076 37.629  0.000  0.000
>> >>>
>> >>> which suggests that the latter is about 3 times as slow.
>> >>> (And no, this wasn't done on either of my puny babes).
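
A minimal sketch for repeating this comparison with a smaller iteration
count so it finishes quickly (n, t_for and t_while are illustrative
names). Timings depend heavily on the machine and the R version, so no
expected numbers are given:

```r
n <- 1e5

# Loop over a pre-built integer sequence.
t_for <- system.time(for (i in 1:n) x <- cos(3))

# Equivalent loop that tracks the counter by hand, C-style.
t_while <- system.time({
  i <- 1
  while (i <= n) {
    x <- cos(3)
    i <- i + 1
  }
})

# Compare elapsed wall-clock time for the two forms.
t_for["elapsed"]
t_while["elapsed"]
```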
>> >>>
>> >>> (And this isn't the first time I've wished for an R
>> >>> implementation of "++" as a CPU-level incrementation,
>> >>> as opposed to the R-arithmetic implementation which
>> >>> treats "adding 1 to a variable" as a full-dress
>> >>> arithmetic parade!)
>> >>>
>> >>> Best wishes,
>> >>> Ted.
>> >>>
>> >>> --------------------------------------------------------------------
>> >>> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
>> >>> Fax-to-email: +44 (0)870 094 0861
>> >>> Date: 28-Jan-08                                       Time: 23:34:52
>> >>> ------------------------------ XFMail ------------------------------
>> >>>
>> >>>
>> >>
>> >> ----------------------------------------------------------------
>> >> This message was sent via IMP, courtesy of Universite Paris 10 Nanterre
>> >>
>> >>
>> >> ______________________________________________
>> >> R-help at r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >>
>> >>
>> >
>> >
>>
>>
>





