[R] The L Word
William Dunlap
wdunlap at tibco.com
Thu Feb 24 19:20:54 CET 2011
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Martin Maechler
> Sent: Thursday, February 24, 2011 7:45 AM
> To: Claudia Beleites
> Cc: r-help at r-project.org
> Subject: Re: [R] The L Word
>
> >>>>> "CB" == Claudia Beleites <cbeleites at units.it>
> >>>>> on Thu, 24 Feb 2011 12:31:55 +0100 writes:
>
> CB> On 02/24/2011 11:20 AM, Prof Brian Ripley wrote:
> >> On Thu, 24 Feb 2011, Tal Galili wrote:
> >>
> >>> Thank you all for the answers.
> >>>
> >>> So if I may extend on the question -
> >>> When is it important to use 'Literal integer'?
> >>> Under what situations could not using it cause problems?
> >>> Is it a matter of efficiency or precision or both?
> >>
> >> Efficiency: it avoids unnecessary type conversions. For example
> >>
> >> length(x) > 1
> >>
> >> has to coerce the lhs to double. We have converted the base
> >> code to use integer constants because such small efficiency
> >> gains can add up.
> >>
> >> Integer vectors can be stored more compactly than doubles, but
> >> that is not going to help for length 1:
> >>
> >>> object.size(1)
> >> 48 bytes
> >>> object.size(1L)
> >> 48 bytes
> >> (32-bit system).
> CB> see:
>
> CB> n <- 0L : 100L
>
> CB> szi <- sapply (n, function (n) object.size (integer (n)))
> CB> szd <- sapply (n, function (n) object.size (double (n)))
> CB> plot (n, szd)
> CB> points (n, szi, col = "red")
>
> yes.
>
> Note however that I've never seen evidence of a *practical*
> difference in simple cases, nor in such cases as part of a
> larger computation.
> But I'm happy to see one if anyone has an interesting example.
I don't know how interesting this example is, but I use <digits>L
when combining a scalar with what I know is an integer vector so
that I don't unnecessarily change its type. Also, if I have a function
that returns an integer vector in general but a special value
like NA or -1 in unusual cases, I use NA_integer_ or -1L for those
special cases so the function returns the same type of data in
all cases. These things can matter when writing a
faster/better version of a built-in function, where I
want the new output to match the original exactly.
E.g., here is a function that does exactly what sequence() does
but is about 10 times faster for long input vectors (say
seq_len(1e6)%%4L):
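Sequence.L <- function (nvec)
{
    # Global index 1..sum(nvec) minus the start offset of each run;
    # 0L keeps the offsets (and hence the result) integer.
    seq_len(sum(nvec)) - rep(cumsum(c(0L, nvec[-length(nvec)])), nvec)
}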
If I change the 0L to 0.0 (or 0) then its result is no longer identical
to sequence's result.
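A quick way to check this, as a sketch only: Sequence.D is a hypothetical name for the same function with 0 in place of 0L, and the short nvec is just an illustrative input, not the long vector mentioned above.

```r
# Integer-offset version (as above) and a hypothetical double-offset variant.
Sequence.L <- function(nvec) {
    seq_len(sum(nvec)) - rep(cumsum(c(0L, nvec[-length(nvec)])), nvec)
}
Sequence.D <- function(nvec) {
    seq_len(sum(nvec)) - rep(cumsum(c(0, nvec[-length(nvec)])), nvec)
}

nvec <- seq_len(20L) %% 4L   # small integer input: 1 2 3 0 1 2 3 0 ...

# identical() compares storage type as well as values.
same.L <- identical(Sequence.L(nvec), sequence(nvec))
same.D <- identical(Sequence.D(nvec), sequence(nvec))

# all.equal() compares values only, within a tolerance.
equal.D <- isTRUE(all.equal(Sequence.D(nvec), sequence(nvec)))

type.L <- typeof(Sequence.L(nvec))
type.D <- typeof(Sequence.D(nvec))

print(c(same.L = same.L, same.D = same.D, equal.D = equal.D))
print(c(type.L = type.L, type.D = type.D))
```

The values agree in both variants; only the storage type differs, which is what identical() picks up and all.equal() ignores.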
(If I were designing a new data analysis language from scratch, I'd
be tempted to omit the integer type and make all numbers 64-bit doubles,
logicals 1-byte or 2-bit things, and maybe throw in some integral
types for image processing but not for general use. 32-bit integers
are pretty limiting.)
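To make the 32-bit limit concrete, here is a small sketch; the variable names are purely illustrative.

```r
big <- .Machine$integer.max              # largest representable integer value

# Integer arithmetic past the limit gives NA (normally with a warning).
over.int <- suppressWarnings(big + 1L)

# Mixing in a double constant promotes the result to double instead.
over.dbl <- big + 1

print(big)
print(over.int)
print(typeof(over.dbl))
```

This is one case where writing 1L rather than 1 changes the result and not just the storage type.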
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
>
> E.g., I would typically never use 0L:100L instead of 0:100
> in an R script because I think code readability (and self
> explainability) is of considerable importance too.
>
> Martin