[Rd] sequence(c(2, 0, 3)) produces surprising results, would like output length to be sum(input) (PR#9811)
Bill Dunlap
bill at insightful.com
Thu Jul 26 20:37:53 CEST 2007
On Thu, 26 Jul 2007 bill at insightful.com wrote:
> Full_Name: Bill Dunlap
> Version: 2.5.0
> OS: Linux
> Submission from: (NULL) (70.98.76.47)
>
> sequence(nvec) is documented to return
> the concatenation of seq(nvec[i]), for
> i in seq(along=nvec). This produces inconvenient
> (for me) results for 0 inputs.
> > sequence(c(2,0,3)) # would like 1 2 1 2 3, ignore 0
> [1] 1 2 1 0 1 2 3
> Would changing sequence(nvec) to use seq_len(nvec[i])
> instead of the current 1:nvec[i] break much existing code?
>
> On the other hand, almost no one seems to use sequence()
> and it might make more sense to allow seq_len() and seq()
> to accept a vector for length.out and they would return a
> vector of length sum(length.out),
> c(seq_len(length.out[1]), seq_len(length.out[2]), ...)
seq_len() could be changed to do that with the following
code change. It does slow down seq_len() in the scalar case:
                                 old time   new time
  for(i in 1:1e6)seq_len(2)         1.251      1.516
  for(i in 1:1e6)seq_len(20)        1.690      1.990
  for(i in 1:1e6)seq_len(200)       5.480      5.860
It becomes much faster than sequence() in the vectorized case:
  > unix.time(for(i in 1:1e4)sequence(20:1))
     user  system elapsed
    1.550   0.000   1.557
  > unix.time(for(i in 1:1e4)seq_len(20:1))
     user  system elapsed
    0.070   0.000   0.066
  > identical(sequence(20:1), seq_len(20:1))
  [1] TRUE
My problem cases are where the length.out vector is long
and contains small integers (e.g., the output of table
on a vector of mostly unique values).
Index: src/main/seq.c
===================================================================
--- src/main/seq.c (revision 42329)
+++ src/main/seq.c (working copy)
@@ -594,16 +594,31 @@
SEXP attribute_hidden do_seq_len(SEXP call, SEXP op, SEXP args, SEXP rho)
{
- SEXP ans;
- int i, len, *p;
+ SEXP ans, slengths;
+ int i, *p, anslen, *lens, nlens, ilen, nprotected=0 ;
checkArity(op, args);
- len = asInteger(CAR(args));
- if(len == NA_INTEGER || len < 0)
- errorcall(call, _("argument must be non-negative"));
- ans = allocVector(INTSXP, len);
+ slengths = CAR(args);
+ if (TYPEOF(slengths) != INTSXP) {
+ PROTECT(slengths = coerceVector(CAR(args), INTSXP));
+ nprotected++;
+ }
+ lens = INTEGER(slengths);
+ nlens = LENGTH(slengths);
+ anslen = 0 ;
+ for(ilen=0;ilen<nlens;ilen++) {
+ int len = lens[ilen] ;
+ if(len == NA_INTEGER || len < 0)
+ errorcall(call, _("argument must be non-negative"));
+ anslen += len ;
+ }
+ ans = allocVector(INTSXP, anslen);
p = INTEGER(ans);
- for(i = 0; i < len; i++) p[i] = i+1;
-
+ for(ilen=0;ilen<nlens;ilen++) {
+ int len = lens[ilen] ;
+ for(i = 0; i < len; i++) *p++ = i+1;
+ }
+ if(nprotected>0)
+ UNPROTECT(nprotected);
return ans;
}
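
For readers who want to try the two-pass idea outside of R, here is a
minimal standalone sketch of the same approach: the first pass validates
the lengths and sums them, the second pass fills in 1..len for each
entry. The function name seq_len_concat and its malloc-based interface
are hypothetical and not part of R's C API. One deliberate difference
from the patch: the patch accumulates anslen in an int, which could in
principle overflow for large inputs, so this sketch accumulates in
size_t and checks for overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of R): returns a newly malloc'd array
 * holding 1..lens[0], 1..lens[1], ..., and stores its length in *out_len.
 * Returns NULL if any length is negative, if the total size would
 * overflow, or if allocation fails. The caller frees the result. */
int *seq_len_concat(const int *lens, size_t nlens, size_t *out_len)
{
    size_t total = 0;
    size_t k;
    int *ans, *p;

    assert(out_len != NULL);

    /* Pass 1: validate each length and accumulate the output size. */
    for (k = 0; k < nlens; k++) {
        if (lens[k] < 0)
            return NULL;                      /* "argument must be non-negative" */
        if ((size_t)lens[k] > SIZE_MAX - total)
            return NULL;                      /* sum would overflow size_t */
        total += (size_t)lens[k];
    }
    if (total > SIZE_MAX / sizeof *ans)
        return NULL;                          /* byte count would overflow */

    /* Allocate at least one element so a zero total still yields non-NULL. */
    ans = malloc((total ? total : 1) * sizeof *ans);
    if (ans == NULL)
        return NULL;

    /* Pass 2: write 1..len for each entry; zero-length entries add nothing. */
    p = ans;
    for (k = 0; k < nlens; k++) {
        int i;
        for (i = 0; i < lens[k]; i++)
            *p++ = i + 1;
    }

    *out_len = total;
    return ans;
}
```

Called with the lengths {2, 0, 3} from the original report, this fills
five elements, 1 2 1 2 3, so the zero entry is skipped rather than
contributing "1 0" as 1:0 does.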