[Rd] base::order making available retGrp and sortStr options for radix method?

Sebastian Martin Krantz
Fri May 8 23:01:39 CEST 2020


Hi all,

A little over a month ago I released the 'collapse' package for
advanced and fast data transformation in R. It provides an array of fast
grouped and weighted functions and facilities for efficient grouped
programming in R.

While preparing the next update of this package I came across the
following: for grouping, 'collapse' uses the function 'GRP', an efficient
wrapper around data.table:::forderv for fast radix-sort-based grouping. To
do this, the source code for forderv was copied and deparallelized. I have
now realized that an earlier deparallelized version of forderv is already
fully available in base R:
https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/main/radixsort.c

This function is called in base::order(..., method = "radix"). I was vaguely
aware that data.table's ordering had made it into base R, but I first thought
the grouping feature of forder had been removed. In fact, it is there but
disabled. Lines 31-35 of base::order read:

  if (method == "radix") {
    decreasing <- rep_len(as.logical(decreasing), length(z))
    return(.Internal(radixsort(na.last, decreasing, FALSE,
      TRUE, ...)))
  }

which is essentially:

  return(.Internal(radixsort(na.last, decreasing, retGrp,
    sortStr, ...)))

with the retGrp argument, which returns the group starts and the maximum
group size, disabled. sortStr = FALSE can be used to do unordered groupings.
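
To illustrate what "group starts" and "maximum group size" mean here, below is a minimal sketch that derives the same kind of information using only the documented order(..., method = "radix") interface. The function name group_info and the list fields starts and maxgrpn are hypothetical names chosen for this example; they are not claimed to match what the internal code returns. The sketch assumes an atomic vector without NAs, and it does a second pass over the data, which the internal code would not need.

```r
# Hypothetical helper: recover group starts and the maximum group size
# from a radix ordering. Assumes x is an atomic vector with no NAs.
group_info <- function(x) {
  o  <- order(x, method = "radix")  # stable radix ordering from base R
  xs <- x[o]                        # values in sorted order
  n  <- length(xs)
  if (n == 0L) {
    return(list(order = o, starts = integer(0), maxgrpn = 0L))
  }
  # A new group begins wherever the sorted value differs from its predecessor.
  is_start <- c(TRUE, xs[-1L] != xs[-n])
  starts   <- which(is_start)
  sizes    <- diff(c(starts, n + 1L))
  list(order = o, starts = starts, maxgrpn = max(sizes))
}

x <- c("b", "a", "b", "c", "a", "b")
g <- group_info(x)

# Unordered grouping (groups numbered by first appearance), for comparison.
# This uses match/unique and is not the radix code path.
id_unordered <- match(x, unique(x))
```

Here g$starts gives the positions in the sorted data at which each group begins, and g$maxgrpn is the size of the largest group.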

My request is whether these features could be made available to the user.
This would give all developers extremely fast ordered grouping facilities
and spare people like me from copying this source code. In R, it could be
exposed through a simple function like:

radixorder <- function(..., na.last = TRUE, decreasing = FALSE,
                       retGrp = FALSE, sortStr = TRUE) {
  z <- list(...)
  # one 'decreasing' flag per sort key, as in base::order
  decreasing <- rep_len(as.logical(decreasing), length(z))
  return(.Internal(radixsort(na.last, decreasing, retGrp,
                             sortStr, ...)))
}

Alternatively, a C API function analogous to R_orderVector, e.g.
R_orderVectorRadix, would be great.

Best regards,

Sebastian


