[R] Compressing a sequence
Bert Gunter
bgunter@4567 @end|ng |rom gm@||@com
Sat Feb 22 23:23:22 CET 2025
Hi Ben:
I realize that for the OP whether it takes 1/2 second or 1 microsecond to
do what he wants may be irrelevant, but just for fun I thought I'd time the
condense() function you found vs. the compr() function I worked out, which
are similar in their approach.
compr <- function(x, sep = "-")
{
    left <- c(TRUE, diff(x) != 1)
    right <- c(rev(diff(rev(x)) != -1), TRUE)
    xleft <- x[left]
    xright <- x[right]
    ifelse(xleft == xright,
           xleft,
           paste(xleft, xright, sep = sep))
}
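To make the two logical masks concrete, here is a small walk-through on the example vector z from earlier in the thread. It is only an illustrative sketch: compr() is repeated so the block runs on its own, and it assumes (as compr() itself does) that the input is already sorted and free of duplicates.

```r
compr <- function(x, sep = "-") {
  left   <- c(TRUE, diff(x) != 1)
  right  <- c(rev(diff(rev(x)) != -1), TRUE)
  xleft  <- x[left]
  xright <- x[right]
  ifelse(xleft == xright, xleft, paste(xleft, xright, sep = sep))
}

z <- c(1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20)

# TRUE where a run starts: the first element, or any element preceded by a gap
left <- c(TRUE, diff(z) != 1)
# TRUE where a run ends: the last element, or any element followed by a gap
right <- c(rev(diff(rev(z)) != -1), TRUE)

z[left]    # run starts: 1 3 7 12 20
z[right]   # run ends:   1 5 8 15 20

# A run whose start equals its end is a singleton; otherwise paste "start-end"
res <- compr(z)
paste(res, collapse = ", ")   # "1, 3-5, 7-8, 12-15, 20"
```

Note that ifelse() mixes numeric and character pieces here, so the result is a character vector whenever at least one run is longer than one element.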
Results were:

set.seed(4456)
x <- sort(sample(seq_len(1000), 800))
all(compr(x) == condense(x)) ## TRUE

library(microbenchmark)
microbenchmark(compr(x), condense(x), times = 500)

Unit: microseconds
        expr      min       lq       mean    median        uq      max neval
    compr(x)   65.067   67.773   70.45071   69.0235   71.1145  113.857   500
 condense(x) 1022.089 1031.334 1116.02812 1036.2135 1049.8050 5249.271   500
As usual, YMMV, but the difference is, of course, down to vectorized
computation in compr() versus unvectorized computation in condense(),
where tapply() calls an R function once per run.
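For anyone curious where the time goes, here is a rough sketch of what condense() does under the hood. It builds a run id for every element and then tapply() invokes the anonymous function once per run, whereas compr() only does a fixed number of whole-vector operations. The names grp, calls and out are mine, added purely for illustration.

```r
set.seed(4456)
x <- sort(sample(seq_len(1000), 800))

# Run id for each element: increments whenever the gap to the previous value is not 1
grp <- c(0, cumsum(diff(x) != 1))

# Count how many times tapply() calls the per-run function
calls <- 0
out <- unname(tapply(x, grp, FUN = function(y) {
  calls <<- calls + 1
  paste(unique(range(y)), collapse = "-")
}))

# One R-level function call per run ...
calls == length(unique(grp))
# ... and the number of runs matches the number of TRUEs in compr()'s 'left' mask
length(out) == sum(c(TRUE, diff(x) != 1))
```

So the cost of condense() grows with the number of runs, each of which pays for a range(), a unique() and a paste() call on a tiny vector.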
Cheers,
Bert
"An educated person is one who can entertain new ideas, entertain others,
and entertain herself."
On Fri, Feb 21, 2025 at 6:27 PM Ben Bolker <bbolker using gmail.com> wrote:
> And some more from 2013:
>
> https://stackoverflow.com/questions/14868406/collapse-continuous-integer-runs-to-strings-of-ranges
>
> Can be as short as:
>
> condense <- function(x)
>    unname(tapply(x, c(0, cumsum(diff(x) != 1)), FUN = function(y)
>      paste(unique(range(y)), collapse = "-")
>    ))
>
> z <- c(1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20)
>
> condense(z) |> paste(collapse = ", ")
>
> "1, 3-5, 7-8, 12-15, 20"
>
> On 2025-02-21 9:16 p.m., Ben Bolker wrote:
> > There are some answers from 2016 here:
> >
> > https://stackoverflow.com/questions/34636461/collapse-consecutive-runs-of-numbers-to-a-string-of-ranges
> >
> > On 2025-02-21 7:59 p.m., Steven Ellis wrote:
> >> Hi Dennis,
> >>
> >> A quick Claude request:
> >>
> >> "using r I have a sequence like: 1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20
> >> I would like to display it as: 1, 3-5, 7-8, 12-15, 20"
> >>
> >> yielded:
> >>
> >> condense_sequence <- function(nums) {
> >>   if (length(nums) == 0) return("")
> >>   if (length(nums) == 1) return(as.character(nums))
> >>
> >>   # Sort the numbers just in case they're not in order
> >>   nums <- sort(unique(nums))
> >>
> >>   # Initialize variables
> >>   ranges <- vector("character")
> >>   start <- nums[1]
> >>   prev <- nums[1]
> >>
> >>   for (i in 2:length(nums)) {
> >>     if (nums[i] != prev + 1) {
> >>       # End of a sequence
> >>       if (start == prev) {
> >>         ranges <- c(ranges, as.character(start))
> >>       } else {
> >>         ranges <- c(ranges, paste(start, prev, sep="-"))
> >>       }
> >>       start <- nums[i]
> >>     }
> >>     prev <- nums[i]
> >>   }
> >>
> >>   # Handle the last number or range
> >>   if (start == prev) {
> >>     ranges <- c(ranges, as.character(start))
> >>   } else {
> >>     ranges <- c(ranges, paste(start, prev, sep="-"))
> >>   }
> >>
> >>   # Join all ranges with commas
> >>   paste(ranges, collapse=", ")
> >> }
> >>
> >> # Your sequence
> >> nums <- c(1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20)
> >>
> >> # Apply the function
> >> result <- condense_sequence(nums)
> >> print(result)
> >> # Output: "1, 3-5, 7-8, 12-15, 20"
> >>
> >> Which appears to work well, though you may have other thoughts in mind /
> >> edge cases this code does not cover.
> >>
> >> Best,
> >> Steven
> >>
> >> On Fri, Feb 21, 2025 at 7:47 PM Dennis Fisher <fisher using plessthan.com>
> >> wrote:
> >>
> >>> R 4.4.0
> >>> OS X
> >>>
> >>> Colleagues
> >>>
> >>> I have a sequence like:
> >>> 1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20
> >>>
> >>> I would like to display it as:
> >>> 1, 3-5, 7-8, 12-15, 20
> >>>
> >>> Any simple ways to accomplish this?
> >>>
> >>> Dennis
> >>>
> >>>
> >>> Dennis Fisher MD
> >>> P < (The "P Less Than" Company)
> >>> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
> >>> www.PLessThan.com
> >>>
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> https://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>
> >
>
> --
> Dr. Benjamin Bolker
> Professor, Mathematics & Statistics and Biology, McMaster University
> Director, School of Computational Science and Engineering
> > E-mail is sent at my convenience; I don't expect replies outside of
> working hours.
>