[R] Compressing a sequence

Dennis Fisher
Sat Feb 22 19:49:07 CET 2025


Colleagues

I appreciate the many suggestions for my query about collapsing a sequence to a shortened string.  

I finally arrived at:

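format_ranges <- function(x)
        {
        # Sort and de-duplicate so that consecutive values sit next to each other
        x               <- sort(unique(x))
        # Guard: without this, empty input would produce "NA-"
        if (length(x) == 0) return("")
        # A run ends wherever the gap to the next value exceeds 1
        breaks          <- which(diff(x) > 1)
        start_idx       <- c(1, breaks + 1)
        end_idx         <- c(breaks, length(x))
        # One label per run: a single value, or "first-last"
        ranges          <- mapply(function(s, e)
                {
                if (s == e)     return(as.character(x[s]))
                else            return(paste(x[s], x[e], sep="-"))
                }, start_idx, end_idx)
        paste(ranges, collapse=", ")
        }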

numbers <- c(1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20)
cat("original sequence is:\t\t", numbers, "\n")
cat("collapsed text string is:\t", format_ranges(numbers), "\n")
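
The same run-finding step can also be written by labelling each run with cumsum() and splitting on that label. What follows is only a sketch of that variant, not something posted in the thread; the name format_ranges_split and its sep/collapse arguments are made up here for illustration.

```r
# Hypothetical variant (not from the thread): group consecutive runs with
# cumsum() and split() instead of computing start/end indices.
format_ranges_split <- function(x, sep = "-", collapse = ", ") {
  x <- sort(unique(x))
  if (length(x) == 0L) return("")
  # A new run starts at the first element and after every gap larger than 1
  run_id <- cumsum(c(TRUE, diff(x) > 1))
  runs   <- split(x, run_id)
  # Label each run as a single value or as "first-last"
  labels <- vapply(runs, function(r) {
    if (length(r) == 1L) as.character(r)
    else paste(r[1L], r[length(r)], sep = sep)
  }, character(1))
  paste(labels, collapse = collapse)
}

format_ranges_split(c(1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20))
```

Because the input is sorted and de-duplicated first, the order of the input and any repeated values should not affect the result.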

Dennis

Dennis Fisher MD
P < (The "P Less Than" Company)
Phone / Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

> On Feb 22, 2025, at 8:14 AM, Bert Gunter <bgunter.4567 using gmail.com> wrote:
> 
> Well, as I predicted, my initial suggestions were, ... ummm, rather dumb. Also, Rui's suggestions are probably preferable to the below. However, it *is* a very simple, bare-bones approach to converting a sequence of increasing integers to a character representation using interval notation. The code is *very* simple-minded and needs no explanation, I think. You should also be able to easily tweak it to change the output format from my paste0(..., collapse) choice to something else.
> 
> compr <- function(x, sep = "-", collapse = ", ")
> {
>    left   <- c(TRUE, diff(x) != 1)
>    right  <- c(rev(diff(rev(x)) != -1), TRUE)
>    xleft  <- x[left]
>    xright <- x[right]
>    ifelse(xleft == xright,
>           xleft,
>           paste(xleft, xright, sep = sep)) |>
>        paste0(collapse = collapse)
> }
> 
> which yields the following with your example
> 
> > compr(c( 1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20))
> [1] "1, 3-5, 7-8, 12-15, 20"   ##Note: A single character string
> 
> (Do let me know if this can fail)
>  Cheers,
> Bert
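
On whether compr() can fail: it is described as taking a sequence of increasing integers, and as far as I can tell it gives different results when that does not hold (repeated or unsorted values). A minimal sketch of the check, with compr() copied from the quoted message; the wrapper name compr_sorted is hypothetical and simply sorts and de-duplicates first.

```r
# compr() as posted in the quoted message (needs R >= 4.1 for the |> pipe)
compr <- function(x, sep = "-", collapse = ", ")
{
   left   <- c(TRUE, diff(x) != 1)
   right  <- c(rev(diff(rev(x)) != -1), TRUE)
   xleft  <- x[left]
   xright <- x[right]
   ifelse(xleft == xright,
          xleft,
          paste(xleft, xright, sep = sep)) |>
       paste0(collapse = collapse)
}

# Hypothetical wrapper (not from the thread): enforce the
# "increasing, no repeats" assumption before calling compr()
compr_sorted <- function(x, ...) compr(sort(unique(x)), ...)

compr(c(1, 1, 2))          # repeated value is reported on its own: "1, 1-2"
compr_sorted(c(1, 1, 2))   # "1-2"
compr(c(5, 3, 4))          # input order is kept: "5, 3-4"
compr_sorted(c(5, 3, 4))   # "3-5"
```

Whether the unsorted behaviour counts as a failure depends on what is wanted; the wrapper only matters if the input is not already strictly increasing.
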
> 
> "An educated person is one who can entertain new ideas, entertain others, and entertain herself."
> 
> 
> 
> On Sat, Feb 22, 2025 at 4:36 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> At 00:46 on 22/02/2025, Dennis Fisher wrote:
> > R 4.4.0
> > OS X
> > 
> > Colleagues
> > 
> > I have a sequence like:
> >       1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20
> > 
> > I would like to display it as:
> >       1, 3-5, 7-8, 12-15, 20
> > 
> > Any simple ways to accomplish this?
> > 
> > Dennis
> > 
> > 
> > 
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> Hello,
> 
> Here is a way with package R.utils, function seqToIntervals.
> 
> 
> x <- scan(text = "1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20", sep = ",")
> 
> mat <- R.utils::seqToIntervals(x)
> apply(mat, 1L, \(m) {
>    ifelse(m[1L] == m[2L], m[1L], paste(m, collapse = "-"))
> })
> #> [1] "1"     "3-5"   "7-8"   "12-15" "20"
> 
> 
> If you want to be fancy, define a special class that prints like that.
> 
> 
> 
> x <- scan(text = "1, 3, 4, 5, 7, 8, 12, 13, 14, 15, 20", sep = ",")
> 
> as_seqInterval <- function(x) {
>    old_class <- class(x)
>    class(x) <- c("seqInterval", old_class)
>    x
> }
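> print.seqInterval <- function(x, ...) {
>    # Collapse runs of consecutive values into a two-column from/to matrix
>    mat <- R.utils::seqToIntervals(x)
>    out <- apply(mat, 1L, \(m) {
>      ifelse(m[1L] == m[2L], m[1L], paste(m, collapse = "-"))
>    })
>    print(out)
>    # Print methods conventionally return their argument invisibly
>    invisible(x)
> }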
> 
> y <- as_seqInterval(x)
> class(y)
> #> [1] "seqInterval" "numeric"
> 
> # autoprinting y
> y
> #> [1] "1"     "3-5"   "7-8"   "12-15" "20"
> 
> # explicit printing y
> print(y)
> #> [1] "1"     "3-5"   "7-8"   "12-15" "20"
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 