[R] Is there a sexy way ...?

avi.e.gross using gmail.com
Sat Sep 28 20:58:25 CEST 2024


John,

I thought some more about the topic overnight. Of course, "sexy" is not a
great analogy.

But consider the idea of doing something cleverly or creatively, in ways
others might not easily come up with because the standard way(s) are so
commonly used.

I threw something together in the middle of the night that violated some
rules of sorts, but maybe not. I mean the standard way of dealing with what
looks like integers, or at least is of type numeric, is to leave them as integers. My
earlier attempt did that as it converted a list of vectors into something
else like a data.frame or matrix.

My later attempt jumped out of the box. This time I focused on the concept
of the python zip function that takes multiple iterables and weaves them
into an iterable of n-tuples. 

My first thought was that, among many other things, paste/paste0 does that.
There is also the map family of functions in the purrr package, including
one called pmap that takes multiple vectors and applies a function to them.
An example calculates the rowwise sum, of sorts, not that we need it this way:

> pmap(.l = list(1:4, 2:5, 3:6), .f=sum)
[[1]]
[1] 6

[[2]]
[1] 9

[[3]]
[1] 12

[[4]]
[1] 15

If what you want is to refer to the arguments directly, as someone pointed
out, you can use an anonymous function like so to produce a 3-tuple:

> pmap(.l = list(1:4, 2:5, 3:6), .f=\(a, b, c) c(a, b, c))
[[1]]
[1] 1 2 3

[[2]]
[1] 2 3 4

[[3]]
[1] 3 4 5

[[4]]
[1] 4 5 6


Or use a formula with positional notation that gets the same output:

pmap(.l = list(1:4, 2:5, 3:6), ~ c(..1, ..2, ..3))

I have mostly used pmap directly on dataframes but it works fine on any list
of vectors. So, since the x we have been using is a list of vectors, this
works:

x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5,  14, 8, 11),
          `3` = c(6, 9, 15, 12, 3))

pmap(.l = x, ~ c(..1, ..2, ..3))

The result though is not flat, so unlist it:

> pmap(.l = x, ~ c(..1, ..2, ..3)) |> unlist()
 [1]  7  2  6 13  5  9  1 14 15  4  8 12 10 11  3

I would consider this a relatively direct and simple answer, except it does
not deal with the factor issue that was later explained. A problem is that
the above has three items hard-coded and needs changes to work with an
arbitrary number of columns of data. It can be done, just less elegantly. 
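
As a rough sketch of the arbitrary-number-of-columns case, here is one way
to avoid hard-coding ..1, ..2, ..3. It swaps in base R's Map() as a stand-in
for pmap, and zip_flatten is a made-up helper name, not an existing
function. With purrr, something like pmap(.l = x, .f = \(...) c(...))
should play the same role.

```r
x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5, 14, 8, 11),
          `3` = c(6, 9, 15, 12, 3))

# zip_flatten() is a hypothetical helper name used only for illustration.
zip_flatten <- function(lst) {
  # Map() calls the function once per position, handing it the i-th element
  # of every vector; `...` accepts however many vectors the list holds.
  tuples <- do.call(Map, c(f = function(...) c(...), unname(lst)))
  # Flatten the list of tuples into one plain vector.
  unlist(tuples, use.names = FALSE)
}

v <- zip_flatten(x)
v
```

Adding a fourth or fifth vector to x needs no change to the helper.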

But I was not thinking about pmap at night, I was thinking about paste and
unfortunately, paste is really more about TEXT. Everything is converted to
text so the operations I would need to do would be manipulating text.

I later realized my earlier work was too elaborate. Paste allows you to both
make comma separated parts and then combine them into one big string with a
separator. In this case, comma.

So here is the new and hopefully shorter and more efficient version instead
of two paste statements in a row:

> paste(x$`1`, x$`2`, x$`3`, sep=",", collapse=",")
[1] "7,2,6,13,5,9,1,14,15,4,8,12,10,11,3"

Of course, to accommodate any number of vectors to combine, and without
needing to know their names or specify positions, the do.call concept that
expands the contents into individual arguments, is great:

do.call(paste, c(x, sep=",", collapse=","))

All that remains is to take a string with lots of commas and resurrect the
numbers from it:

do.call(paste, c(x, sep=",", collapse=",")) |>
  strsplit(",") |>
  unlist() |>
  as.integer()

Or without the new R pipe:

as.integer(unlist(strsplit(do.call(paste, 
                                   c(x, 
                                     sep=",", 
                                     collapse=",")),
                           ",")))

The above looks better in a constant width font, LOL!

I will say it is annoying that a version of strsplit is not easily available
that does something more trivial. Given a single string and a fixed
separator such as a space or comma, meaning no regular expression, break it
up into a vector containing the parts. No need to unlist. Just take the
result and make it integer or whatever you need. It looks like a fairly
trivial function to make.
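
A minimal sketch of such a function, assuming a single input string and a
literal (non-regex) separator. The name split_fixed is made up for
illustration; it simply wraps strsplit with fixed = TRUE and picks out the
one list element:

```r
# split_fixed() is a hypothetical name, not a function that ships with R.
split_fixed <- function(s, sep = ",") {
  # Insist on exactly one string so the result is always a plain vector.
  stopifnot(is.character(s), length(s) == 1L)
  # fixed = TRUE treats sep literally, so no regular expression is involved;
  # [[1]] removes the list wrapper, so no unlist() is needed.
  strsplit(s, sep, fixed = TRUE)[[1]]
}

parts <- split_fixed("7,2,6,13,5,9")
nums  <- as.integer(parts)
nums
```

Because the separator is literal, a call like split_fixed("a.b.c", ".")
splits on the period instead of treating it as a regex wildcard.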

I will add one last idea and perhaps let this thread wane. Some things are a
bit like religion to people or a matter of taste. There are arguments
raging in another forum on why a language does not have some form of loop
like DO ... UNTIL because some people HATE using break statements even when
they make perfect sense. Others are happier with a while(True) construct
that makes clear the contents will decide when to break out. To expect
people to agree on what is "sexy" is not a reasonable expectation. LOL!


-----Original Message-----
From: Sorkin, John <jsorkin using som.umaryland.edu> 
Sent: Saturday, September 28, 2024 12:01 AM
To: avi.e.gross using gmail.com; 'Rolf Turner' <rolfturner using posteo.net>;
r-help using r-project.org
Subject: Re: [R] Is there a sexy way ...?

"Sexy code" may get a job done and demonstrate the coder's knowledge of a
programming language, but it often does this at the expense of clear, easy
to document (i.e. annotate what the code does), easy to read, and easy to
understand code. I fear that this is what this thread has developed: "sexy"
but not easily understandable code. While I send kudos to all of you,
remember that sometimes simpler, while not as sexy can be better in the long
run. ;)

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical
Center Geriatrics Research, Education, and Clinical Center;
PI Biostatistics and Informatics Core, University of Maryland School of
Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Paliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382




________________________________________
From: R-help <r-help-bounces using r-project.org> on behalf of
avi.e.gross using gmail.com <avi.e.gross using gmail.com>
Sent: Friday, September 27, 2024 10:48 PM
To: 'Rolf Turner'; r-help using r-project.org
Subject: Re: [R] Is there a sexy way ...?

Rolf,

We need to be clear on what makes an answer sexy! LOL!

I decided it was sexy to do it in a way that nobody (normal) would and had
not suggested yet.

Here is an original version I will explain in a minute. Or, maybe best a bit
before. Here is the unformatted result, which is a tad hard to read but will
be made readable soon:

x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5,  14, 8, 11),
          `3` = c(6, 9, 15, 12, 3))

as.integer(unlist(strsplit(as.vector(paste(paste(x$`1`, x$`2`, x$`3`,
sep=","), collapse=",")), split=",")))

The result is: 7  2  6 13  5  9  1 14 15  4  8 12 10 11  3

After reading what others wrote, the following is a more general one where
any number of vectors in a list can be handled:

as.integer(unlist(strsplit(as.vector(paste(do.call(paste, c(x, sep=",")),
collapse=",")), split=",")))

Perhaps a tad more readable is a version using the new pipe but for obvious
reasons, the dplyr/magrittr pipe works better for me than having to create
silly anonymous functions instead of using a period. You now have a
pipeline:

library(dplyr)

x %>%
  c(sep=",") %>%
  do.call(paste, .) %>%
  paste(collapse=",") %>%
  as.vector() %>%
  strsplit(split=",") %>%
  unlist() %>%
  as.integer()

And it returns the right answer!

- You start with x and pipe it as the first argument to c(); the second
argument, already in place, is an option to later use comma as a separator

- that is piped to a do.call() which takes that c() tuple and replaces the
second argument of period with it. You now have taken the original data and
made five text strings, one per position, like so:
"7,2,6"   "13,5,9"  "1,14,15" "4,8,12"  "10,11,3"

- But you want all those strings collapsed into a single long string with
commas between the parts. Do another paste, this time putting the substrings
together and collapsing with a comma. The result is:
"7,2,6,13,5,9,1,14,15,4,8,12,10,11,3"

- But that is not a vector and don't ask why!

- Now split that string at commas:
"7"  "2"  "6"  "13" "5"  "9"  "1"  "14" "15" "4"  "8"  "12" "10" "11" "3"

- and undo the odd list format it returns to flatten it back into a
character vector:
"7"  "2"  "6"  "13" "5"  "9"  "1"  "14" "15" "4"  "8"  "12" "10" "11" "3"

- Yep it looks the same but is subtly different. Time to make it into
integers or whatever:
7  2  6 13  5  9  1 14 15  4  8 12 10 11  3
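
The steps above can be sketched with named intermediate values in base R, so
each stage can be inspected on its own. The variable names are only
illustrative. Note that paste() already returns a character vector, so the
as.vector() step appears to be unnecessary and is left out here:

```r
x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5, 14, 8, 11),
          `3` = c(6, 9, 15, 12, 3))

# One comma-separated string per position across the vectors.
rows   <- do.call(paste, c(x, sep = ","))
# Collapse those strings into a single long string.
joined <- paste(rows, collapse = ",")
# strsplit() returns a list with one element per input string.
pieces <- strsplit(joined, split = ",")
# Flatten the one-element list into a character vector.
flat   <- unlist(pieces)
# Convert the text back into integers.
v      <- as.integer(flat)
v
```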

Looked at after the fact, it seems so bloody obvious! And the chance of
someone else trying this approach, justifiably, is low, LOL!

One nice feature of the do.call is this can be extended like so:

x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5,  14, 8, 11),
          `3` = c(6, 9, 15, 12, 3),
          `4` = c( 101, 102, 103, 104, 105),
          `5` = c(-105, -104, -103, -102, -101))

Works fine and does this for the now five columns:

 [1]    7    2    6  101 -105   13    5    9  102 -104
[11]    1   14   15  103 -103    4    8   12  104 -102
[21]   10   11    3  105 -101

My apologies to all who expected a more serious post. I have been focusing
on Python lately and over there, some things are done differently albeit I
probably would be using the numpy and pandas packages to do this or even a
simple list comprehension using zip:

# Python, not R.
 [ (first, second, third) for first, second, third in zip(*x)]

[(7, 2, 6), (13, 5, 9), (1, 14, 15), (4, 8, 12), (10, 11, 3)]

And, of course, that needs to be made into a list of individual items

# Python, not R.
[num
 for elem in [(first, second, third) for first, second, third in zip(*x)]
 for num in elem]

[7, 2, 6, 13, 5, 9, 1, 14, 15, 4, 8, 12, 10, 11, 3]

For anyone interested, you can combine Python and R in the same program,
passing the same data back and forth, inside what is still called RStudio.
If there are times when one language allows a better, or at least easier for
you, way to do a transformation, you can often mix and match.

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Rolf Turner
Sent: Thursday, September 26, 2024 11:56 PM
To: r-help using r-project.org
Subject: [R] Is there a sexy way ...?


I have (toy example):

x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5,  14, 8, 11),
          `3` = c(6, 9, 15, 12, 3))
and

f <- factor(rep(1:3,5))

I want to create a vector v of length 15 such that the entries of v,
corresponding to level l of f are the entries of x[[l]].  I.e. I want
v to equal

    c(7, 2, 6, 13, 5, 9, 1, 14, 15, 4, 8, 12, 10, 11, 3)

I can create v "easily enough", using say, a for-loop.  It seems to me,
though, that there should be sexier (single command) way of achieving
the desired result.  However I cannot devise one.
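
For concreteness, here is a sketch of the requirement as stated: first the
for-loop reading of it, then base R's unsplit(), which reverses split() and
looks like one candidate for a single command. It assumes the list elements
are in the same order as the levels of f, as they are in the toy example.

```r
x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5, 14, 8, 11),
          `3` = c(6, 9, 15, 12, 3))
f <- factor(rep(1:3, 5))

# Loop version: for each level l of f, the slots of v where f == l
# receive the entries of x[[l]], in order.
v <- numeric(length(f))
for (l in levels(f)) {
  v[f == l] <- x[[l]]
}

# Single-command candidate: put the pieces back where f says they belong.
v2 <- unsplit(x, f)
v2
```

Unlike the interleaving tricks, both versions follow f itself, so they
should also work when the levels do not simply cycle 1, 2, 3.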

Can anyone point me in the right direction?  Thanks.

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Stats. Dep't. (secretaries) phone:
         +64-9-373-7599 ext. 89622
Home phone: +64-9-480-4619

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
https://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



