[Rd] paste(character(0), collapse="", recycle0=FALSE) should be ""

Hervé Pagès hpages using fredhutch.org
Fri May 22 18:12:12 CEST 2020


I think that

    paste(c("a", "b"), NULL, c("c", "d"), sep = " ", collapse = ",",
          recycle0 = TRUE)

should just return an empty string, and I don't see why it needs to emit a 
warning or raise an error. To me it does exactly what the user is asking 
for, which is to change how the 3 arguments are recycled **before** the 
'sep' operation.
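
As a rough sketch of that reading (a model only, not the actual 
implementation in R), the 'sep' step under recycle0=TRUE can be mimicked 
with a hypothetical helper, here called sep_step(), that returns 
character(0) as soon as any argument has length zero. It does not use 
'recycle0' itself, so it should not depend on how this discussion is 
resolved:

```r
# Hypothetical model of the 'sep' operation with recycle0 = TRUE.
# sep_step() is an illustrative name, not part of base R.
sep_step <- function(..., sep = " ") {
  args <- list(...)
  # Any zero-length argument (including NULL) makes the result zero-length.
  if (any(lengths(args) == 0L)) return(character(0))
  # Otherwise fall back to the ordinary elementwise paste.
  do.call(paste, c(args, list(sep = sep)))
}

with_null    <- sep_step(c("a", "b"), NULL, c("c", "d"))  # zero-length
without_null <- sep_step(c("a", "b"), c("c", "d"))        # "a c" "b d"
```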

The 'recycle0' argument has no business in the 'collapse' operation 
(which comes after the 'sep' operation): this operation still behaves 
as it always has.
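
To illustrate with calls that do not involve 'recycle0' at all (so the 
variable names below are just for illustration): collapsing a character 
vector of any length, including zero, gives exactly one string, and this 
is the step I'm saying should stay untouched:

```r
# The 'collapse' operation on its own is unary: one vector in, one string out.
empty <- paste(character(0), collapse = ",")    # ""
two   <- paste(c("a c", "b d"), collapse = ",") # "a c,b d"

# Both results have length one, whatever the length of the input.
lens <- c(length(empty), length(two))
```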

That's all there is to it.

H.


On 5/22/20 03:00, Gabriel Becker wrote:
> Hi Martin et al,
> 
> 
> 
> On Thu, May 21, 2020 at 9:42 AM Martin Maechler 
> <maechler using stat.math.ethz.ch> wrote:
> 
>      >>>>> Hervé Pagès
>      >>>>>     on Fri, 15 May 2020 13:44:28 -0700 writes:
> 
>          > There is still the situation where **both** 'sep' and 'collapse'
>          > are specified:
> 
>          >> paste(integer(0), "nth", sep="", collapse=",")
>          > [1] "nth"
> 
>          > In that case 'recycle0' should **not** be ignored i.e.
> 
>          > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)
> 
>          > should return the empty string (and not character(0) like it
>          > does at the moment).
> 
>          > In other words, 'recycle0' should only control the first
>          > operation (the operation controlled by 'sep'). Which makes plenty
>          > of sense: the 1st operation is binary (or n-ary) while the
>          > collapse operation is unary. There is no concept of recycling in
>          > the context of unary operations.
> 
>     Interesting, ..., and sounding somewhat convincing.
> 
>          > On 5/15/20 11:25, Gabriel Becker wrote:
>          >> Hi all,
>          >>
>          >> This makes sense to me, but I would think that recycle0 and
>          >> collapse should actually be incompatible and paste should throw
>          >> an error if recycle0 were TRUE and collapse were declared in the
>          >> same call. I don't think the value of recycle0 should be silently
>          >> ignored if it is actively specified.
>          >>
>          >> ~G
> 
>     Just to summarize what I think we should know and agree (or be
>     "disproven") and where this comes from ...
> 
>     1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by
>         default (recycle0 = FALSE) should (and *does* AFAIK) not change
>         anything, hence  paste() / paste0() behave completely
>         back-compatible if recycle0 is kept to FALSE.
> 
>     2) recycle0 = TRUE is meant to give different behavior, notably
>         0-length arguments (among '...') should result in 0-length results.
> 
>         The above does not specify what this means in detail, see 3)
> 
>     3) The current R 4.0.0 implementation (for which I'm primarily
>         responsible) and help(paste)  are in accordance.
>         Notably the help page (Arguments -> 'recycle0' ; Details 1st
>         para ; Examples) says and shows how the 4.0.0 implementation has
>         been meant to work.
> 
>     4) Several provenly smart members of the R community argue that
>         both the implementation and the documentation of 'recycle0 =
>         TRUE'  should be changed to be more logical / coherent / sensical ..
> 
>     Is the above all correct in your view?
> 
>     Assuming yes,  I read basically two proposals, both agreeing
>     that  recycle0 = TRUE  should only ever apply to the action of 'sep'
>     but not the action of 'collapse'.
> 
>     1) Bill and Hervé (I think) propose that 'recycle0' should have
>         no effect whenever  'collapse = <string>'
> 
>     2) Gabe proposes that 'collapse = <string>' and 'recycle0 = TRUE'
>         should be declared incompatible and error. If going in that
>         direction, I could also see them to give a warning (and
>         continue as if recycle0 = FALSE).
> 
> 
> Herve makes a good point about when sep and collapse are both set. That 
> said, if the user explicitly sets recycle0, personally, I don't think it 
> should be silently ignored under any configuration of other arguments.
> 
> If all of the arguments are to go into effect, the question then becomes 
> one of ordering, I think.
> 
> Consider
> 
>     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",", 
>     recycle0=TRUE)
> 
> Currently that returns character(0), because the logic is 
> essentially (in pseudo-code)
> 
>     collapse(paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", 
>     recycle0=TRUE), collapse = ",", recycle0=TRUE)
> 
>       -> collapse(character(0), collapse = ",", recycle0=TRUE)
> 
>     -> character(0)
> 
> Now Bill Dunlap argued, fairly convincingly I think, that paste(..., 
> collapse=<string>) should /always/ return a character vector of length 
> exactly one. With recycle0, though,  it will return "" via the progression
> 
>     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",", 
>     recycle0=TRUE)
> 
>       -> collapse(character(0), collapse = ",")
> 
>     -> ""
> 
> 
> because recycle0 is still applied to the sep-based operation which 
> occurs before collapse, thus leaving a vector of length 0 to collapse.
> 
> That is consistent but seems unlikely to be what the user wanted, imho. 
> I think if it does this there should be at least a warning when paste 
> collapses to "" this way, if it is allowed at all (i.e. if mixing 
> collapse=<string> and recycle0=TRUE is not simply made an error).
> 
> I would like to hear others' thoughts as well though. @Pages, Herve 
> <hpages using fredhutch.org> @William Dunlap <wdunlap using tibco.com>, 
> is "" what you envision as the desired and useful behavior there?
> 
> Best,
> ~G
> 
> 
> 
>     I have not yet made up my mind but would tend to agree with "you guys",
>     but I think that other R Core members should chime in, too.
> 
>     Martin
> 
>          >> On Fri, May 15, 2020 at 11:05 AM Hervé Pagès
>          >> <hpages using fredhutch.org> wrote:
>          >>
>          >> Totally agree with that.
>          >>
>          >> H.
>          >>
>          >> On 5/15/20 10:34, William Dunlap via R-devel wrote:
>          >> > I agree: paste(collapse="something", ...) should always
>          >> > return a single character string, regardless of the value
>          >> > of recycle0. This would be similar to when there are no
>          >> > non-NULL arguments to paste; collapse="." gives a single
>          >> > empty string and collapse=NULL gives a zero-length character
>          >> > vector.
>          >> >> paste()
>          >> > character(0)
>          >> >> paste(collapse=", ")
>          >> > [1] ""
>          >> >
>          >> > Bill Dunlap
>          >> > TIBCO Software
>          >> > wdunlap tibco.com
>          >> >
>          >> >
>          >> > On Thu, Apr 30, 2020 at 9:56 PM suharto_anggono--- via
>          >> > R-devel <r-devel using r-project.org> wrote:
>          >> >
>          >> >> Without 'collapse', 'paste' pastes (concatenates) its
>          >> >> arguments elementwise (separated by 'sep', " " by default).
>          >> >> New in R devel and R patched, specifying recycle0 = TRUE
>          >> >> makes mixing zero-length and nonzero-length arguments result
>          >> >> in length zero. The result of paste(n, "th", sep = "",
>          >> >> recycle0 = TRUE) always has the same length as 'n'.
>          >> >> Previously, the result was still as long as the longest
>          >> >> argument, with the zero-length argument treated like "". If
>          >> >> all of the arguments have length zero, 'recycle0' doesn't
>          >> >> matter.
>          >> >>
>          >> >> As far as I understand, 'paste' with 'collapse' as a
>          >> >> character string is supposed to put together elements of a
>          >> >> vector into a single character string. I think 'recycle0'
>          >> >> shouldn't change it.
>          >> >>
>          >> >> In current R devel and R patched, paste(character(0),
>          >> >> collapse = "", recycle0 = TRUE) is character(0). I think it
>          >> >> should be "", like paste(character(0), collapse="").
>          >> >>
>          >> >> paste(c("4", "5"), "th", sep = "", collapse = ", ",
>          >> >> recycle0 = TRUE)
>          >> >> is
>          >> >> "4th, 5th".
>          >> >> paste(c("4"     ), "th", sep = "", collapse = ", ",
>          >> >> recycle0 = TRUE)
>          >> >> is
>          >> >> "4th".
>          >> >> I think
>          >> >> paste(c(        ), "th", sep = "", collapse = ", ",
>          >> >> recycle0 = TRUE)
>          >> >> should be
>          >> >> "",
>          >> >> not character(0).
>          >> >>
>          >> >> ______________________________________________
>          >> >> R-devel using r-project.org mailing list
>          >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>          >> >>
>          >> >
>          >> >       [[alternative HTML version deleted]]
>          >> >
>          >>
>          >> --
>          >> Hervé Pagès
>          >>
>          >> Program in Computational Biology
>          >> Division of Public Health Sciences
>          >> Fred Hutchinson Cancer Research Center
>          >> 1100 Fairview Ave. N, M1-B514
>          >> P.O. Box 19024
>          >> Seattle, WA 98109-1024
>          >>
>          >> E-mail: hpages using fredhutch.org <mailto:hpages using fredhutch.org>
>     <mailto:hpages using fredhutch.org <mailto:hpages using fredhutch.org>>
>          >> Phone:  (206) 667-5791
>          >> Fax:    (206) 667-1319
>          >>
> 
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages using fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


