[Rd] paste(character(0), collapse="", recycle0=FALSE) should be ""

Gabriel Becker g@bembecker @end|ng |rom gm@||@com
Fri May 22 12:00:21 CEST 2020


Hi Martin et al,



On Thu, May 21, 2020 at 9:42 AM Martin Maechler <maechler using stat.math.ethz.ch>
wrote:

> >>>>> Hervé Pagès
> >>>>>     on Fri, 15 May 2020 13:44:28 -0700 writes:
>
>     > There is still the situation where **both** 'sep' and 'collapse' are
>     > specified:
>
>     >> paste(integer(0), "nth", sep="", collapse=",")
>     > [1] "nth"
>
>     > In that case 'recycle0' should **not** be ignored i.e.
>
>     > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)
>
>     > should return the empty string (and not character(0) like it does at
>     > the moment).
>
>     > In other words, 'recycle0' should only control the first operation
>     > (the operation controlled by 'sep'). Which makes plenty of sense: the
>     > 1st operation is binary (or n-ary) while the collapse operation is
>     > unary. There is no concept of recycling in the context of unary
>     > operations.
>
> Interesting, ..., and sounding somewhat convincing.
>
>     > On 5/15/20 11:25, Gabriel Becker wrote:
>     >> Hi all,
>     >>
>     >> This makes sense to me, but I would think that recycle0 and collapse
>     >> should actually be incompatible and paste should throw an error if
>     >> recycle0 were TRUE and collapse were declared in the same call. I
>     >> don't think the value of recycle0 should be silently ignored if it is
>     >> actively specified.
>     >>
>     >> ~G
>
> Just to summarize what I think we should know and agree on (or be
> "disproven") and where this comes from ...
>
> 1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by default
>    (recycle0 = FALSE) should (and *does* AFAIK) not change anything,
>    hence  paste() / paste0() behave completely back-compatible
>    if recycle0 is kept to FALSE.
>
> 2) recycle0 = TRUE is meant to give different behavior, notably
>    0-length arguments (among '...') should result in 0-length results.
>
>    The above does not specify what this means in detail, see 3)
>
> 3) The current R 4.0.0 implementation (for which I'm primarily responsible)
>    and help(paste)  are in accordance.
>    Notably the help page (Arguments -> 'recycle0' ; Details 1st para ;
>    Examples) says and shows how the 4.0.0 implementation has been meant
>    to work.
>
> 4) Several provenly smart members of the R community argue that
>    both the implementation and the documentation of 'recycle0 =
>    TRUE'  should be changed to be more logical / coherent / sensical ..
>
> Is the above all correct in your view?
>
> Assuming yes,  I read basically two proposals, both agreeing
> that  recycle0 = TRUE  should only ever apply to the action of 'sep'
> but not the action of 'collapse'.
>
> 1) Bill and Hervé (I think) propose that 'recycle0' should have
>    no effect whenever  'collapse = <string>'
>
> 2) Gabe proposes that 'collapse = <string>' and 'recycle0 = TRUE'
>    should be declared incompatible and error. If going in that
>    direction, I could also see them to give a warning (and
>    continue as if recycle0 = FALSE).
>
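
To make the two options concrete, here is a rough R sketch. It is only an
illustration: `paste_checked` and its `on_conflict` argument are hypothetical
names, not part of base R or of any proposed patch, and the zero-length
handling is hand-rolled so that the sketch does not depend on how a given R
version implements recycle0. Dropping the warning() call from the "warning"
branch gives the "silently ignore" variant.

```r
# Hypothetical wrapper illustrating "error" vs. "warn and ignore recycle0"
# when collapse = <string> and recycle0 = TRUE are combined.
paste_checked <- function(..., sep = " ", collapse = NULL, recycle0 = FALSE,
                          on_conflict = c("error", "warning")) {
  on_conflict <- match.arg(on_conflict)
  if (isTRUE(recycle0) && !is.null(collapse)) {
    msg <- "'recycle0 = TRUE' and 'collapse = <string>' are incompatible"
    if (on_conflict == "error") stop(msg)
    warning(msg, "; continuing as if recycle0 = FALSE")
    recycle0 <- FALSE
  }
  args <- list(...)
  # Hand-rolled zero-length recycling for the element-wise (sep) step.
  if (isTRUE(recycle0) && any(lengths(args) == 0L)) return(character(0))
  do.call(paste, c(args, list(sep = sep, collapse = collapse)))
}

# No conflict: behaves like plain paste().
paste_checked(c("4", "5"), "th", sep = "", collapse = ", ")
```

With on_conflict = "error" the call
paste_checked(integer(0), "nth", sep = "", collapse = ",", recycle0 = TRUE)
stops; with on_conflict = "warning" it warns and falls back to the
recycle0 = FALSE result.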

Hervé makes a good point about the case where sep and collapse are both set.
That said, if the user explicitly sets recycle0, I personally don't think it
should be silently ignored under any configuration of the other arguments.

If all of the arguments are to go into effect, the question then becomes
one of ordering, I think.

Consider

paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",",
recycle0=TRUE)

Currently that returns character(0), because the logic is essentially (in
pseudo-code)

collapse(paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ",
recycle0=TRUE), collapse = ",", recycle0=TRUE)

     -> collapse(character(0), collapse = ",", recycle0=TRUE)

-> character(0)
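
The same progression can be written out as runnable R. This is only a sketch
of the logic as described above, not the actual C implementation: `sep_step`
and `collapse_step` are hypothetical helper names, and both steps are
hand-rolled so the result does not depend on the installed R version.

```r
# Hypothetical stand-in for the element-wise (sep) step under
# recycle0 = TRUE: any zero-length argument gives a zero-length result.
sep_step <- function(..., sep = " ") {
  args <- list(...)
  if (any(lengths(args) == 0L)) return(character(0))
  do.call(paste, c(args, list(sep = sep)))
}

# Collapse step as described above: recycle0 = TRUE is applied here too,
# so a zero-length input stays zero-length instead of becoming "".
collapse_step <- function(x, collapse) {
  if (length(x) == 0L) return(character(0))
  paste(x, collapse = collapse)
}

pasted <- sep_step(c("a", "b"), NULL, c("c", "d"), sep = " ")
result <- collapse_step(pasted, collapse = ",")
result  # character(0)
```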

Now Bill Dunlap argued, fairly convincingly I think, that paste(...,
collapse=<string>) should *always* return a character vector of length
exactly one. Under that rule, though, the call above will return "" via the
progression

paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",",
recycle0=TRUE)

     -> collapse(character(0), collapse = ",")

-> ""


because recycle0 is still applied to the sep-based operation which occurs
before collapse, thus leaving a vector of length 0 to collapse.

That is consistent but seems unlikely to be what the user wanted, imho. I
think if it does this there should be at least a warning when paste
collapses to "" this way, if it is allowed at all (i.e., if mixing
collapse=<string> and recycle0=TRUE is not simply made an error).
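
For illustration, the "collapse always returns one string, but warn" variant
might look roughly like this in R. `paste_collapse_warn` is a hypothetical
name and the warning text is made up; the sep step is hand-rolled to mimic
recycle0 = TRUE rather than relying on the real argument.

```r
# Hypothetical sketch: recycle0 = TRUE semantics for the sep step, an
# unconditional collapse afterwards, and a warning when a zero-length
# argument is what causes the result to be "".
paste_collapse_warn <- function(..., sep = " ", collapse) {
  stopifnot(is.character(collapse), length(collapse) == 1L)
  args <- list(...)
  if (any(lengths(args) == 0L)) {
    warning("zero-length argument with recycle0 = TRUE: collapsing to \"\"")
    pasted <- character(0)
  } else {
    pasted <- do.call(paste, c(args, list(sep = sep)))
  }
  # Collapsing a zero-length character vector yields a single "".
  paste(pasted, collapse = collapse)
}

# Warns, then returns "".
paste_collapse_warn(c("a", "b"), NULL, c("c", "d"), sep = " ", collapse = ",")
```

Whether a warning like this is helpful or just noise is exactly the open
question; the sketch only shows that it is cheap to detect the situation.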

I would like to hear others' thoughts as well though. @Pages, Herve
<hpages using fredhutch.org> @William Dunlap <wdunlap using tibco.com>: is "" what you
envision as the desired and useful behavior there?

Best,
~G



> I have not yet made up my mind but would tend to agree with "you guys",
> but I think that other R Core members should chime in, too.
>
> Martin
>
>     >> On Fri, May 15, 2020 at 11:05 AM Hervé Pagès
>     >> <hpages using fredhutch.org> wrote:
>     >>
>     >> Totally agree with that.
>     >>
>     >> H.
>     >>
>     >> On 5/15/20 10:34, William Dunlap via R-devel wrote:
>     >> > I agree: paste(collapse="something", ...) should always return a
>     >> > single character string, regardless of the value of recycle0.  This
>     >> > would be similar to when there are no non-NULL arguments to paste;
>     >> > collapse="." gives a single empty string and collapse=NULL gives a
>     >> > zero long character vector.
>     >> >> paste()
>     >> > character(0)
>     >> >> paste(collapse=", ")
>     >> > [1] ""
>     >> >
>     >> > Bill Dunlap
>     >> > TIBCO Software
>     >> > wdunlap tibco.com
>     >> >
>     >> >
>     >> > On Thu, Apr 30, 2020 at 9:56 PM suharto_anggono--- via R-devel <
>     >> > r-devel using r-project.org <mailto:r-devel using r-project.org>> wrote:
>     >> >
>     >> >> Without 'collapse', 'paste' pastes (concatenates) its arguments
>     >> >> elementwise (separated by 'sep', " " by default). New in R devel
>     >> >> and R patched, specifying recycle0 = TRUE makes mixing zero-length
>     >> >> and nonzero-length arguments result in length zero. The result of
>     >> >> paste(n, "th", sep = "", recycle0 = TRUE) always has the same
>     >> >> length as 'n'. Previously, the result was still as long as the
>     >> >> longest argument, with the zero-length argument treated like "". If
>     >> >> all of the arguments have length zero, 'recycle0' doesn't matter.
>     >> >>
>     >> >> As far as I understand, 'paste' with 'collapse' as a character
>     >> >> string is supposed to put together elements of a vector into a
>     >> >> single character string. I think 'recycle0' shouldn't change it.
>     >> >>
>     >> >> In current R devel and R patched, paste(character(0), collapse = "",
>     >> >> recycle0 = TRUE) is character(0). I think it should be "", like
>     >> >> paste(character(0), collapse="").
>     >> >>
>     >> >> paste(c("4", "5"), "th", sep = "", collapse = ", ", recycle0 = TRUE)
>     >> >> is
>     >> >> "4th, 5th".
>     >> >> paste(c("4"     ), "th", sep = "", collapse = ", ", recycle0 = TRUE)
>     >> >> is
>     >> >> "4th".
>     >> >> I think
>     >> >> paste(c(        ), "th", sep = "", collapse = ", ", recycle0 = TRUE)
>     >> >> should be
>     >> >> "",
>     >> >> not character(0).
>     >> >>
>     >> >> ______________________________________________
>     >> >> R-devel using r-project.org mailing list
>     >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>     >> >>
>     >> >
>     >> >
>     >>
>     >> --
>     >> Hervé Pagès
>     >>
>     >> Program in Computational Biology
>     >> Division of Public Health Sciences
>     >> Fred Hutchinson Cancer Research Center
>     >> 1100 Fairview Ave. N, M1-B514
>     >> P.O. Box 19024
>     >> Seattle, WA 98109-1024
>     >>
>     >> E-mail: hpages using fredhutch.org <mailto:hpages using fredhutch.org>
>     >> Phone:  (206) 667-5791
>     >> Fax:    (206) 667-1319
>     >>
>     >>
>
>     > --
>     > Hervé Pagès
>
