[Rd] [WISH / PATCH] possibility to split string literals across multiple lines

Joris Meys jorismeys at gmail.com
Wed Jun 14 14:35:11 CEST 2017


Hi Mark,

I understand. I only pointed out the obvious to illustrate why your emulation
doesn't eliminate the need for the real thing. I didn't mean to imply that you
weren't aware of this, even though it may have come across that way. I'm not
always fully aware of the subtleties of the English language, and this seems
to be one of those cases.
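
To make that concrete, here is a minimal sketch (my own illustration, not
part of the patch). It assumes only the `%+%` definition from your mail; the
names e_op and e_lit are made up for the example. It compares what the parser
produces for the operator form with what it produces for a plain literal:

```r
# Mark's emulation operator, as defined earlier in the thread
`%+%` <- function(x, y) paste0(x, y)

# The parser turns the operator form into an unevaluated call ...
e_op <- quote("hello" %+% " pretty" %+% " world")
class(e_op)         # "call"
is.character(e_op)  # FALSE

# ... whereas a real literal is already a character constant after parsing
e_lit <- quote("hello pretty world")
class(e_lit)        # "character"

# Both yield the same value, but only the first needs evaluation at run time
identical(eval(e_op), e_lit)  # TRUE
```

So the operator only changes how the code looks; the concatenation still
happens every time the expression is evaluated.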

Kind regards
Joris

On Wed, Jun 14, 2017 at 2:23 PM, Mark van der Loo <mark.vanderloo at gmail.com>
wrote:

> I know it doesn't cause construction at parse time, and that is not what I
> said. What I meant was that it makes the syntax at least look a little as
> if you have a line-breaking character within string literals.
>
> On Wed, 14 Jun 2017 at 14:18, Joris Meys <jorismeys at gmail.com> wrote:
>
>> Mark, that's actually a fair statement, although your extra operator
>> doesn't cause construction at parse time. You still call paste0(); you just
>> add an extra layer on top of it.
>>
>> I also doubt that the benefit is going to be significant, even in gigantic
>> loops. Take the following example:
>>
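>> library(compiler)    # provides cmpfun()
>> library(rbenchmark)  # provides benchmark()
>>
>> # string built at run time with paste0()
>> atestfun <- function(x){
>>   y <- paste0("a very long",
>>          "string for testing")
>>   grep(x, y)
>> }
>> # string given as a single multi-line literal (contains a newline)
>> atestfun2 <- function(x){
>>   y <- "a very long
>> string for testing"
>>   grep(x, y)
>> }
>> # byte-compiled versions of both functions
>> cfun <- cmpfun(atestfun)
>> cfun2 <- cmpfun(atestfun2)
>>
>> benchmark(atestfun("a"),
>>           atestfun2("a"),
>>           cfun("a"),
>>           cfun2("a"),
>>           replications = 100000)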
>>
>> Which gives after 100,000 replications:
>>
>>             test replications elapsed relative
>> 1  atestfun("a")       100000    0.83    1.339
>> 2 atestfun2("a")       100000    0.62    1.000
>> 3      cfun("a")       100000    0.81    1.306
>> 4     cfun2("a")       100000    0.62    1.000
>>
>> The patch can in principle make similar code marginally faster, but I'm
>> not convinced it is going to make any real difference except in some very
>> specific and exotic cases. Moreover, calling a function like the ones in
>> the example inside a loop is the only situation I can come up with where
>> this might be a problem. If you just construct the string inside the loop,
>> there are two possibilities:
>>
>> - the string does not need to change, and then you had better construct it
>> outside of the loop
>> - the string does need to change, and then you need paste() or paste0()
>> anyway
>>
>> I'm not against incorporating the patch, as it would eliminate a few
>> keystrokes. It's a neat idea, but I don't expect any other noticeable
>> advantage from it.
>>
>> my humble 2 cents
>> Cheers
>> Joris
>>
>> On Wed, Jun 14, 2017 at 2:00 PM, Mark van der Loo <
>> mark.vanderloo at gmail.com> wrote:
>>
>>> Having some line-breaking character for string literals would have
>>> benefits, as string literals could then be constructed at parse time
>>> rather than at run time. I have run into this myself a few times as well.
>>> One way to at least emulate something like that is the following.
>>>
>>> `%+%` <- function(x,y) paste0(x,y)
>>>
>>> "hello" %+%
>>>   " pretty" %+%
>>>   " world"
>>>
>>>
>>> -Mark
>>>
>>>
>>>
>>> On Wed, 14 Jun 2017 at 13:53, Andreas Kersting <
>>> r-devel at akersting.de> wrote:
>>>
>>> > On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch <
>>> > murdoch.duncan at gmail.com> wrote:
>>> >
>>> > > On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>>> > > > Hi,
>>> > > >
>>> > > > I would really like to have a way to split long string literals
>>> > > > across multiple lines in R.
>>> > >
>>> > > I don't understand why you require the string to be a literal.  Why
>>> > > not construct the long string in an expression like
>>> > >
>>> > >   paste0("aaa",
>>> > >          "bbb")
>>> > >
>>> > > ?  Surely the execution time of the paste0 call is negligible.
>>> > >
>>> > > Duncan Murdoch
>>> >
>>> > Actually "execution time" is precisely one of the reasons why I would
>>> > like to see this feature: depending on the context (e.g. in a tight
>>> > loop), the execution time of paste0 (or probably also glue, thanks
>>> > Gabor) is not necessarily insignificant.
>>> >
>>> > The other reason is style: I think it is cleaner if we can construct
>>> > such a long string literal without the need for a function call.
>>> >
>>> > Andreas
>>> >
>>> > > >
>>> > > > Currently, if a string literal spans multiple lines, there is no
>>> > > > way to inhibit the introduction of newline characters:
>>> > > >
>>> > > >  > "aaa
>>> > > > + bbb"
>>> > > > [1] "aaa\nbbb"
>>> > > >
>>> > > >
>>> > > > If a line ends with a backslash, the backslash is simply ignored:
>>> > > >
>>> > > >  > "aaa\
>>> > > > + bbb"
>>> > > > [1] "aaa\nbbb"
>>> > > >
>>> > > >
>>> > > > We could use this fact to implement string splitting in a fairly
>>> > > > backward-compatible way, since such trailing backslashes should
>>> > > > hardly be in use at present, as they have no effect. The attached
>>> > > > patch makes the parser ignore a newline character directly
>>> > > > following a backslash:
>>> > > >
>>> > > >  > "aaa\
>>> > > > + bbb"
>>> > > > [1] "aaabbb"
>>> > > >
>>> > > >
>>> > > > I personally would also prefer it if leading blanks (spaces and
>>> > > > tabs) in the second line were ignored, to allow for proper
>>> > > > indentation:
>>> > > >
>>> > > >  >   "aaa \
>>> > > > +    bbb"
>>> > > > [1] "aaa bbb"
>>> > > >
>>> > > >  >   "aaa\
>>> > > > +    \ bbb"
>>> > > > [1] "aaa bbb"
>>> > > >
>>> > > > This is also implemented by this patch.
>>> > > >
>>> > > >
>>> > > > An alternative approach could be to have something like
>>> > > >
>>> > > > ("aaa "
>>> > > > "bbb")
>>> > > >
>>> > > > or
>>> > > >
>>> > > > ("aaa ",
>>> > > > "bbb")
>>> > > >
>>> > > > be interpreted as "aaa bbb".
>>> > > >
>>> > > > I don't know the ins and outs of the parser of R (hence: please
>>> > > > review the attached patch very carefully), but I guess this would
>>> > > > be more work to implement!?
>>> > > >
>>> > > >
>>> > > > What do you think? Is there anybody else who is missing this
>>> > > > feature in the first place?
>>> > > >
>>> > > > Regards,
>>> > > > Andreas
>>> > > >
>>> > > >
>>> > > >
>>> > > > ______________________________________________
>>> > > > R-devel at r-project.org mailing list
>>> > > > https://stat.ethz.ch/mailman/listinfo/r-devel
>>> > > >
>>
>>
>>
>> --
>> Joris Meys
>> Statistical consultant
>>
>> Ghent University
>> Faculty of Bioscience Engineering
>> Department of Mathematical Modelling, Statistics and Bio-Informatics
>>
>> tel :  +32 (0)9 264 61 79
>> Joris.Meys at Ugent.be
>> -------------------------------
>> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>>
>


-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel :  +32 (0)9 264 61 79
Joris.Meys at Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
