[Rd] The regular expressions in compareVersion()

Simon Urbanek simon.urbanek at r-project.org
Fri Apr 25 04:27:27 CEST 2014


FWIW the linked thread is long and about 90% irrelevant. AFAICS the relevant part is:

From: Yihui Xie-2
Sep 02, 2013; 4:11pm
Re: Sweave: printing an underscore in the output from an R command
[...]
Now you are good at the regular expression level, but Sweave comes and 
bites you, and that is due to this bug in the regular expression in 
Sweave Noweb syntax: 

> SweaveSyntaxNoweb$docexpr 
[1] "\\\\Sexpr\\{([^\\}]*)\\}" 

It should have been "\\\\Sexpr\\{([^}]*)\\}", i.e. } does not need to 
be escaped inside [], and \\ will be interpreted literally inside []. 
In your case, Sweave sees \ in \Sexpr{}, and the regular expression 
stops matching there, and is unable to see } after \, so it believes 
there are no inline R expressions in your document. 
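
A minimal sketch of the behaviour described in that excerpt, assuming R's default regex engine (perl = FALSE). The two patterns are the ones quoted above; the names old_pat, new_pat and line are hypothetical, and the sample line is made up for illustration:

```r
# Pattern currently in SweaveSyntaxNoweb$docexpr, as quoted above.
old_pat <- "\\\\Sexpr\\{([^\\}]*)\\}"
# Pattern proposed in the excerpt: } is not escaped inside [].
new_pat <- "\\\\Sexpr\\{([^}]*)\\}"

# Hypothetical document line whose inline R code contains backslashes.
line <- "Value: \\Sexpr{gsub('_', '\\\\_', x)}"

# With the old pattern, [^\}] excludes both \ and }, so the match
# cannot get past the backslashes inside the braces.
old_hit <- grepl(old_pat, line)

# With the proposed pattern, only } ends the inline expression.
new_hit <- grepl(new_pat, line)

print(c(old = old_hit, new = new_hit))
```

With perl = TRUE the backslash is an escape character inside [], so the old pattern would behave differently there.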


On Apr 24, 2014, at 10:15 PM, Yihui Xie <xie at yihui.name> wrote:

> You are right that this is unlikely to cause problems, because users
> are unlikely to put backslashes in version numbers. Henrik has pointed
> out the problem. It is not about "making the source code a little
> cleaner", but "making it correct". Either someone in R core corrects
> the wrong regular expressions in a few seconds (unless you think \ can
> be a legal character in a version number), or I just give up the
> report. It seems the latter is easier. It is not worth additional
> Q&A's back and forth.
> 
> Regarding the regular expression problem for \Sexpr{} in Sweave,
> please see here for a record:
> http://r.789695.n4.nabble.com/Sweave-printing-an-underscore-in-the-output-from-an-R-command-td4675177.html
> As I said, it is a similar problem: someone tried to escape a
> character that did not need to be escaped in [].
> 
> Regards,
> Yihui
> --
> Yihui Xie <xieyihui at gmail.com>
> Web: http://yihui.name
> 
> 
> On Thu, Apr 24, 2014 at 6:20 PM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>> On 24/04/2014, 5:26 PM, Henrik Bengtsson wrote:
>>> 
>>> On Thu, Apr 24, 2014 at 1:42 PM, Duncan Murdoch
>>> <murdoch.duncan at gmail.com> wrote:
>>>> 
>>>> On 24/04/2014, 1:11 PM, Yihui Xie wrote:
>>>>> 
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I guess the backslash should not be used as the separator for
>>>>> strsplit() in compareVersion(), because the period in [.] is no longer
>>>>> a metacharacter (no need to "escape" it using a backslash):
>>>>> 
>>>>> 
>>>>> https://github.com/wch/r-source/blob/trunk/src/library/utils/R/packages.R#L866-L867
>>>>> 
>>>>>> compareVersion
>>>>> 
>>>>> 
>>>>> function (a, b)
>>>>> {
>>>>> ....
>>>>>      a <- as.integer(strsplit(a, "[\\.-]")[[1L]])
>>>>>      b <- as.integer(strsplit(b, "[\\.-]")[[1L]])
>>>>> ....
>>>>> <environment: namespace:utils>
>>>> 
>>>> 
>>>> 
>>>> Could you post an example where this causes trouble, or are you just
>>>> suggesting this as a way to make the source a little cleaner?
>>> 
>>> 
>>> Maybe it's already clear, but [\\.] is the set of the two symbols '\'
>>> and '.', not '.' alone.  For example, I would expect an error below:
>>> 
>>>> compareVersion("3.14-59.26", "3.14-59\\26")
>>> 
>>> [1] 0
>>> 
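
For reference, a minimal sketch of what the two character classes do in the strsplit() call, assuming R's default regex engine (perl = FALSE). The names v, old_parts and new_parts are hypothetical; the version string is the one from Henrik's example:

```r
# Version string from the example above; it contains a literal backslash.
v <- "3.14-59\\26"

# Separator currently used in compareVersion(): the class contains
# '\', '.' and '-', so the backslash also acts as a separator.
old_parts <- strsplit(v, "[\\.-]")[[1L]]

# Separator without the backslash: only '.' and '-' split the string,
# so the last component keeps the backslash.
new_parts <- strsplit(v, "[.-]")[[1L]]

print(old_parts)
print(new_parts)

# as.integer() cannot parse the component that still contains '\',
# so it yields NA (with a warning) instead of silently accepting it.
new_int <- suppressWarnings(as.integer(new_parts))
print(new_int)
```
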
>> 
>> How does that cause problems?
>> 
>> Duncan Murdoch
>> 
>> 
>>> /Henrik
>>> 
>>>> 
>>>> 
>>>>> 
>>>>> A similar regular expression problem also exists in the Sweave syntax
>>>>> (for \Sexpr{}), and I have reported it once. It was fixed but the fix
>>>>> was immediately reverted for some reason:
>>>>> 
>>>>> 
>>>>> https://github.com/wch/r-source/commit/52b0a46e15136a7f9e4777e9960fdda6d84880c0
>>>> 
>>>> 
>>>> 
>>>> A link to your report would be more useful, if it included an example
>>>> where the bad regexp causes trouble.
>>>> 
>>>> Duncan Murdoch
> 