[R-pkg-devel] R CMD checks URLs formatted for LaTeX instead of using the non-LaTeX URLs, and fails
Ralf Herold
r@||@hero|d @end|ng |rom gmx@net
Fri Jul 8 07:17:59 CEST 2022
Great, thanks --
Using \out{} to wrap \href{} in the .Rd file for LaTeX rendering does circumvent the issue that I reported, namely that .get_urls_from_Rd() is called in a way that prevents it from handling \ifelse{}, even though the function contains code for handling \ifelse{}.
The \out{} wrapper is useful, thanks.
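For anyone with several such links, here is a minimal sketch in R that generates the \ifelse{latex}{\out{...}}{...} markup from Sebastian's example below. The helper name rd_href_for_tabular and its arguments are hypothetical (they are not part of R or the tools package), and the sketch assumes that the URL contains no %, braces or backslashes, which would need additional Rd escaping:

```r
# Hypothetical helper (not part of R or the tools package): build the
# \ifelse{latex}{...}{...} Rd markup for a URL that contains & or #.
rd_href_for_tabular <- function(url, text = "link") {
  # Assumption: no '%', braces or backslashes in the URL; those would
  # need extra Rd escaping that this sketch does not attempt.
  stopifnot(!grepl("[%{}\\\\]", url, perl = TRUE))
  # Prefix & and # with a backslash, for the LaTeX branch only.
  latex_url <- gsub("([&#])", "\\\\\\1", url)
  paste0(
    "\\ifelse{latex}{\\out{\\href{", latex_url, "}{", text, "}}}",
    "{\\href{", url, "}{", text, "}}"
  )
}

snippet <- rd_href_for_tabular("https://example.org/a&b#c")
writeLines(snippet)
```

The printed line can be pasted into a \tabular cell of the .Rd file. Going by Sebastian's message, a plain \href{}{} should be sufficient once the Rd2latex() fix has been released.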
Not sure if the issue or the code is obsolete, but thanks again and happy to close this thread.
Best,
Ralf
> On 07.07.2022 at 22:41, Sebastian Meyer <seb.meyer using fau.de> wrote:
>
> On 05.07.22 at 19:56, Ralf Herold wrote:
>> Thanks and I would like to define the follow-up actions:
>> 1) change function "writeURL" on line 207 to read, for example, url <- fsub("([%#&])", "\\\\1", url) in https://svn.r-project.org/R/trunk/src/library/tools/R/Rd2latex.R to always escape URLs as you mention.
>> How can this be moved forward? This seems R core code, thus needs to be reported in R Bugzilla by one of its members (I am not one): https://www.r-project.org/bugs.html#where-to-submit-bug-reports-and-patches.
>
> A Bugzilla entry would have been nice for future reference but is no longer necessary. The Rd2latex() bug is now fixed in the development version of R (>= r82557) such that URLs with & or # characters can then also be used inside \tabular and give the same link as in the HTML version: \tabular{l}{\url{https://example.org/a&b#c}} should just work.
> In other words, Rd2latex() now correctly handles the input URL as 'verbatim' text (as specified in WRE Section 2.3), which also means that backslashes in the input that do not escape percent or braces (Rd specials) are preserved in the output (as was already the case for HTML).
>
> It is planned to port the fix to R-patched (future R 4.2.2).
>
> In package development I'd probably avoid such URLs inside \tabular until after that release. Otherwise, if you want to support building the PDF manual in current and future R, you'd need to use \out{} and do all the escaping there yourself, for example:
>
> \name{test}
> \title{test}
> \description{
> \tabular{l}{
> \ifelse{latex}{
> \out{\href{https://example.org/a\&b\#c}{link}}
> }{
> \href{https://example.org/a&b#c}{link}
> }
> }
> }
>
> AFAICS, the below points are obsolete.
>
> Best regards,
>
> Sebastian Meyer
>
>> Could someone from this list do this? Many thanks
>> 2) change function "url_db_from_package_Rd_db" to call ".get_urls_from_Rd" with parameter "ifdef = TRUE" on line 178 in https://svn.r-project.org/R/trunk/src/library/tools/R/urltools.R. This will activate the existing code that is intended to handle ifdef{}{}{}. This seems important for the issue I have reported and beyond. Same procedure as above?
>> 3) change Rd.sty, not sure, 1) seems more relevant.
>> Please advise, thanks
>> Ralf
>>> On 04.07.2022 at 00:08, Sebastian Meyer <seb.meyer using fau.de> wrote:
>>>
>>> On 03.07.22 at 08:27, Ralf Herold wrote:
>>>> Thanks Sebastian,
>>>> but not only the hash: an ampersand in \href inside a tabular environment also needs to be escaped, otherwise LaTeX fails (example below). I was not aware that this is a known limitation for .Rd files, despite searching for it.
>>>
>>> I stumbled over that problem a while ago and found that the escaping issue for the hash symbol is documented in the hyperref manual (but currently not accounted for by Rd2latex):
>>>
>>>> The special characters # and ~ do *not* need to be escaped in any way (unless the command is used in the argument of another command).
>>>
>>> For example, this LaTeX code fails to compile:
>>> \emph{\href{https://example.org/#}{hash}}
>>> In contrast, an ampersand would not need to be escaped in that LaTeX example.
>>>
>>> However, I can confirm that a LaTeX error results if an ampersand is used in a \href URL (but not in \url) that is passed to the special \Tabular LaTeX command from Rd.sty that is used by Rd2latex() for \tabular Rd input. Thank you for the heads-up.
>>>
>>> I think it would be good to improve Rd2latex() / Rd.sty for URLs in \tabular that contain & or # rather than require special LaTeX treatment in the Rd source. My preliminary testing shows that hyperref is happy if [&%#] are always escaped in URLs (sometimes it is not necessary but it also does not seem to hurt).
>>>
>>> Best regards,
>>>
>>> Sebastian Meyer
>>>
>>>> My use case (with eventually more meaningful query parameters and possibly anchors) would work if the existing R code block for handling \ifelse in urltools.R was activated as shown below, and this is my suggestion. How could I propose this?
>>>> Kind regards,
>>>> Ralf
>>>> \name{mre}
>>>> \title{mre}
>>>> \description{mre}
>>>> \details{
>>>> \tabular{l}{
>>>> \href{https://clinicaltrials.gov/ct2/results?cond=Infections&rslt=With}{link}
>>>> }}
>>>> LaTeX errors:
>>>> ! Argument of \href@split has an extra }.
>>>> <inserted text>
>>>> \par
>>>> l.24 }
>>>> Runaway argument?
>>>> https://clinicaltrials.gov/ct2/results?cond=Infections\unskip \hfil
>>>> ! Paragraph ended before \href@split was complete.
>>>> <to be read again>
>>>> \par
>>>> l.24 }
>>>> ! Extra }, or forgotten \endgroup.
>>>> <recently read> }
>>>>> On 03.07.2022 at 01:51, Sebastian Meyer <seb.meyer using fau.de> wrote:
>>>>>
>>>>> On 02.07.22 at 12:01, Ralf Herold wrote:
>>>>>> Hello, in my package documentation I want to include URLs with query string parameters and anchors, within a table. A minimally reproducible example is this content in file "man/mre.Rd":
>>>>>> \name{mre}
>>>>>> \title{mre}
>>>>>> \description{mre}
>>>>>> \details{
>>>>>> \tabular{l}{
>>>>>> \ifelse{latex}{\href{https://clinicaltrials.gov/ct2/results?cond=Infections\&rslt=With\#tableTop}{latex link}}{\href{https://clinicaltrials.gov/ct2/results?cond=Infections&rslt=With#tableTop}{non-latex link}}
>>>>>> }}
>>>>>> The ifelse{}{}{} construct is necessary since ampersands in a table need to be escaped for LaTeX rendering.
>>>>>
>>>>> This is a red herring. Ampersands do *not* need to be escaped in \href URLs. The problem is the hash symbol, which needs to be escaped if \href is nested within another markup macro, here \Tabular (from Rd.sty). This is a known limitation; Rd2latex will probably do the escaping in the future. It's good to see a use case.
>>>>>
>>>>> I think currently the best solutions for you are to simply omit the #tableTop part in the LaTeX version or to not use such URLs inside a \tabular.
>>>>>
>>>>> Hope this helps.
>>>>> Best regards,
>>>>>
>>>>> Sebastian Meyer
>>>>>
>>>>>> Each of the following commands checks and renders the respective output correctly:
>>>>>> tools::checkRd("man/mre.Rd")
>>>>>> tools::Rd2txt("man/mre.Rd")
>>>>>> tools::Rd2latex("man/mre.Rd")
>>>>>> tools::Rd2HTML("man/mre.Rd")
>>>>>> system2("R", c("CMD", "Rd2pdf", "man/mre.Rd"))
>>>>>> However, rhub::check_for_cran() results in NOTES:
>>>>>> Found the following (possibly) invalid URLs:
>>>>>> URL: https://clinicaltrials.gov/ct2/results?cond=Infections\&rslt=With\#tableTop
>>>>>> From: man/mre.Rd
>>>>>> Status: 400
>>>>>> Message: Bad Request
>>>>>> Subsequently, the CRAN maintainers declined to accept the package.
>>>>>> However, the underlying cause is that, during such checks, all apparent URLs are extracted from .Rd files, irrespective of any \ifelse{}{}{} constructs. This in turn is due to such checks involving calls to function ".get_urls_from_Rd" without setting its argument "ifdef" to TRUE.
>>>>>> Here is how to see this behaviour:
>>>>>> db <- tools::Rd_db(dir = ".")
>>>>>> # get functions
>>>>>> source("https://svn.r-project.org/R/trunk/src/library/tools/R/urltools.R")
>>>>>> source("https://svn.r-project.org/R/trunk/src/library/tools/R/utils.R")
>>>>>> .Rd_deparse <- tools:::.Rd_deparse
>>>>>> RdTags <- tools:::RdTags
>>>>>> # default, leading to invalid url in [1]
>>>>>> # > .get_urls_from_Rd(db)
>>>>>> # [1] "https://clinicaltrials.gov/ct2/results?cond=Infections\\&rslt=With\\#tableTop"
>>>>>> # [2] "https://clinicaltrials.gov/ct2/results?cond=Infections&rslt=With#tableTop"
>>>>>> # returning relevant valid url
>>>>>> # > .get_urls_from_Rd(db, ifdef = TRUE)
>>>>>> # [1] "https://clinicaltrials.gov/ct2/results?cond=Infections&rslt=With#tableTop"
>>>>>> This can be addressed by either:
>>>>>> -- changing the signature of ".get_urls_from_Rd" in line 50 in https://svn.r-project.org/R/trunk/src/library/tools/R/urltools.R to read "ifdef = TRUE". Of note, this function has a code block to handle such ifdef constructs which indicates it should be possible to use them in Rd files.
>>>>>> -- changing the calling function "url_db_from_package_Rd_db" to include "ifdef = TRUE" on line 178 in https://svn.r-project.org/R/trunk/src/library/tools/R/urltools.R
>>>>>> Please advise how to advance on this issue, thank you very much.
>>>>>> Greetings
>>>>>> Ralf