[Rd] tools::parseLatex() crashes on "\\verb{}"

Antoine Fabri antoine.fabri using gmail.com
Fri Jul 21 15:14:09 CEST 2023


Surprisingly, this invalid LaTeX syntax is still formatted "right" in the
HTML output.
On closer inspection, roxygen2 seems to introduce these \verb{} calls when
markdown backtick quoting is used and the quoted content is not
syntactically valid R code. For instance:

#' `c(c(1)`
#' `c(c(1))`

roxygen2 converts the first line to `\verb{c(c(1)}` and the second to
`\code{c(c(1))}`.

I've opened a ticket there FYI:
https://github.com/r-lib/roxygen2/issues/1503

------------------------------
>
> Message: 2
> Date: Thu, 20 Jul 2023 23:48:44 +0300
> From: Ivan Krylov <krylov.r00t using gmail.com>
> To: Antoine Fabri <antoine.fabri using gmail.com>
> Cc: R-devel <r-devel using r-project.org>
> Subject: Re: [Rd] tools::parseLatex() crashes on "\\verb{}"
> Message-ID: <20230720234844.004b13c2 using Tarkus>
> Content-Type: text/plain; charset="us-ascii"
>
> On Thu, 20 Jul 2023 21:41:44 +0200
> Antoine Fabri <antoine.fabri using gmail.com> wrote:
>
> > tools::parseLatex("\\verb{hello}")
> > # crashes the session
>
> Looking at the source [*], this seems to be happening because
> parseLatex expects the \verb macro to use the same character as the
> delimiter on both sides:
>
> tools::parseLatex('\\verb!hello!')
> # \verb!hello!
>
> What the loop doesn't have is a check for EOF, which leads TEXT_PUSH()
> to increase the temporary buffer exponentially until unsigned int
> nstext overflows and results in a 0-byte allocation, which is then
> overrun, corrupting the heap. Any other unterminated \verb!... would
> have caused the same crash.
>
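The wraparound described above can be reproduced in isolation. This is only a minimal sketch, assuming the buffer size is doubled on each growth step (which is what "exponentially" suggests); doublings_until_wrap() is a hypothetical helper, not something from gramLatex.y.

```c
#include <limits.h>

/* Hypothetical helper, not part of gramLatex.y: counts how many times an
 * unsigned int buffer size can be doubled before it wraps around to 0.
 * Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1, so a size
 * that keeps doubling eventually becomes 0 instead of failing loudly. */
unsigned int doublings_until_wrap(unsigned int start)
{
    unsigned int size = start;
    unsigned int steps = 0;

    while (size != 0) {
        size *= 2;      /* same growth rule as a doubling text buffer */
        steps++;
    }
    return steps;       /* at this point a malloc(size) would request 0 bytes */
}
```

With a 32-bit unsigned int and a starting size of 128 (an assumed initial buffer size), the size reaches 0 after 25 doublings, which is the point where the 0-byte allocation would happen.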
> Here's a patch that prevents this particular crash:
>
> --- src/library/tools/src/gramLatex.y   (revision 84714)
> +++ src/library/tools/src/gramLatex.y   (working copy)
> @@ -846,8 +846,8 @@
>
>      TEXT_PUSH('\\'); TEXT_PUSH('v'); TEXT_PUSH('e'); TEXT_PUSH('r'); TEXT_PUSH('b');
>      TEXT_PUSH(c);
> -    while ((c = xxgetc()) != delim) TEXT_PUSH(c);
> -    TEXT_PUSH(c);
> +    while (((c = xxgetc()) != delim) && c != R_EOF) TEXT_PUSH(c);
> +    if (c != R_EOF) TEXT_PUSH(c);
>
>      PRESERVE_SV(yylval = mkString2(stext, bp - stext));
>      if(stext != st0) free(stext);
>
> This seems to have been the only remaining while loop in gramLatex.y
> that didn't check for R_EOF.
>
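The shape of the patched loop can be tried outside the parser. The following is a rough sketch, not the actual gramLatex.y code: SKETCH_EOF, sketch_getc() and scan_verb_body() are made-up names standing in for R_EOF, xxgetc() and the body of mkVerb(), and a fixed-capacity output buffer replaces TEXT_PUSH().

```c
#include <stddef.h>

#define SKETCH_EOF (-1)   /* hypothetical stand-in for R_EOF */

/* Hypothetical stand-in for xxgetc(): reads from a NUL-terminated string
 * and reports SKETCH_EOF once the input is exhausted. */
static int sketch_getc(const char **p)
{
    if (**p == '\0') return SKETCH_EOF;
    return (unsigned char) *(*p)++;
}

/* Copies characters up to and including the closing delimiter into out
 * (at most cap bytes), but also stops at end of input, as in the patch.
 * Returns the number of bytes written; *terminated is 1 only if the
 * closing delimiter was actually found. */
size_t scan_verb_body(const char *input, char delim,
                      char *out, size_t cap, int *terminated)
{
    size_t n = 0;
    int d = (unsigned char) delim;
    int c;

    while ((c = sketch_getc(&input)) != d && c != SKETCH_EOF)
        if (n < cap) out[n++] = (char) c;

    *terminated = (c != SKETCH_EOF);
    if (*terminated && n < cap) out[n++] = (char) c;
    return n;
}
```

For the input from the original report, the delimiter is '{' and the remaining text is "hello}", so the scan ends at end of input with *terminated set to 0 rather than looping forever.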
> More correctness work is needed: mkMarkup() should avoid calling
> mkVerb(R_EOF) when running tools::parseLatex('\\verb'), since otherwise
> 0xFF becomes a part of the resulting text. All declarations of unsigned
> int nstext should probably be replaced by size_t nstext... but then
> we'd have an annoying visit from the OOM killer instead of a much faster
> crash in case of a runaway TEXT_PUSH(), and nobody expects to parse 4
> GB of LaTeX source anyway. TEXT_PUSH() probably needs an integer
> overflow check and to free the temporary buffer before calling error().
>
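One possible shape for the last suggestion (an overflow check, plus freeing the buffer before reporting an error) is sketched below. It is an illustration under assumptions, not the real TEXT_PUSH(): struct textbuf and textbuf_push() are hypothetical, the buffer is always heap-allocated (there is no stack buffer like st0 here), and an error return code stands in for the call to error().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable text buffer; names are not from gramLatex.y. */
struct textbuf {
    char  *data;
    size_t len;
    size_t cap;
};

/* Appends one character, doubling the capacity when the buffer is full.
 * Returns 0 on success. On overflow or allocation failure, the buffer is
 * freed and reset first, then -1 is returned so the caller can report
 * the error without leaking memory. */
int textbuf_push(struct textbuf *b, char c)
{
    if (b->len == b->cap) {
        size_t newcap;
        char *grown;

        if (b->cap > SIZE_MAX / 2)          /* doubling would wrap around */
            goto fail;
        newcap = b->cap ? b->cap * 2 : 16;
        grown = realloc(b->data, newcap);
        if (grown == NULL)                  /* old block is still valid here */
            goto fail;
        b->data = grown;
        b->cap = newcap;
    }
    b->data[b->len++] = c;
    return 0;

fail:
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
    return -1;
}
```

A caller would start from a zero-initialized struct textbuf and check the return value of every push. A lower, explicit size limit could be checked in the same place if very large inputs should be rejected early.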
> --
> Best regards,
> Ivan
>
> [*]
>
> https://github.com/r-devel/r-svn/blob/f145419cc4dae162719206a61a29082adff2043d/src/library/tools/src/gramLatex.y#L845-L850
>
