[Rd] Parsing code with newlines

Mikhail Titov m|t @end|ng |rom gmx@u@
Thu Apr 11 01:59:44 CEST 2019

On Wed, Apr 10, 2019 at  5:06 AM, Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>> This is my first post here. I came across the very same problem.
>> It can be reproduced within modified tests/Embedding/RParseEval.c
> Please check https://www.r-project.org/posting-guide.html and update
> your post if you still need to get help here - from your current post
> I am not sure what you did, what was the error you got and from which
> tool, why you think the error was a result of something not working
> correctly/as documented, etc. The original post with the same subject
> you are probably referring to had the same problem.

The original post is linked via e-mail headers; however, it goes back a
decade. It shows up as part of the same thread in Gnus, so I thought it
would be fine to jump straight to the matter.

Here is the link to original discussion

At this point, I would like to report two bugs in the "Writing R
Extensions" documentation. First, from that document it is not clear why
carriage returns (0x0D) have to be removed from the input string to be
parsed. Second, nowhere does that document mention R_ToplevelExec, which
is needed if parsing is done in the outer context, that is, not when our
C function is called from R, but when we parse R code from C directly,
outside of the main loop. These are big show stoppers for newcomers.

The barely modified test code from my previous post does not parse what
would seem to be a legitimate sample string, "\r\n ls()". It does,
however, parse "\n ls()" fine. Nowhere in the docs is this intolerance
of carriage returns mentioned. It is reproducible from the R console as
well.

,----[ R console session ]
| > parse(text="\r\n ls()")
| Error in parse(text = "\r\n ls()") : <text>:1:1: unexpected input
| 1:
|     ^
| >
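
A workaround on the C side is to drop carriage returns before the text
reaches the parser. Below is a minimal sketch in plain C; strip_cr is a
hypothetical helper name, not part of the R API, and it assumes the
input is a writable NUL-terminated buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the R API): removes every carriage
 * return (0x0D) from s in place and returns the new string length. */
size_t strip_cr(char *s)
{
    assert(s != NULL);

    char *dst = s;
    for (const char *src = s; *src != '\0'; ++src) {
        if (*src != '\r')
            *dst++ = *src;      /* copy everything except CR */
    }
    *dst = '\0';                /* terminate the shortened string */
    return (size_t)(dst - s);
}
```

The cleaned buffer can then be passed to mkString() as in the test
program. Note that this deletes a bare CR outright rather than
converting it to a newline, which may or may not be what you want.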

Another problem with the aforementioned documentation concerns parsing
erroneous expressions like "deadbeef<-function(,bad){}" in the top-level
context. Instead of returning an error from parsing, R crashes (with
R_suicide) unless the call is wrapped in R_ToplevelExec.
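
For readers puzzled by the crash, the mechanism can be illustrated
without R at all. The sketch below is plain C and only mimics the idea:
an error does a longjmp to the innermost registered target, and a
wrapper that sets up its own target turns that jump into an ordinary
return value. All names here (toplevel_exec, raise_error, toy_parse,
current_target) are hypothetical stand-ins, and this is not how R
implements R_ToplevelExec internally; only the call shape, a callback
plus a data pointer, is modeled on it.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the innermost live top-level context. */
static jmp_buf *current_target = NULL;

/* Hypothetical error routine: jump to the innermost target, if any. */
void raise_error(void)
{
    if (current_target != NULL)
        longjmp(*current_target, 1);
    /* No live target: nothing safe to return to, so give up (loosely
     * analogous to the fatal exit described above). */
    abort();
}

/* Run fun(data) under a fresh jump target. Returns false if fun raised
 * an error, true if it returned normally. */
bool toplevel_exec(void (*fun)(void *), void *data)
{
    assert(fun != NULL);

    jmp_buf target;
    jmp_buf *saved = current_target;   /* not modified after setjmp */
    bool ok;

    if (setjmp(target) == 0) {
        current_target = &target;
        fun(data);
        ok = true;
    } else {
        ok = false;                    /* arrived here via longjmp */
    }
    current_target = saved;            /* restore the outer target */
    return ok;
}

/* Toy "parser": treats any input containing "(," as a syntax error. */
void toy_parse(void *data)
{
    const char *code = data;
    if (strstr(code, "(,") != NULL)
        raise_error();
}
```

Calling toplevel_exec(toy_parse, ...) with the bad expression returns
false instead of terminating the process, which is the behavior I would
expect the embedding example to demonstrate with the real function.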

> Please also note that "tests" (tests/Embedding/RParseEval.c) are not
> examples - if they do not catch R errors in some cases that is
> perfectly ok, they also may use internal API that is indeed not
> documented e.g. in Writing R Extensions.

Where would I find a good example of top-level context parsing then? I
have no problem with skipped error checks or with the use of
undocumented functions. However, I would much prefer to avoid major
unexpected crashes. That example does NOT use any undocumented API and
is therefore misleading. I believe it SHOULD include R_ToplevelExec, and
that function SHOULD be in the docs.

> Note Writing R Extensions has a section on embedding R and on cleanup
> handlers.

I have no problems with the rest of the document on embedding and
cleanup in general.

>> Actually this example has another issue, namely it doesn't wrap
>> everything in R_ToplevelExec . This is a major show stopper for
>> newcomers as that function is barely mentioned anywhere and longjmp into
>> terminated setuploop function followed by R_suicide look like a mystery.
>> Error: bad value
>> Fatal error: unable to initialize the JIT
>> That aside, here is the code with newlines that fails to parse. I hope
>> it will paste alright here.
>> #include "embeddedRCall.h"
>> #include <R_ext/Parse.h>
>> int
>> main(int argc, char *argv[])
>> {
>>      SEXP e, tmp;
>>      int hadError;
>>      ParseStatus status;
>>      init_R(argc, argv);
>>      PROTECT(tmp = mkString("\n\r ls()"));
>>      PROTECT(e = R_ParseVector(tmp, 1, &status, R_NilValue));
>>      if (status != PARSE_OK)
>>      {
>>          printf("boo boo\n");
>>      }
>>      else
>>      {
>>          PrintValue(e);
>>          R_tryEval(VECTOR_ELT(e,0), R_GlobalEnv, &hadError);
>>      }
>>      UNPROTECT(2);
>>      end_R();
>>      return(0);
>> }

