[Rd] R_parseVector and syntax error [was: error messages while parsing with rniParse]

Fri Jun 19 09:18:51 CEST 2009

Duncan Murdoch wrote:
>
> Romain Francois wrote:
>> Duncan Murdoch wrote:
>>  
>>> Simon Urbanek wrote:
>>>    
>>>> On Jun 18, 2009, at 17:02 , Duncan Murdoch wrote:
>>>>  
>>>>      
>>>>> Romain Francois wrote:
>>>>>           
>>>>>> Hello,
>>>>>>
>>>>>> [I'm redirecting this here from stats-rosuda-devel]
>>>>>>
>>>>>> When parsing R code through R_ParseVector and the code generates 
>>>>>> an error (syntax error), is there a way to grab the error?
>>>>>> It looks like yyerror populates the buffer "R_ParseErrorMsg", 
>>>>>> but then the variable is not part of the public API.
>>>>>>
>>>>>> Would it be possible to add yet another entry point to the 
>>>>>> parser that would basically wrap R_ParseVector so that it would 
>>>>>> have an extra char* argument that would bring back the error 
>>>>>> message if there is an error?
>>>>>>
>>>>>>
>>>>>>                 
>>>>> I would oppose that.  Suggest ways to reduce the complexity of 
>>>>> the  parser interface and I'd be interested.  It's a nightmare to 
>>>>> make  any changes there.
>>>>>
>>>>> You can always call the R function wrapped in try(), so it's not 
>>>>> as  though this would give you anything that you don't already 
>>>>> have  access to.
>>>>>             
>>>> I'm not quite following - we're talking about R_ParseVector in C 
>>>> code  so the point is that the C code gets access to the error 
>>>> message so it  can relay it to the user.       
>>> I understood that.  But the C code can get the error message by 
>>> evaluating an R expression and looking at the result.
>>>
>>>    
>>>> There are no R-level functions involved  here. The issue here for 
>>>> the moment is that this information is  retrievable at R level but 
>>>> not (officially) at the C level.       
>>> I wouldn't mind exposing the underlying information in a clean way, 
>>> but the string in R_ParseVector isn't all a front end should get.
>>>     
>>
>> Great. Let's do that.
>> Is a function that simply returns some of the static variables used 
>> by bison clean enough ?
>>   
> It could be.   I'd like a design that allows for the possibility of 
> multiple syntax errors to be reported.  I have parse_Rd doing that, 
> though not committed yet.  parse() is different because we have to be 
> less tolerant of errors in R code than in Rd files.  But we could 
> still report multiple errors in one parse, not just stop at the first 
> one.

This is an interesting problem. Just out of curiosity: how do you 
continue parsing after a syntax error in parse()? Does it depend on the 
kind of syntax error? Do you use some of bison's error recovery 
mechanisms? As far as I can see, the special "error" token only appears 
in the very top-level prog symbol:

prog    :    END_OF_INPUT            { return 0; }
        |    '\n'                    { return xxvalue(NULL,2,NULL); }
        |    expr_or_assign '\n'     { return xxvalue($1,3,&@1); }
        |    expr_or_assign ';'      { return xxvalue($1,4,&@1); }
        |    error                   { YYABORT; }
        ;
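
To make the question concrete, here is a rough sketch in plain C of what I 
mean by recovering at a statement boundary. It is not R's parser and it is 
not bison: SyntaxError and find_stray_parens are made-up names, the only 
error it knows about is a stray ')', and it treats every newline as a 
statement boundary (R does not do that inside parentheses). The point is 
only that after recording an error it skips to the next '\n' or ';' and 
carries on, so one pass can report several errors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one syntax error (not part of the R API). */
typedef struct {
    int  line;    /* 1-based line of the offending token   */
    int  column;  /* 1-based column of the offending token */
    char token;   /* the character that could not be handled */
} SyntaxError;

/*
 * Scan `src` as statements terminated by '\n' or ';' and look only for
 * a ')' that has no matching '('. On such an error: record it, skip to
 * the next terminator (panic-mode recovery), then continue scanning.
 * Returns the total number of errors found; at most `max` are stored.
 */
static int find_stray_parens(const char *src, SyntaxError *errs, int max)
{
    int n = 0, depth = 0, line = 1, col = 0;

    assert(src != NULL && errs != NULL);
    for (const char *p = src; *p != '\0'; p++) {
        col++;
        if (*p == '(') {
            depth++;
        } else if (*p == ')' && depth > 0) {
            depth--;
        } else if (*p == ')') {
            /* Unexpected ')': remember where it was. */
            if (n < max) {
                errs[n] = (SyntaxError){ line, col, *p };
            }
            n++;
            /* Resynchronise: drop the rest of this statement. */
            while (p[1] != '\0' && p[1] != '\n' && p[1] != ';') {
                p++;
                col++;
            }
        } else if (*p == '\n' || *p == ';') {
            depth = 0;              /* statement boundary */
            if (*p == '\n') {
                line++;
                col = 0;
            }
        }
    }
    return n;
}
```

With the input "rnorm( 10 ))" followed by further lines, the scan records the 
extra ')' on line 1 and still goes on to look at the remaining lines.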

Anyway, what about using the extra information to structure the error 
message of a custom condition class?
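
As a rough sketch of what such a structured error could carry on the C 
side, assuming the parser hands over the pieces Duncan lists below (the 
token, its classification and its location): ParseErrorInfo and 
format_parse_error are hypothetical names, nothing like them exists in 
the R API today, and the message layout is only an illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical structured parse error (not part of the R API). */
typedef struct {
    const char *filename;     /* source name, e.g. "<text>"            */
    int         line;         /* line of the offending token           */
    int         column;       /* column of the offending token         */
    const char *token;        /* token text, "" when there is none     */
    const char *token_class;  /* classification, e.g. "numeric constant" */
} ParseErrorInfo;

/*
 * Build a one-line message from the structured fields. A front end
 * could use the fields directly instead (e.g. to highlight the token).
 * Returns what snprintf returns: the length the full message needs.
 */
static int format_parse_error(const ParseErrorInfo *e, char *buf, size_t size)
{
    assert(e != NULL && buf != NULL && size > 0);
    if (strlen(e->token) == 0) {
        /* No token text to show, e.g. input ended too early. */
        return snprintf(buf, size, "%s:%d:%d: unexpected %s",
                        e->filename, e->line, e->column, e->token_class);
    }
    return snprintf(buf, size, "%s:%d:%d: unexpected %s '%s'",
                    e->filename, e->line, e->column,
                    e->token_class, e->token);
}
```

Keeping the fields separate and formatting late means the same record 
could back an R condition object, a Java exception in JRI, or a plain 
string, and an array of such records would cover the multiple-errors case.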

>
> Duncan Murdoch
>
>>> At the time of an R_ParseVector syntax error, the parser knows what 
>>> token it couldn't handle, and it knows its classification, and the 
>>> location in the file where it came from.   Not all of that makes it 
>>> through to the error message.
>>>    
>>>> As for  reducing complexity - technically, there is no complexity 
>>>> added since  all this is already in place ... [adding extra char * 
>>>> argument to  ParseVector may not be the best way but that's not 
>>>> what I'm arguing  for].         
>>> It was what I was arguing against.
>>>
>>> Duncan Murdoch
>>>
>>>    
>>>> Or am I missing something?
>>>>       Cheers,
>>>> S
>>>>
>>>>
>>>>  
>>>>      
>>>>>> Romain
>>>>>>
>>>>>> Simon Urbanek wrote:
>>>>>>
>>>>>>               
>>>>>>> On Jun 15, 2009, at 12:05 , Romain Francois wrote:
>>>>>>>
>>>>>>>
>>>>>>>                   
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> In JRI, is there a way to get the error message that is 
>>>>>>>> generated by the
>>>>>>>> parser through rniParse?
>>>>>>>> For example, if I have this :
>>>>>>>>
>>>>>>>> long y = re.rniParse( "rnorm( 10 ))", 1 ) ;
>>>>>>>>
>>>>>>>> this obviously generates a parse error, so y will be the same as
>>>>>>>> (R_NilValue) :
>>>>>>>>
>>>>>>>> long null_id = re.rniEval( re.rniParse( "NULL", 1 ), 0 ) ;
>>>>>>>>
>>>>>>>> I guess the underlying question is : "Is R_ParseErrorMsg 
>>>>>>>> exposed to
>>>>>>>> JRI".
>>>>>>>>
>>>>>>>>                         
>>>>>>> AFAICT R_ParseErrorMsg and friends are not exposed by the R API 
>>>>>>> - they are not accessible from outside, so they cannot be used by 
>>>>>>> JRI. It would be nice if there was a way of accessing that 
>>>>>>> info, but R doesn't currently support that.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Simon
>>>>>>>
>>>>>>>
>>>>>>>                   
>>>>>>>> The reason is I would like to bring back the message as part of an
>>>>>>>> exception generated when the code does not parse.
>>>>>>>>
>>>>>>>> Romain
>>>>>>>>
>>>>>>>>                         
>>>>>>                 
>>>>> ______________________________________________
>>>>> R-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>>
>>>>>             
>>>
>>>     
>>
>>
>>   
>
>
>

-- 
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr