[Rd] R strings, null-terminated or size delimited?

Duncan Murdoch murdoch at stats.uwo.ca
Sun Nov 22 01:25:00 CET 2009


On 21/11/2009 6:31 PM, Guillaume Yziquel wrote:
> Simon Urbanek wrote:
>> On Nov 21, 2009, at 4:12 PM, Guillaume Yziquel wrote:
>>
>>> Hello.
>>>
>>> I've been looking at vecsexps for my binding.
>>>
>>> Concerning strings, I'm wondering: are they supposed to be 
>>> null-delimited?
>> Yes, they are null-delimited when you create/access them.
> 
> OK. Fair enough. But is it guaranteed that the null-delimitation ends where 
> the vecsxp field of the VECSEXP says the R vector should end? Let 
> me rephrase that:
> 
> -1- Should I consider it a bug if the two pieces of information differ?
> 
> -2- What's the "safest" way out of the two?
> 
>>> Are they delimited by the info in the SEXPHEADER macro in Rinternals.h?
>>
>> You should not be touching or reading that.
> 
> I believe I should. I'd like the OCaml / R binding to be closely knit to 
> R internals. One reason would be for speed, the other being that I'd 
> like to make use of camlp4 to write syntax extensions to mix OCaml and R 
> syntax. It's therefore important for me not to rely on the R interpreter 
> to be active when building R values. Or when marshaling R values via 
> OCaml. There are numerous other issues aside from this one.

You are probably not going to be able to do that.  Take your example of 
the promise below:  to evaluate a promise, you need to evaluate the 
expression attached to it in the R interpreter.  (This is discussed in 
the R Language Definition.)
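
As a rough sketch of why forcing a promise needs an evaluator, here is a toy 
model in plain C. Everything in it (toy_env, toy_promise, toy_force, 
toy_expr_x_plus_one) is hypothetical and made up for illustration; it does not 
match R's actual structures or API. A function pointer stands in for "the 
expression plus the interpreter that can evaluate it".

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical types, NOT R's SEXPREC layout. */
typedef struct toy_env { int x; } toy_env;

/* Stand-in for the interpreter: evaluates an expression in an environment. */
typedef int (*toy_eval_fn)(const toy_env *env);

typedef struct toy_promise {
    toy_eval_fn expr;    /* the unevaluated expression */
    const toy_env *env;  /* environment in which to evaluate it */
    int value;           /* cached result, meaningful only once forced */
    int forced;          /* nonzero after the first evaluation */
    int eval_count;      /* how many times the evaluator actually ran */
} toy_promise;

/* Force the promise: run the evaluator once, then reuse the cached value. */
int toy_force(toy_promise *p) {
    assert(p != NULL && p->expr != NULL);
    if (!p->forced) {
        p->value = p->expr(p->env);  /* this step needs a working evaluator */
        p->forced = 1;
        p->eval_count++;
    }
    return p->value;
}

/* A sample "expression": x + 1, looked up in the given environment. */
int toy_expr_x_plus_one(const toy_env *env) {
    return env->x + 1;
}
```

In this model, a promise built from toy_expr_x_plus_one and an environment can 
be forced repeatedly, but the evaluator runs only on the first call; later 
changes to the environment do not affect the cached value. Without something 
playing the role of expr's evaluator, the value slot simply stays empty.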

You can probably put together simple R objects like integer arrays 
without having R running, but anything substantial isn't going to be 
feasible.
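
To illustrate what "simple" means here, the following is a minimal sketch of a 
length-prefixed integer vector in plain C. The names (toy_intvec, 
toy_intvec_new, toy_intvec_copy_out) and the layout are assumptions for 
illustration only; R's real vector header carries more than a length, and 
memory allocated this way would not be known to R's garbage collector.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: a hypothetical header-plus-payload layout,
 * NOT R's VECTOR_SEXPREC. */
typedef struct toy_intvec {
    size_t length;  /* number of elements, stored in the header */
    int data[];     /* payload follows the header (C99 flexible array member) */
} toy_intvec;

/* Allocate a vector and copy n ints into it; returns NULL if malloc fails. */
toy_intvec *toy_intvec_new(const int *src, size_t n) {
    toy_intvec *v = malloc(sizeof *v + n * sizeof v->data[0]);
    if (v == NULL) return NULL;
    v->length = n;
    if (n > 0) memcpy(v->data, src, n * sizeof v->data[0]);
    return v;
}

/* Copy the payload into a caller-supplied buffer holding up to cap ints;
 * returns the number of elements actually copied. */
size_t toy_intvec_copy_out(const toy_intvec *v, int *dst, size_t cap) {
    assert(v != NULL);
    size_t n = v->length < cap ? v->length : cap;
    if (n > 0) memcpy(dst, v->data, n * sizeof dst[0]);
    return n;
}
```

Building and reading such an object needs nothing but an allocator and the 
stored length, which is why plain data vectors are the easy case; the caller 
frees the vector with free() when done.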

Duncan Murdoch

> 
> I'm already using #define USE_RINTERNALS in my .c file to inspect R values.
> 
>>> Basically, what are the macros or functions to access the values of 
>>> the vecsexps?
>> VECTOR_ELT and SET_VECTOR_ELT (assuming that you're referring to VECSXP, 
>> which are generic vectors).
> 
> No. I'm referring to INTSXP for now. But I see what you mean:
> 
>> #define INTEGER(x)      ((int *) DATAPTR(x))
>> #define VECTOR_ELT(x,i) ((SEXP *) DATAPTR(x))[i]
> 
> VECTOR_ELT is not suitable for INTSXP arrays. I need to convert the 
> INTSXP array to an OCaml list / array.
> 
>>> I'm thinking of CHARSXPs and INTSXPs for the moment...
>> Those are entirely different - CHARSXP are not vectors but strings (see 
>> mkChar et al., CHAR, ...) and INTSXP are integer arrays (in C speak) 
>> accessed using INTEGER.
> 
> OK. They're not vectors. They're VECTOR_SEXPRECs.
> 
>> Please read R-exts - it's better than guessing.
> 
> Funny, I have R-exts.pdf and R-ints.pdf open. They're fine when it 
> comes to writing R extensions. Not when writing bindings embedding R 
> into OCaml so that you can beta-reduce isomorphically in R and OCaml.
> 
>> Cheers,
>> Simon
> 
> I'm already using heretic features in OCaml (namely Obj.magic) in order 
> to do this binding. I do not mind using heretic features of the R API.
> 
> I do not mean to be a pain, but I have to do what needs to be done. If I 
> find on my way that #define USE_RINTERNALS is overkill, I'll gladly drop it.
> 
> For instance, here's one of my issues: I've extracted the R SEXP for the 
> "str" symbol. It's a promise. Now, how do I map such a SEXP to an OCaml 
> function? Haven't found that in R-ints.pdf or R-exts.pdf. There's talk 
> about functions, but promises are somewhat overlooked. However, such a 
> mapping is crucial to me.
> 
> I was not guessing when I was trying to look at the internal structure 
> of R data. Simply trying to get a grip on how to execute promises, and 
> therefore examining such a promise:
> 
>> # R.Internal.Pretty.t_of_sexp (R.Raw.sexp_of_t (R.symbol "str"));;
>> - : R.Internal.Pretty.t =
>> PROMISE
>>  {value = SYMBOL None;
>>   expr =
>>    CALL (SYMBOL (Some ("lazyLoadDBfetch", BUILTIN)),
>>     [INT [105; 153119]; Unknown; Unknown; Unknown]);
>>   env = Unknown}
> 
> Or, following structures in Rinternals.h:
> 
>> # R.Internal.C.t_of_sexp (R.Raw.sexp_of_t (R.symbol "str"));;
>> - : R.Internal.C.t =
>> Val
>>  {content =
>>    PROMSXP
>>     {prom_value =
>>       Val
>>        {content =
>>          SYMSXP
>>           {pname = Val {content = NILSXP};
>>            sym_value = R.Internal.C.Recursive <lazy>;
>>            internal = Val {content = NILSXP}}};
>>      R.Internal.C.expr =
>>       Val
>>        {content =
>>          LANGSXP
>>           {carval =
>>             Val
>>              {content =
>>                SYMSXP
>>                 {pname = Val {content = CHARSXP "lazyLoadDBfetch"};
>>                  sym_value = Val {content = BUILTINSXP 687};
>>                  internal = Val {content = NILSXP}}};
>>            cdrval =
>>             Val
>>              {content =
>>                LISTSXP
>>                 {carval = Val {content = INTSXP [105; 153119]};
>>                  cdrval =
>>                   Val
>>                    {content =
>>                      LISTSXP
>>                       {carval =
>>                         Val
>>                          {content =
>>                            SYMSXP
>>                             {pname = Val {content = CHARSXP "datafile"};
>>                              sym_value =
>>                               Val
>>                                {content =
>>                                  SYMSXP
>>                                   {pname = Val {content = NILSXP};
>>                                    sym_value = R.Internal.C.Recursive <lazy>;
>>                                    internal = Val {content = NILSXP}}};
>>                              internal = Val {content = NILSXP}}};
>>                        cdrval =
>>                         Val
>>                          {content =
>>                            LISTSXP
>>                             {carval =
>>                               Val
>>                                {content =
>>                                  SYMSXP
>>                                   {pname =
>>                                     Val {content = CHARSXP "compressed"};
>>                                    sym_value =
>>                                     Val
>>                                      {content =
>>                                        SYMSXP
>>                                         {pname = Val {content = NILSXP};
>>                                          sym_value =
>>                                           R.Internal.C.Recursive <lazy>;
>>                                          internal = Val {content = NILSXP}}};
>>                                    internal = Val {content = NILSXP}}};
>>                              cdrval =
>>                               Val
>>                                {content =
>>                                  LISTSXP
>>                                   {carval =
>>                                     Val
>>                                      {content =
>>                                        SYMSXP
>>                                         {pname =
>>                                           Val {content = CHARSXP "envhook"};
>>                                          sym_value =
>>                                           Val
>>                                            {content =
>>                                              SYMSXP
>>                                               {pname = Val {content = NILSXP};
>>                                                sym_value =
>>                                                 R.Internal.C.Recursive <lazy>;
>>                                                internal =
>>                                                 Val {content = NILSXP}}};
>>                                          internal = Val {content = NILSXP}}};
>>                                    cdrval = Val {content = NILSXP};
>>                                    tagval = Val {content = NILSXP}}};
>>                              tagval = Val {content = NILSXP}}};
>>                        tagval = Val {content = NILSXP}}};
>>                  tagval = Val {content = NILSXP}}};
>>            tagval = Val {content = NILSXP}}};
>>      R.Internal.C.env = Val {content = ENVSXP}}}
>> # 
> 
> For instance, an issue I'd like advice on is: what does such a symbol mean?
> 
>>                            SYMSXP
>>                             {pname = Val {content = CHARSXP "datafile"};
>>                              sym_value =
>>                               Val
>>                                {content =
>>                                  SYMSXP
>>                                   {pname = Val {content = NILSXP};
>>                                    sym_value = R.Internal.C.Recursive <lazy>;
>>                                    internal = Val {content = NILSXP}}};
>>                              internal = Val {content = NILSXP}}};
> 
> And how is it treated when "str" is executed?
> 
> All the best.
>


