[Rd] R strings, null-terminated or size delimited?
Duncan Murdoch
murdoch at stats.uwo.ca
Sun Nov 22 01:25:00 CET 2009
On 21/11/2009 6:31 PM, Guillaume Yziquel wrote:
> Simon Urbanek wrote:
>> On Nov 21, 2009, at 4:12 PM, Guillaume Yziquel wrote:
>>
>>> Hello.
>>>
>>> I've been looking at vecsexps for my binding.
>>>
>>> Concerning strings, I'm wondering: are they supposed to be
>>> null-delimited?
>> Yes, they are null-delimited when you create/access them.
>
> OK. Fair enough. But is it guaranteed that null-delimitation ends where the
> vecsxp field of the * VECSEXP tells where the R vector should end? Let
> me rephrase that:
>
> -1- Should I consider it a bug if the two pieces of information differ?
>
> -2- What's the "safest" way out of the two?
>
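[Editorial sketch] The question above comes down to whether a stored length and a NUL scan can disagree. The following is a toy model only: toy_string and its helpers are hypothetical names and do not reproduce R's actual memory layout. With a real SEXP, the two quantities to compare would be LENGTH(x) and strlen(CHAR(x)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a length-prefixed, NUL-terminated string record.
   Hypothetical type; NOT R's real CHARSXP layout. */
typedef struct {
    size_t length;  /* payload bytes, excluding the trailing NUL */
    char  *data;    /* length + 1 bytes; data[length] is always '\0' */
} toy_string;

/* Copy n bytes from src and append a terminating NUL. */
static toy_string toy_string_make(const char *src, size_t n)
{
    toy_string s;
    s.length = n;
    s.data = malloc(n + 1);
    assert(s.data != NULL);
    memcpy(s.data, src, n);
    s.data[n] = '\0';
    return s;
}

/* Returns 1 when the stored length and the NUL-scan length agree. */
static int toy_string_consistent(const toy_string *s)
{
    return strlen(s->data) == s->length;
}

/* Release the buffer and reset the record. */
static void toy_string_free(toy_string *s)
{
    free(s->data);
    s->data = NULL;
    s->length = 0;
}
```

In this model the two lengths differ only when the payload contains an embedded NUL byte, so a binding that copies by stored length keeps every byte, while one that scans for NUL stops early.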
>>> Are they delimited by the info in the SEXPHEADER macro in Rinternals.h?
>>
>> You should not be touching or reading that.
>
> I believe I should. I'd like the OCaml / R binding to be closely knit to
> R internals. One reason would be for speed, the other being that I'd
> like to make use of camlp4 to write syntax extensions to mix OCaml and R
> syntax. It's therefore important for me not to rely on the R interpreter
> to be active when building R values. Or when marshaling R values via
> OCaml. There are numerous other issues aside from this one.
You are probably not going to be able to do that. Take your example of
the promise below: to evaluate a promise, you need to evaluate the
expression attached to it in the R interpreter. (This is discussed in
the R Language Definition.)
You can probably put together simple R objects like integer arrays
without having R running, but anything substantial isn't going to be
feasible.
Duncan Murdoch
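[Editorial sketch] The point about promises can be illustrated with a toy model: a promise stores an unevaluated expression and an environment, and getting its value requires an evaluator. All names here (toy_promise, toy_force, toy_eval_fn, toy_add_eval) are hypothetical, and the callback merely stands in for the R interpreter; this is not R's PROMSXP implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical evaluator callback; stands in for the R interpreter. */
typedef int (*toy_eval_fn)(const void *expr, const void *env);

/* Toy promise: an unevaluated expression plus a cached value. */
typedef struct {
    int         forced;  /* 0 until the value has been computed */
    int         value;   /* cached result, valid only when forced != 0 */
    const void *expr;    /* unevaluated expression */
    const void *env;     /* environment to evaluate it in */
} toy_promise;

/* Force the promise. Returns 0 on success, or -1 when the promise is
   still unevaluated and no evaluator is available. */
static int toy_force(toy_promise *p, toy_eval_fn eval, int *out)
{
    assert(p != NULL && out != NULL);
    if (!p->forced) {
        if (eval == NULL)
            return -1;            /* nothing can run the expression */
        p->value = eval(p->expr, p->env);
        p->forced = 1;
    }
    *out = p->value;
    return 0;
}

/* Example evaluator: reads expr and env as ints and adds them. */
static int toy_add_eval(const void *expr, const void *env)
{
    return *(const int *)expr + *(const int *)env;
}
```

Without an evaluator the first call to toy_force fails; once the promise has been forced, the cached value can be read back with no evaluator at all.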
>
> I'm already using #define USE_RINTERNALS in my .c file to inspect R values.
>
>>> Basically, what are the macros or functions to access the values of
>>> the vecsexps?
>> VECTOR_ELT and SET_VECTOR_ELT (assuming that you're referring to VECSXP
>> which are generic vectors).
>
> No. I'm referring to INTSXP for now. But I see what you mean:
>
>> #define INTEGER(x) ((int *) DATAPTR(x))
>> #define VECTOR_ELT(x,i) ((SEXP *) DATAPTR(x))[i]
>
> VECTOR_ELT is not suitable for INTSXP arrays. I need to convert an
> INTSXP array to an OCaml list / array.
>
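[Editorial sketch] The INTEGER macro quoted above casts a data pointer that sits just past the object header. A toy version of that header-plus-payload layout, with a helper that copies the integers into a caller-owned buffer (the step a binding needs before building a foreign array), might look like this. The TOY_* macros, toy_header, toy_alloc_int and toy_copy_ints are hypothetical names, and the header fields are simplified assumptions rather than R's real definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy header; the union member forces payload alignment.
   Simplified assumption, not R's real SEXPREC layout. */
typedef union {
    struct { int type; int length; } h;
    double align;
} toy_header;

#define TOY_INT 1  /* arbitrary type tag used only in this sketch */

/* The payload starts immediately after the header. */
#define TOY_DATAPTR(x) ((void *)((toy_header *)(x) + 1))
#define TOY_INTEGER(x) ((int *)TOY_DATAPTR(x))
#define TOY_LENGTH(x)  (((toy_header *)(x))->h.length)

/* Allocate a header followed by room for n ints. */
static toy_header *toy_alloc_int(int n)
{
    assert(n >= 0);
    toy_header *x = malloc(sizeof(toy_header) + (size_t)n * sizeof(int));
    assert(x != NULL);
    x->h.type = TOY_INT;
    x->h.length = n;
    return x;
}

/* Copy the payload into a fresh buffer owned by the caller. */
static int *toy_copy_ints(const toy_header *x)
{
    int n = x->h.length;
    int *out = malloc((n > 0 ? (size_t)n : 1) * sizeof(int));
    assert(out != NULL);
    memcpy(out, (const int *)(x + 1), (size_t)n * sizeof(int));
    return out;
}
```

The copy is bounded by the length stored in the header, not by scanning the data, which is the same design choice a binding would make when converting an integer vector element by element.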
>>> I'm thinking of CHARSXPs and INTSXPs for the moment...
>> Those are entirely different - CHARSXP are not vectors but strings (see
>> mkChar et al., CHAR, ...) and INTSXP are integer arrays (in C speak)
>> accessed using INTEGER.
>
> OK. They're not vectors. They're VECTOR_SEXPRECs.
>
>> Please read R-exts - it's better than guessing.
>
> Funny, I have R-exts.pdf and R-ints.pdf opened. They're fine when it
> comes to writing R extensions. Not when writing bindings embedding R
> into OCaml so that you can beta-reduce isomorphically in R and OCaml.
>
>> Cheers,
>> Simon
>
> I'm already using heretic features in OCaml (namely Obj.magic) in order
> to do this binding. I do not mind using heretic features of the R API.
>
> I do not mean to be a pain, but I have to do what needs to be done. If I
> find on my way that #define USE_RINTERNALS is overkill, I'll gladly drop it.
>
> For instance, here's one of my issues: I've extracted the R SEXP for the
> "str" symbol. It's a promise. Now, how do I map such a SEXP to an OCaml
> function? Haven't found that in R-ints.pdf or R-exts.pdf. There's talk
> about functions, but promises are somewhat overlooked. However, such a
> mapping is crucial to me.
>
> I was not guessing when I was trying to look at the internal structure
> of R data. Simply trying to get a grip on how to execute promises, and
> therefore examining such a promise:
>
>> # R.Internal.Pretty.t_of_sexp (R.Raw.sexp_of_t (R.symbol "str"));;
>> - : R.Internal.Pretty.t =
>> PROMISE
>> {value = SYMBOL None;
>> expr =
>> CALL (SYMBOL (Some ("lazyLoadDBfetch", BUILTIN)),
>> [INT [105; 153119]; Unknown; Unknown; Unknown]);
>> env = Unknown}
>
> Or, following structures in Rinternals.h:
>
>> # R.Internal.C.t_of_sexp (R.Raw.sexp_of_t (R.symbol "str"));;
>> - : R.Internal.C.t =
>> Val
>> {content =
>> PROMSXP
>> {prom_value =
>> Val
>> {content =
>> SYMSXP
>> {pname = Val {content = NILSXP};
>> sym_value = R.Internal.C.Recursive <lazy>;
>> internal = Val {content = NILSXP}}};
>> R.Internal.C.expr =
>> Val
>> {content =
>> LANGSXP
>> {carval =
>> Val
>> {content =
>> SYMSXP
>> {pname = Val {content = CHARSXP "lazyLoadDBfetch"};
>> sym_value = Val {content = BUILTINSXP 687};
>> internal = Val {content = NILSXP}}};
>> cdrval =
>> Val
>> {content =
>> LISTSXP
>> {carval = Val {content = INTSXP [105; 153119]};
>> cdrval =
>> Val
>> {content =
>> LISTSXP
>> {carval =
>> Val
>> {content =
>> SYMSXP
>> {pname = Val {content = CHARSXP "datafile"};
>> sym_value =
>> Val
>> {content =
>> SYMSXP
>> {pname = Val {content = NILSXP};
>> sym_value = R.Internal.C.Recursive <lazy>;
>> internal = Val {content = NILSXP}}};
>> internal = Val {content = NILSXP}}};
>> cdrval =
>> Val
>> {content =
>> LISTSXP
>> {carval =
>> Val
>> {content =
>> SYMSXP
>> {pname =
>> Val {content = CHARSXP "compressed"};
>> sym_value =
>> Val
>> {content =
>> SYMSXP
>> {pname = Val {content = NILSXP};
>> sym_value =
>> R.Internal.C.Recursive <lazy>;
>> internal = Val {content = NILSXP}}};
>> internal = Val {content = NILSXP}}};
>> cdrval =
>> Val
>> {content =
>> LISTSXP
>> {carval =
>> Val
>> {content =
>> SYMSXP
>> {pname =
>> Val {content = CHARSXP "envhook"};
>> sym_value =
>> Val
>> {content =
>> SYMSXP
>> {pname = Val {content = NILSXP};
>> sym_value =
>> R.Internal.C.Recursive <lazy>;
>> internal =
>> Val {content = NILSXP}}};
>> internal = Val {content = NILSXP}}};
>> cdrval = Val {content = NILSXP};
>> tagval = Val {content = NILSXP}}};
>> tagval = Val {content = NILSXP}}};
>> tagval = Val {content = NILSXP}}};
>> tagval = Val {content = NILSXP}}};
>> tagval = Val {content = NILSXP}}};
>> R.Internal.C.env = Val {content = ENVSXP}}}
>> #
>
> For instance, an issue I'd like advice on is: what does such a symbol mean?
>
>> SYMSXP
>> {pname = Val {content = CHARSXP "datafile"};
>> sym_value =
>> Val
>> {content =
>> SYMSXP
>> {pname = Val {content = NILSXP};
>> sym_value = R.Internal.C.Recursive <lazy>;
>> internal = Val {content = NILSXP}}};
>> internal = Val {content = NILSXP}}};
>
> And how is it treated when "str" is executed?
>
> All the best.
>