[Rd] some questions about R internal SEXP types

Tue Sep 8 13:47:11 CEST 2020

On 9/8/20 1:13 PM, Dan Kortschak wrote:
> On Tue, 2020-09-08 at 12:08 +0200, Tomas Kalibera wrote:
>> I am not sure if I understand correctly, but if you were accessing
>> directly the memory of SEXPs from Go implementation instead of calling
>> through exported access functions documented in WRE, that would be a
>> really bad idea. Of course fine for research and experimentation, but
>> the internal structure can and does change at any time, otherwise we
>> would not be able to develop nor maintain R. Such direct access
>> bypassing WRE would likely be a clear case for rejection in CRAN for
>> this interface and any packages using it, and I hope in other package
>> repositories as well.
> Sorry, I'm coming from a language that has strong backwards
> compatibility guarantees and (generally) machine level data types, so
> it is surprising to me that basic data types are that fluid.

Since R does not allow such direct access, it can change the object
header without breaking compatibility.

In a managed language, it is certainly not typical to let native code
extensions access object headers directly, for reasons of safety,
optimization, synchronization, etc. In R, a recent optimization that
would not have been possible otherwise is the ALTREP framework.
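
To illustrate why going through accessors leaves room for this kind of
optimization, here is a minimal sketch in Go. It is not R's implementation;
the IntVector interface and both types are hypothetical, with Len and Elt
merely standing in for the role that exported functions such as LENGTH and
INTEGER_ELT play in the C API. Client code written against the accessors
keeps working when the representation behind them changes.

```go
package main

import "fmt"

// IntVector is a hypothetical accessor interface (an assumption made for
// this sketch), playing the role of exported access functions.
type IntVector interface {
	Len() int
	Elt(i int) int
}

// denseInts stores every element explicitly.
type denseInts struct{ data []int }

func (d denseInts) Len() int      { return len(d.data) }
func (d denseInts) Elt(i int) int { return d.data[i] }

// compactSeq represents from, from+1, ..., from+n-1 without materializing
// the elements, loosely in the spirit of a compact sequence.
type compactSeq struct{ from, n int }

func (c compactSeq) Len() int      { return c.n }
func (c compactSeq) Elt(i int) int { return c.from + i }

// Sum only uses the accessors, so it works unchanged for both layouts.
func Sum(v IntVector) int {
	total := 0
	for i := 0; i < v.Len(); i++ {
		total += v.Elt(i)
	}
	return total
}

func main() {
	fmt.Println(Sum(denseInts{data: []int{1, 2, 3, 4}})) // prints 10
	fmt.Println(Sum(compactSeq{from: 1, n: 4}))          // prints 10
}
```

Code that instead read the backing array of denseInts directly would break
as soon as a value arrived in the compact form.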

Please don't use this list for advertising other languages; there may
be other lists for that.

>> However, I believe the overhead of calling the C-level access functions
>> R exports should be minimal compared to other overheads. You can't hope,
>> anyway, for being able to efficiently call tiny functions frequently
>> between Go and R. This can only work for bigger functions, anyway, and
>> then the Go-C overhead should not be important.
> This really depends on the complexity/structure of the data structures
> that are being handed in to Go. The entirety of the tool is there to
> allow interchange of data between Go and R, in the case of atomic
> vectors, this cost is very cheap with direct access or via Cgo calling,
> however each name access or attribute access (both of which are
> necessary for struct population - and structs may come in slices) is a
> Cgo call; these look ups go from ~nanosecond to ~hundred nanoseconds
> per lookup.

Probably most data in R would be in vectors (as part of data frames), 
anyway. In some cases you may be able to cache the calls (some R objects 
are immutable, see WRE 5.9.10).
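
A rough sketch of what such caching could look like on the Go side, under
the assumption that the names being looked up belong to an object that
cannot change while the cache is alive. The nameCache type and the lookup
function are hypothetical; lookup stands in for whatever expensive Cgo call
resolves a name to a position.

```go
package main

import "fmt"

// nameCache memoizes name -> index lookups for one names vector that is
// assumed to be immutable for the lifetime of the cache.
type nameCache struct {
	lookup func(name string) int // hypothetical stand-in for a costly Cgo call
	index  map[string]int
}

func newNameCache(lookup func(string) int) *nameCache {
	return &nameCache{lookup: lookup, index: make(map[string]int)}
}

// Index returns the cached position, calling lookup only on the first miss.
func (c *nameCache) Index(name string) int {
	if i, ok := c.index[name]; ok {
		return i
	}
	i := c.lookup(name)
	c.index[name] = i
	return i
}

func main() {
	names := []string{"id", "score"}
	calls := 0
	// slow simulates the expensive lookup and counts how often it runs.
	slow := func(name string) int {
		calls++
		for i, n := range names {
			if n == name {
				return i
			}
		}
		return -1
	}

	cache := newNameCache(slow)
	for row := 0; row < 1000; row++ {
		_ = cache.Index("id")
		_ = cache.Index("score")
	}
	fmt.Println(calls) // prints 2: one real lookup per distinct name
}
```

When populating a slice of structs, this would pay the lookup cost once per
field name rather than once per field per element.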

Tomas

>
>>> Note that there is a lot in WRE that's beyond what I want rgo to be
>>> able to do (calling in to R from Go for example). In fact, there's just
>>> a lot in WRE (it's almost 3 times the length of the Go language spec
>>> and memory model reference combined). The issues around weak references
>>> and external pointers are not something that I want to deal with;
>>> working with that kind of object is not idiomatic for Go (in fact
>>> without using C.malloc, R external pointers from Go would be forbidden
>>> by the Go runtime) and I would not expect that they are likely to be
>>> used by people writing extensions for R in Go.
>> Sure, I think it is perfectly fine to cover only a subset, if that is
>> already useful to write some extensions in Go. Maintenance would be
>> easiest if Go programs didn't call back into the R runtime at all, so
>> fewer calls the better for maintenance.
> This is apparently unavoidable though from what I read here.
>
>> Best
>> Tomas
>
> thanks
> Dan