[Rd] ALTREP: Bug reports

Gabriel Becker g@bembecker @end|ng |rom gm@||@com
Thu May 16 21:47:21 CEST 2019


Hi Jiefei,

Thanks for trying out the ALTREP stuff and letting us know how it is going.
That said, I don't think either of these is a bug, per se, but rather a
misunderstanding of the API. Details inline.



On Thu, May 16, 2019 at 11:57 AM 介非王 <szwjf08 using gmail.com> wrote:

> Hello,
>
> I have encountered two bugs when using ALTREP APIs.
>
> 1. STDVEC_DATAPTR
>
> From the Rinternals.h file it has a comment:
>
> /* ALTREP support */
> void *(STDVEC_DATAPTR)(SEXP x);
>
>
> However, this comment might not be true, the easiest way to verify it is to
> define a C++ function:
>
> void C_testFunc(SEXP a)
> {
>     STDVEC_DATAPTR(a);
> }
>
>
> and call it in R via
>
> > a=1:10
> > C_testFunc(a)
> Error in C_testFunc(a) : cannot get STDVEC_DATAPTR from ALTREP object
>
>
The STDVEC here refers to the SEXP not being an ALTREP. Anything that
starts with STDVEC should never receive an ALTREP, i.e., it should only be
called after non-ALTREPness has been confirmed by the surrounding/preceding
code. So this is expected behavior.
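
To make the distinction concrete, here is a toy model in plain C. It is a
sketch under stated assumptions, not R's implementation: toy_vec,
toy_seq_dataptr, toy_stdvec_dataptr and toy_dataptr are all made-up names,
and where the real STDVEC_DATAPTR signals an R error the toy version simply
returns NULL.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 16

/* Toy stand-in for an R vector; this is NOT R's real SEXP layout. */
typedef struct toy_vec toy_vec;
struct toy_vec {
    int   is_altrep;                    /* nonzero: data comes from a method */
    int   length;
    int  *inline_data;                  /* payload of a standard vector, else NULL */
    int *(*dataptr_method)(toy_vec *);  /* set only for "ALTREP" objects */
    int   scratch[TOY_MAX];             /* where the toy sequence expands itself */
};

/* Method of a toy compact sequence 1:length: expand on demand. */
int *toy_seq_dataptr(toy_vec *x)
{
    assert(x->length <= TOY_MAX);
    for (int i = 0; i < x->length; i++)
        x->scratch[i] = i + 1;
    return x->scratch;
}

/* Standard-vector accessor: only meaningful once the caller already knows
   that x is not "ALTREP". Returns NULL otherwise (real R raises an error). */
int *toy_stdvec_dataptr(toy_vec *x)
{
    return x->is_altrep ? NULL : x->inline_data;
}

/* General accessor: checks first, then takes the right path. */
int *toy_dataptr(toy_vec *x)
{
    return x->is_altrep ? x->dataptr_method(x) : toy_stdvec_dataptr(x);
}
```

In package code the analogous pattern is to test ALTREP(x) before touching
any STDVEC_* accessor, or simply to go through the general accessors (for
example INTEGER_ELT() or DATAPTR_RO()), which handle both kinds of object.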




>
> We can inspect the internal type and call ALTREP function to check if it
> is an ALTREP:
>
> > .Internal(inspect(a))
> @0x000000001b5a3310 13 INTSXP g0c0 [NAM(7)]  1 : 10 (compact)
> > #This is a wrapper of ALTREP
> > is.altrep(a)
> [1] TRUE
>
>
> I've also defined an ALTREP type and it did not work either. I guess this
> might be a bug? Or did I miss something?
>
> 2. Wrapper objects in ALTREP
>
> If the duplicate function is defined to return the object itself:
>
> SEXP vector_dulplicate(SEXP x, Rboolean deep) {
> return(x);
> }
>

So this is a violation of the contract. <youraltrep>_duplicate *must* do
an actual duplication. Returning the object unduplicated when duplicate is
called is going to have all sorts of unintended negative consequences. R's
internals rely on the fact that a SEXP that has been passed to duplicate()
has been duplicated and is safe to modify in place.
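
A toy model in plain C shows why. This is only a sketch: toy_shared,
toy_duplicate, toy_duplicate_broken, toy_assign_first and toy_free are
hypothetical names that stand in for an ALTREP object, its Duplicate
method, and the in-place assignment R performs after duplicating. None of
it is R's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a vector object that owns (or points at) its data. */
typedef struct toy_shared {
    size_t  n;
    double *data;
} toy_shared;

/* Broken "duplicate": hands back the same object, like the quoted method. */
toy_shared *toy_duplicate_broken(toy_shared *x)
{
    return x;
}

/* Conforming duplicate: fresh object, fresh storage, copied contents.
   Returns NULL if allocation fails. */
toy_shared *toy_duplicate(const toy_shared *x)
{
    assert(x != NULL);
    toy_shared *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    copy->n = x->n;
    copy->data = NULL;
    if (x->n > 0) {
        copy->data = malloc(x->n * sizeof *copy->data);
        if (copy->data == NULL) {
            free(copy);
            return NULL;
        }
        memcpy(copy->data, x->data, x->n * sizeof *copy->data);
    }
    return copy;
}

/* What the caller does next: it assumes the duplicate is private and
   writes into it in place. */
void toy_assign_first(toy_shared *x, double value)
{
    if (x->n > 0)
        x->data[0] = value;
}

/* Release an object obtained from toy_duplicate(). */
void toy_free(toy_shared *x)
{
    if (x != NULL) {
        free(x->data);
        free(x);
    }
}
```

With toy_duplicate the original keeps its first element after
toy_assign_first is applied to the copy. With toy_duplicate_broken the
"copy" is an alias, so the same call silently changes the original, which
is exactly the assumption the caller was relying on not to be broken.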



> In R an ALTREP object will behave like an environment (pass-by-reference).
> However, if we do something like (pseudocode):
>
> n=100
> x=runif(n)
> alt1=createAltrep(x)
> alt2=alt1
> alt2[1]=10
> .Internal(inspect(alt1))
> .Internal(inspect(alt2))
>
>
> The result would be:
>
> > .Internal(inspect(alt1))
> @0x00000000156f4d18 14 REALSXP g0c0 [NAM(7)]
> > .Internal(inspect(alt2))
> @0x00000000156a33e0 14 REALSXP g0c0 [NAM(7)]  wrapper [srt=-2147483648,no_na=0]
>   @0x00000000156f4d18 14 REALSXP g0c0 [NAM(7)]
>
>
> It seems like the object alt2 automatically gets wrapped by R. Although at
> the R level it seems fine because there are no differences between alt1 and
> alt2, if we define a C function as:
>

So I'm not sure what is happening here, because it depends on what your
createAltrep function does. R automatically creates wrappers in some cases,
but far from all of them (currently not even very many).

>
> SEXP C_peekSharedMemory(SEXP x) {
>     return(R_altrep_data1(x));
> }
>
>
> and call it in R to get the internal data structure of an ALTREP object.
>
> C_peekSharedMemory(alt1)
> C_peekSharedMemory(alt2)
>
>
> The first one correctly returns its internal data structure, but the second
> one returns the ALTREP object it wraps since the wrapper itself is an
> ALTREP. This behavior is unexpected.


I disagree. R_altrep_data1 returns whatever THAT altrep SEXP stores in its
"data1" part. There is no recursion/descent going on, and there shouldn't
be.
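
As a toy illustration in plain C (a sketch only: toy_altrep,
toy_altrep_data1 and toy_peek_payload are made-up names, and the class is
modelled as a plain string rather than a real ALTREP class object), the
"data1" slot belongs to whichever object you ask, and a careful accessor
checks the class of that object before interpreting the slot:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an ALTREP object: a class tag plus one data slot. */
typedef struct toy_altrep {
    const char *class_name;  /* which class this particular object has */
    void       *data1;       /* class-specific; a wrapper stores the wrapped object */
} toy_altrep;

/* Returns what THIS object stores in its slot. No descent into wrappers. */
void *toy_altrep_data1(const toy_altrep *x)
{
    assert(x != NULL);
    return x->data1;
}

/* Returns the payload only if x itself has the expected class, so a
   wrapper's slot is never mistaken for our own payload. NULL otherwise. */
void *toy_peek_payload(const toy_altrep *x, const char *expected_class)
{
    if (strcmp(x->class_name, expected_class) != 0)
        return NULL;
    return toy_altrep_data1(x);
}
```

If alt2 is a "wrapper" object whose slot holds alt1, then asking alt2 for
its slot yields alt1, not alt1's payload, while toy_peek_payload refuses
to interpret it. In real code the class check would presumably be done
with R_altrep_inherits() from R_ext/Altrep.h.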


> Since the duplicate function returns
> the object itself, I would expect alt1 and alt2 to be the same object.
>

Again, this is a violation of the core assumptions of ALTREP that is not
allowed, so I'd argue that any behavior this causes is largely irrelevant
(and a small part of the much larger set of problems that not duplicating
when R told you to duplicate will cause).







> Even if they are essentially not the same, calling the same function should
> at least return the same result. Other than that, It seems like R does not
> always wrap an ALTREP object. If we change n from 100 to 10 and check the
> internal again, alt2 will not get wrapped.


Right, so this is a misunderstanding (which may be the fault of sparse
documentation on our part): wrapper is one particular ALTREP class; it's
not a fundamental aspect of ALTREPs themselves. Most ALTREP objects do not
have wrappers. See, e.g.,

> .Internal(inspect(1:4))

@7fb727d6be50 13 INTSXP g0c0 [NAM(3)]  1 : 4 (compact)


That's an ALTREP with no wrapper (a compact sequence). The wrapper ALTREP
class is for attaching metadata (known sortedness, known lack of NAs) to R
vectors. Its primary use currently is on the return value of sort().


> This makes the problem even more
> difficult since we cannot predict when would the wrapper appear.
>

As currently factored, it's not intended that you would be able to, or need
to, predict when a wrapper would appear. The C API and R functions
transparently treat wrapped and non-wrapped objects the same, and any code
you write should go through these API entry points so that it does the
same.

Does that help?

Best,
~G

>
> Here is the source code for the wrapper:
> https://github.com/wch/r-source/blob/trunk/src/main/altclasses.c#L1399
>
> Here is a working example if one can build the sharedObject package from
> https://github.com/Jiefei-Wang/sharedObject
>
> n=100
> x=runif(n)
> so1=sharedObject(x,copyOnWrite = FALSE)
> so2=so1
> so2[1]=10
> .Internal(inspect(so1))
> .Internal(inspect(so2))
>
>
> Here is my session info:
>
> R version 3.6.0 alpha (2019-04-08 r76348)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows >= 8 x64 (build 9200)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] sharedObject_0.0.99
>
> loaded via a namespace (and not attached):
> [1] compiler_3.6.0 tools_3.6.0    Rcpp_1.0.1
>
>
> Best,
> Jiefei
>