[Rd] ALTREP: Bug reports

介非王 szwjf08 using gmail.com
Thu May 16 23:35:16 CEST 2019


Hi,

Sorry for flooding the mailbox. Please ignore the second question; I
misunderstood Gabriel's answer.

Best,
Jiefei

On Thu, May 16, 2019 at 5:29 PM 介非王 <szwjf08 using gmail.com> wrote:

> Hello Luke and Gabriel,
>
> Thank you very much for your quick responses. The explanation of STDVEC is
> very helpful and I appreciate it! For the wrapper, I have a few new
> questions.
>
>
> 1. As Luke said, a mutable object is not possible. However, I noticed
> that the duplicate function takes one extra argument, *deep*. I've
> searched all the available documentation for ALTREP but did not find any
> explanation of it. Could you please give some detail on it?
>
>
> 2.
>
>> The first one correctly returns its internal data structure, but the
>> second
>> one returns the ALTREP object it wraps since the wrapper itself is an
>> ALTREP. This behavior is unexpected.
>
>
> I disagree. R_altrep_data1 returns whatever THAT altrep SEXP stores in its
>> "data1" part. There is no recursion/descent going on, and there shouldn't
>> be.
>
>
> This might be a bug, since in R release 3.6 it returns the wrapped ALTREP
> instead of that ALTREP's data. I'm not sure whether it has been fixed in
> 3.7. Here is a simple example:
>
> SEXP C_peekSharedMemory(SEXP x) {
>     /* Follow data1 until a non-ALTREP object is reached. */
>     while (ALTREP(x)) {
>         Rprintf("getting data 1\n");
>         x = R_altrep_data1(x);
>     }
>     return x;
> }
>
>
> If calling R_altrep_data1 returned the internal data directly, we would
> see only one message. Continuing my last example:
>
> > .Internal(inspect(so1))
>> @0x0000000005e7fbb0 14 REALSXP g0c0 [MARK,NAM(7)]  Share object of type
>> double
>> > .Internal(inspect(so2))
>> @0x0000000005fc5ac0 14 REALSXP g0c0 [MARK,NAM(7)]  wrapper
>> [srt=-2147483648,no_na=0]
>>   @0x0000000005e7fbb0 14 REALSXP g0c0 [MARK,NAM(7)]  Share object of type
>> double
>> > sm1=peekSharedMemory(so1)
>> getting data 1
>> > sm2=peekSharedMemory(so2)
>> getting data 1
>> getting data 1
>
>
> We see that so2 calls R_altrep_data1 twice to get the internal data. This
> is very unexpected.
>
> Thank you very much for your help again!
>
> Best,
> Jiefei
>
>
>
> On Thu, May 16, 2019 at 3:47 PM Gabriel Becker <gabembecker using gmail.com>
> wrote:
>
>> Hi Jiefei,
>>
>> Thanks for trying out the ALTREP stuff and letting us know how it is
>> going. That said, I don't think either of these is a bug, per se, but rather
>> a misunderstanding of the API. Details inline.
>>
>>
>>
>> On Thu, May 16, 2019 at 11:57 AM 介非王 <szwjf08 using gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have encountered two bugs when using ALTREP APIs.
>>>
>>> 1. STDVEC_DATAPTR
>>>
>>> The Rinternals.h file has this comment:
>>>
>>> /* ALTREP support */
>>> > void *(STDVEC_DATAPTR)(SEXP x);
>>>
>>>
>>> However, this comment might not be true. The easiest way to verify it is
>>> to define a C++ function:
>>>
>>>  void C_testFunc(SEXP a)
>>> > {
>>> > STDVEC_DATAPTR(a);
>>> > }
>>>
>>>
>>> and call it in R via
>>>
>>> > a=1:10
>>> > > C_testFunc(a)
>>> > Error in C_testFunc(a) : cannot get STDVEC_DATAPTR from ALTREP object
>>>
>>>
>> The STDVEC here refers to the SEXP not being an ALTREP. Anything that
>> starts with STDVEC should never receive an ALTREP, i.e., it should only be
>> called after non-ALTREPness has been confirmed by the surrounding/preceding
>> code. So this is expected behavior.
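
The guard described above can be sketched with a toy model in plain C. This is only an illustration under stated assumptions: toy_vec, toy_class_dataptr, toy_stdvec_dataptr and toy_dataptr are hypothetical names, the struct is not R's SEXP layout, and where R raises an error the toy simply returns NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an R vector. This is NOT R's SEXP layout. */
typedef struct toy_vec {
    int is_altrep;                         /* models ALTREP(x) */
    double *payload;                       /* backing store */
    double *(*dataptr)(struct toy_vec *);  /* models the class's data-pointer method */
} toy_vec;

/* Hypothetical data-pointer method for a toy ALTREP class. */
double *toy_class_dataptr(toy_vec *x) {
    return x->payload;
}

/* Models STDVEC_DATAPTR: only meaningful for non-ALTREP vectors.
   R raises an error for an ALTREP; this toy returns NULL instead. */
double *toy_stdvec_dataptr(toy_vec *x) {
    assert(x != NULL);
    return x->is_altrep ? NULL : x->payload;
}

/* Guarded accessor: confirm non-ALTREPness before taking the STDVEC path. */
double *toy_dataptr(toy_vec *x) {
    assert(x != NULL);
    if (x->is_altrep)
        return x->dataptr(x);   /* dispatch to the class method */
    return toy_stdvec_dataptr(x);
}
```

In the toy, calling toy_stdvec_dataptr on an object with is_altrep set fails, while toy_dataptr works for both kinds, which mirrors the error seen with a = 1:10.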
>>
>>
>>
>>
>>>
>>>  We can inspect the internal type and call ALTREP function to check if it
>>> is an ALTREP:
>>>
>>> > .Internal(inspect(a))
>>> > @0x000000001b5a3310 13 INTSXP g0c0 [NAM(7)]  1 : 10 (compact)
>>> > > #This is a wrapper of ALTREP
>>> > > is.altrep(a)
>>> > [1] TRUE
>>>
>>>
>>> I've also defined an ALTREP type and it did not work either. I guess this
>>> might be a bug? Or did I miss something?
>>>
>>> 2. Wrapper objects in ALTREP
>>>
>>> If the duplicate function is defined to return the object itself:
>>>
>>> SEXP vector_dulplicate(SEXP x, Rboolean deep) {
>>> return(x);
>>> }
>>>
>>
>> So this is a violation of the contract. <youraltrep>_duplicate *must*
>> do an actual duplication. Returning the object unduplicated when duplicate
>> is called is going to have all sorts of unintended negative consequences.
>> R's internals rely on the fact that a SEXP that has been passed to
>> DUPLICATE has been duplicated and is safe to modify in place.
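
A minimal sketch of why the contract matters, again as a toy model in plain C rather than R's API. The names toy_vec, toy_duplicate and toy_assign_first are hypothetical; the sketch assumes only what is stated above, namely that the caller modifies the result of duplicate in place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a numeric vector. This is NOT R's SEXP layout. */
typedef struct {
    size_t n;
    double *data;
} toy_vec;

/* A duplicate that honours the contract: a fresh object with its own data. */
toy_vec *toy_duplicate(const toy_vec *x) {
    toy_vec *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    copy->n = x->n;
    /* Allocate at least one element so a zero-length vector is not an error. */
    copy->data = malloc((x->n ? x->n : 1) * sizeof *copy->data);
    if (copy->data == NULL) {
        free(copy);
        return NULL;
    }
    memcpy(copy->data, x->data, x->n * sizeof *copy->data);
    return copy;
}

/* Models alt2[1] = 10 after alt2 = alt1: duplicate first, then write in
   place. The write is only safe because the copy shares nothing with x. */
toy_vec *toy_assign_first(const toy_vec *x, double value) {
    toy_vec *copy = toy_duplicate(x);
    if (copy != NULL && copy->n > 0)
        copy->data[0] = value;
    return copy;
}
```

If toy_duplicate returned x itself, the in-place write in toy_assign_first would silently change the original as well, which is the kind of consequence the contract rules out.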
>>
>>
>>
>>> In R an ALTREP object will behave like an environment
>>> (pass-by-reference).
>>> However, if we do something like this (pseudocode):
>>>
>>> n=100
>>> > x=runif(n)
>>> > alt1=createAltrep(x)
>>> > alt2=alt1
>>> > alt2[1]=10
>>> > .Internal(inspect(alt1))
>>> > .Internal(inspect(alt2))
>>>
>>>
>>> The result would be:
>>>
>>> > .Internal(inspect(alt1))
>>> > @0x00000000156f4d18 14 REALSXP g0c0 [NAM(7)]
>>> > > .Internal(inspect(alt2 ))
>>> > @0x00000000156a33e0 14 REALSXP g0c0 [NAM(7)]  wrapper
>>> > [srt=-2147483648,no_na=0]
>>> >   @0x00000000156f4d18 14 REALSXP g0c0 [NAM(7)]
>>>
>>>
>>> It seems like the object alt2 automatically gets wrapped by R. Although
>>> at
>>> the R level it seems fine because there are no differences between alt1
>>> and
>>> alt2, if we define a C function as:
>>>
>>
>> So I'm not sure what is happening here, because it depends on what your
>> createAltrep function does. R automatically creates wrappers in some cases
>> but not nearly all (or even very many currently) cases.
>>
>>>
>>> SEXP C_peekSharedMemory(SEXP x) {
>>> > return(R_altrep_data1(x));
>>>
>>> }
>>>
>>>
>>> and call it in R to get the internal data structure of an ALTREP object.
>>>
>>> C_peekSharedMemory(alt1)
>>> > C_peekSharedMemory(alt2)
>>>
>>>
>>> The first one correctly returns its internal data structure, but the
>>> second
>>> one returns the ALTREP object it wraps since the wrapper itself is an
>>> ALTREP. This behavior is unexpected.
>>
>>
>> I disagree. R_altrep_data1 returns whatever THAT altrep SEXP stores in
>> its "data1" part. There is no recursion/descent going on, and there
>> shouldn't be.
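
The one-level behaviour can be illustrated with a toy model in plain C. The names toy_obj, toy_data1 and toy_peel are hypothetical and nothing here is R's API; the sketch only assumes that each ALTREP-like object stores one "data1" reference and that a wrapper's data1 is the object it wraps.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an object with a "data1" slot. NOT R's SEXP layout. */
typedef struct toy_obj {
    int is_altrep;           /* models ALTREP(x) */
    struct toy_obj *data1;   /* what THIS object stores in data1 */
} toy_obj;

/* Models R_altrep_data1: returns this object's own data1, one level only. */
toy_obj *toy_data1(toy_obj *x) {
    assert(x != NULL && x->is_altrep);
    return x->data1;
}

/* Models the loop in C_peekSharedMemory: follow data1 until a non-ALTREP
   object is reached, counting how many steps that takes. */
toy_obj *toy_peel(toy_obj *x, int *steps) {
    *steps = 0;
    while (x != NULL && x->is_altrep) {
        x = toy_data1(x);
        ++*steps;
    }
    return x;
}
```

With a chain wrapper -> shared -> raw, toy_data1 on the wrapper yields the shared object, and toy_peel needs one step from the shared object but two from the wrapper, matching the one versus two "getting data 1" messages in the thread.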
>>
>>
>>> Since the duplicate function returns
>>> the object itself, I would expect alt1 and alt2 to be the same object.
>>>
>>
>> Again, this is a violation of the core assumptions of ALTREP that is not
>> allowed, so I'd argue that any behavior this causes is largely irrelevant
>> (and a small part of the much larger set of problems not duplicating when R
>> told you to duplicate will cause).
>>
>>
>>
>>
>>
>>
>>
>>> Even if they are essentially not the same, calling the same function
>>> should at least return the same result. Other than that, it seems like R
>>> does not always wrap an ALTREP object. If we change n from 100 to 10 and
>>> check the internals again, alt2 will not get wrapped.
>>
>>
>> Right, so this is a misunderstanding (which may be the fault of sparse
>> documentation on our part); wrapper is one particular ALTREP class, it's
>> not a fundamental aspect of ALTREPs themselves. Most ALTREP objects do not
>> have wrappers. See, e.g.,
>>
>> > .Internal(inspect(1:4))
>>
>> @7fb727d6be50 13 INTSXP g0c0 [NAM(3)]  1 : 4 (compact)
>>
>>
>> That's an ALTREP with no wrapper (a compact sequence). The wrapper ALTREP
>> class is for attaching metadata (known sortedness, known lack of NAs) to R
>> vectors. Its primary use currently is on the return value of sort().
>>
>>
>>> This makes the problem even more
>>> difficult since we cannot predict when the wrapper would appear.
>>>
>>
>> As currently factored, it's not intended that you would be able to, or need
>> to, predict when a wrapper would appear. Using the C API or any R functions
>> will transparently treat wrapped and non-wrapped objects the same, and any
>> code you write should hit these API entrypoints so that it does the same.
>>
>> Does that help?
>>
>> Best,
>> ~G
>>
>>>
>>> Here is the source code for the wrapper:
>>> https://github.com/wch/r-source/blob/trunk/src/main/altclasses.c#L1399
>>>
>>> Here is a working example if one can build the sharedObject package from
>>> https://github.com/Jiefei-Wang/sharedObject
>>>
>>> n=100
>>> > x=runif(n)
>>> > so1=sharedObject(x,copyOnWrite = FALSE)
>>> > so2=so1
>>> > so2[1]=10
>>> > .Internal(inspect(so1))
>>> > .Internal(inspect(so2))
>>>
>>>
>>> Here is my session info:
>>>
>>> R version 3.6.0 alpha (2019-04-08 r76348)
>>> > Platform: x86_64-w64-mingw32/x64 (64-bit)
>>> > Running under: Windows >= 8 x64 (build 9200)
>>> > Matrix products: default
>>> > locale:
>>> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
>>> > States.1252
>>> > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>>> >
>>> > [5] LC_TIME=English_United States.1252
>>> > attached base packages:
>>> > [1] stats     graphics  grDevices utils     datasets  methods   base
>>> > other attached packages:
>>> > [1] sharedObject_0.0.99
>>> > loaded via a namespace (and not attached):
>>> > [1] compiler_3.6.0 tools_3.6.0    Rcpp_1.0.1
>>>
>>>
>>> Best,
>>> Jiefei
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
