[Rd] ALTREP wrappers and factors

Gabriel Becker gabembecker using gmail.com
Fri Jul 19 04:00:03 CEST 2019


Hi Jiefei and Kylie,

Great to see people engaging with the ALTREP framework and identifying
places we may need more tooling. Comments inline.

On Thu, Jul 18, 2019 at 12:22 PM King Jiefei <szwjf08 using gmail.com> wrote:

>
> If that is the case and you are 100% sure the reference number should be 1
> for your variable *y*, my solution is to call *SET_NAMED* in C++ to reset
> the reference number. Note that you need to unbind your local variable
> before you reset the number. To return an unbound SEXP, the C++ function
> should be placed at the end of your *matter:::as.altrep* function. I don't
> know if there is any simpler way to do that and I'll be happy to see any
> opinion.
>

As far as I know, manually setting the NAMED value on any SEXP the
garbage collector is aware of is a direct violation of the C API contract
and not something that package code should ever be doing.

It's not at all clear to me that you can *ever* be 100% sure that the
reference number should be 1 when it is not currently 1 for an R object
that exists at the R level (as opposed to only in pure C code). Sure, maybe
the object is created within the body of your R function instead of being
passed in, but what if someone is debugging your function and assigns the
value to the global environment using <<- for later inspection? Now you
have an invalidly low NAMED value, i.e. you have a segfault coming. I know of
no way for you to prevent this or even know it has happened.
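
To make the debugging scenario above concrete, here is a minimal R sketch.
The function name make_y and the variable leaked are hypothetical, chosen
only for illustration. It shows a second binding being created behind the
function's back; ordinary copy-on-modify keeps the two bindings independent
because the reference information has not been tampered with.

```r
# Hypothetical function standing in for one that builds a vector locally.
make_y <- function() {
    y <- c(1L, 1L, 2L)
    # Something a user might type at a browser() prompt while debugging:
    leaked <<- y      # a second binding to the same object, outside the function
    y
}

y <- make_y()
y[1L] <- 99L          # R copies here because the object may be shared

# The copy made above leaves the other binding untouched.
stopifnot(identical(leaked, c(1L, 1L, 2L)))
stopifnot(identical(y, c(99L, 1L, 2L)))
```

Had the NAMED value been forced down to 1 from C after leaked was created,
the assignment could have modified the shared object in place instead.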



> On Thu, Jul 18, 2019 at 3:28 AM Bemis, Kylie <k.bemis using northeastern.edu>
> wrote:
>
> > Hello,
> >
> > I’m experimenting with ALTREP and was wondering if there is a preferred
> > way to create an ALTREP wrapper vector without using
> > .Internal(wrap_meta(…)), which R CMD check doesn’t like since it uses an
> > .Internal() function.
>

There is the .doSortWrap function in base (and its currently inexplicably
identical clone, .doWrap), an R-level function that calls down to
.Internal(wrap_meta(...)). You can use it, but it doesn't look general
enough for what I think you need (it was written for things that have just
been sorted, thus the name). Specifically, as currently written it is not
able to indicate that things are of unknown sortedness. If matter vectors
are guaranteed to be sorted for some reason, though, you can use this. I'll
talk to Luke about whether we want to generalize it. It would be easy to
have it support the full space of metadata for wrappers and be a
general-purpose wrapper-maker, but that isn't what it is right now.
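
A rough sketch of the R-level route, valid only if the data really are
sorted. The argument order used here (vector, decreasing, NAs last) is an
assumption based on the base sources and is worth checking with
args(.doSortWrap) on your R version. The plain integer vector x stands in
for the matter-backed vector, and the level names are made up for
illustration.

```r
# Stand-in for the ALTREP-backed integer codes: sorted increasing, no NAs.
x <- c(1L, 1L, 2L)

# Assumed argument order: vector, decreasing?, NAs last?
w <- .doSortWrap(x, FALSE, TRUE)

# Build the factor on top of the wrapper, as in the original example.
fc <- structure(w, class = "factor", levels = c("a", "b"))

stopifnot(is.factor(fc))
stopifnot(identical(as.character(fc), c("a", "a", "b")))
```

Because the wrapper is marked as sorted, this is only appropriate when that
claim is true; marking unsorted data as sorted could give wrong results from
code that trusts the metadata.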

At the C level, it looks like we do make R_tryWrap available (it appears in
Rinternals.h, and not within a USE_RINTERNALS section), so you can call it
from your own C(++) code. This creates a wrapper that has no metadata on it
(or rather, it has metadata, but the metadata indicates that no special info
is known about the vector).

>
> > I was trying to create a factor that used an ALTREP integer, but
> > attempting to set the class and levels attributes always ended up
> > duplicating and materializing the integer vector. Using the wrapper
> > avoided this issue.
> >
> > Here is my initial ALTREP integer vector:
> >
> > > fc0 <- factor(c("a", "a", "b"))
> > >
> > > y <- matter::as.matter(as.integer(fc0))
> > > y <- matter:::as.altrep(y)
> > >
> > > .Internal(inspect(y))
> > @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)
> >
> > Here is what I get without a wrapper:
> >
> > > fc1 <- structure(y, class="factor", levels=levels(x))
> > > .Internal(inspect(fc1))
> > @7fb0cae66408 13 INTSXP g0c2 [OBJ,NAM(2),ATT] (len=3, tl=0) 1,1,2
> > ATTRIB:
> >   @7fb0ce771868 02 LISTSXP g0c0 []
> >     TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value)
> >     @7fb0c9fcbe90 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
> >       @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor"
> >     TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value)
> >     @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
> >       @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
> >       @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b"
> >
> > Here is what I get with a wrapper:
> >
> > > fc2 <- structure(.Internal(wrap_meta(y, 0, 0)), class="factor", levels=levels(x))
> > > .Internal(inspect(fc2))
> > @7fb0ce764630 13 INTSXP g0c0 [OBJ,NAM(2),ATT]  wrapper [srt=0,no_na=0]
> >   @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)
> > ATTRIB:
> >   @7fb0ce764668 02 LISTSXP g0c0 []
> >     TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value)
> >     @7fb0c9fcb010 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
> >       @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor"
> >     TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value)
> >     @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
> >       @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
> >       @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b"
> >
> > Is there a way to do this that doesn’t rely on .Internal() and won’t
> > produce R CMD check warnings?
> >
> > ~~~
> > Kylie Ariel Bemis
> > Khoury College of Computer Sciences
> > Northeastern University
> > kuwisdelu.github.io<https://kuwisdelu.github.io>
> >
> >
> > ______________________________________________
> > R-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel