[Rd] [External] Re: changes in R-devel and zero-extent objects in Rcpp
Hiroaki Yutani
yut@n|@|n| @end|ng |rom gm@||@com
Sun Jun 9 07:13:51 CEST 2024
Sorry to ask about a bit drifted topic, but will there be an alternative
API to DATAPTR?
> DATAPTR is not in the API and can't be at least in this form
I believe it's vital for ALTREP to return the pointer to the expanded
version of a SEXP just like the implementation in base R does [1].
At least, VECSXP has no other measure to expose the pointer if I understand
correctly.
Best,
Yutani
[1]:
https://github.com/r-devel/r-svn/blob/a3508b75d28164b0e5bcb2c87f816ce5169729a4/src/main/altclasses.c#L186
2024年6月9日(日) 10:43 luke-tierney--- via R-devel <r-devel using r-project.org>:
> On Sat, 8 Jun 2024, Ben Bolker wrote:
>
> > The ASAN errors occur *even if the zero-length object is not actually
> > accessed*/is used in a perfectly correct manner, i.e. it's perfectly
> legal in
> > base R to define `m <- numeric(0)` or `m <- matrix(nrow = 0, ncol = 0)`,
> > whereas doing the equivalent in Rcpp will (now) lead to an ASAN error.
> >
> > i.e., these are *not* previously cryptic out-of-bounds accesses that
> are
> > now being revealed, but instead sensible and previously legal
> definitions of
> > zero-length objects that are now causing problems.
> >
> > I'm pretty sure I'm right about this, but it's absolutely possible
> that
> > I'm just confused at this point; I don't have a super-simple example to
> show
> > you at the moment. The closest is this example by Mikael Jagan:
> > https://github.com/lme4/lme4/issues/794#issuecomment-2155093049
> >
> > which shows that if x is a pointer to a zero-length vector (in plain
> C++
> > for R, no Rcpp is involved), DATAPTR(x) and REAL(x) evaluate to
> different
> > values.
> >
> > Mikael further points out that "Rcpp seems to cast a (void *) returned
> by
> > DATAPTR to (double *) when constructing a Vector<REALSXP> from a SEXP,
> rather
> > than using the (double *) returned by REAL." So perhaps R-core doesn't
> want
> > to guarantee that these operations give identical answers, in which case
> Rcpp
> > will have to change the way it does things ...
>
> It looks like REAL and friends should also get this check, but it's
> not high priority at this point, at least to me. DATAPTR has been
> using this check for a while in a barrier build, so you might want to
> test there as well. I expect we will activate more integrity checks
> from the barrier build on the API client side as things are tidied up.
>
> However: DATAPTR is not in the API and can't be at least in this form:
> It allows access to a writable pointer to STRSXP and VECSXP data and
> that is too dangerous for memory manager integrity. I'm not sure
> exactly how this will be resolve, but be prepared for changes.
>
> Best,
>
> luke
>
> >
> > cheers
> > Ben
> >
> >
> >
> > On 2024-06-08 6:39 p.m., Kevin Ushey wrote:
> >> IMHO, this should be changed in both Rcpp and downstream packages:
> >>
> >> 1. Rcpp could check for out-of-bounds accesses in cases like these, and
> >> emit an R warning / error when such an access is detected;
> >>
> >> 2. The downstream packages unintentionally making these out-of-bounds
> >> accesses should be fixed to avoid doing that.
> >>
> >> That is, I think this is ultimately a bug in the affected packages, but
> >> Rcpp could do better in detecting and handling this for client packages
> >> (avoiding a segfault).
> >>
> >> Best,
> >> Kevin
> >>
> >>
> >> On Sat, Jun 8, 2024, 3:06 PM Ben Bolker <bbolker using gmail.com
> >> <mailto:bbolker using gmail.com>> wrote:
> >>
> >>
> >> A change to R-devel (SVN r86629 or
> >>
> https://github.com/r-devel/r-svn/commit/92c1d5de23c93576f55062e26d446feface07250
> >> <
> https://github.com/r-devel/r-svn/commit/92c1d5de23c93576f55062e26d446feface07250
> >
> >> has changed the handling of pointers to zero-length objects,
> leading to
> >> ASAN issues with a number of Rcpp-based packages (the commit message
> >> reads, in part, "Also define STRICT_TYPECHECK when compiling
> >> inlined.c.")
> >>
> >> I'm interested in discussion from the community.
> >>
> >> Details/diagnosis for the issues in the lme4 package are here:
> >> https://github.com/lme4/lme4/issues/794
> >> <https://github.com/lme4/lme4/issues/794>,
> >> with a bit more discussion
> >> about how zero-length objects should be handled.
> >>
> >> The short(ish) version is that r86629 enables the
> >> CATCH_ZERO_LENGTH_ACCESS definition. This turns on the CHKZLN macro
> >> <
> https://github.com/r-devel/r-svn/blob/4ef83b9dc3c6874e774195d329cbb6c11a71c414/src/main/memory.c#L4090-L4104
> >> <
> https://github.com/r-devel/r-svn/blob/4ef83b9dc3c6874e774195d329cbb6c11a71c414/src/main/memory.c#L4090-L4104
> >>,
> >> which returns a trivial pointer (rather than the data pointer that
> >> would
> >> be returned in the normal control flow) if an object has length 0:
> >>
> >> /* Attempts to read or write elements of a zero length vector will
> >> result in a segfault, rather than read and write random memory.
> >> Returning NULL would be more natural, but Matrix seems to
> assume
> >> that even zero-length vectors have non-NULL data pointers, so
> >> return (void *) 1 instead. Zero-length CHARSXP objects still
> have
> >> a
> >> trailing zero byte so they are not handled. */
> >>
> >> In the Rcpp context this leads to an inconsistency, where
> `REAL(x)`
> >> is a 'real' external pointer and `DATAPTR(x)` is 0x1, which in turn
> >> leads to ASAN warnings like
> >>
> >> runtime error: reference binding to misaligned address
> 0x000000000001
> >> for type 'const double', which requires 8 byte alignment
> >> 0x000000000001: note: pointer points here
> >>
> >> I'm in over my head and hoping for insight into whether this
> >> problem
> >> should be resolved by changing R, Rcpp, or downstream Rcpp packages
> ...
> >>
> >> cheers
> >> Ben Bolker
> >>
> >> ______________________________________________
> >> R-devel using r-project.org <mailto:R-devel using r-project.org> mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >> <https://stat.ethz.ch/mailman/listinfo/r-devel>
> >>
> >
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa Phone: 319-335-3386
> Department of Statistics and Fax: 319-335-3017
> Actuarial Science
> 241 Schaeffer Hall email: luke-tierney using uiowa.edu
> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu/
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
[[alternative HTML version deleted]]
More information about the R-devel
mailing list