[R-pkg-devel] Replacement for SETLENGTH

Iris Simmons ikwsimmo using gmail.com
Wed Jan 15 16:51:28 CET 2025


I don't think memcpy works well for VECSXP. The elements being overwritten
need to have their reference counts decreased and the new elements need to
have theirs increased.
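
To make the reference-count point concrete, here is a toy sketch. It is not R's implementation: Obj, set_elt, copy_elts and copy_raw are hypothetical names, and the struct holds nothing but a count. It only shows that a raw memcpy moves pointers without touching any counts, while an element-wise store (loosely in the spirit of SET_VECTOR_ELT) keeps them accurate.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an R object: just a reference count.
   This is an illustration only, not R's SEXPREC. */
typedef struct Obj {
    int refcount;
} Obj;

/* Element-wise store: retain the new element, release the old one. */
static void set_elt(Obj **vec, size_t i, Obj *value)
{
    if (value) value->refcount++;
    if (vec[i]) vec[i]->refcount--;
    vec[i] = value;
}

/* Copy n elements one at a time so every count stays accurate. */
static void copy_elts(Obj **dst, Obj **src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        set_elt(dst, i, src[i]);
}

/* Raw copy: the pointers move, the counts do not. */
static void copy_raw(Obj **dst, Obj **src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}
```

After copy_raw the overwritten object still looks referenced and the copied one looks under-referenced; after copy_elts both counts match what the two vectors actually hold.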

Also, I don't entirely know how accurate everything I'm about to say is,
but I think you need to be using SET_TRUELENGTH and SET_GROWABLE_BIT along
with SETLENGTH. There's an example here:

https://github.com/wch/r-source/blob/744b5d34e1b8eb839e5d49d91ab21c1fe6800856/src/main/subassign.c#L257


The example uses SET_STDVEC_LENGTH, which packages shouldn't use; just
replace it with SETLENGTH.

So in your code, I'd replace:

SETLENGTH(modelspace, nUnique);

with

SET_GROWABLE_BIT(modelspace);
SET_TRUELENGTH(modelspace, nModels);
SETLENGTH(modelspace, nUnique);
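
To show why all three pieces go together, here is a toy model in plain C. It is only a sketch under my own assumptions, not R's vector header or memory manager: ToyVec and the toy_* functions are hypothetical names. The idea, as I read the linked code, is that once the visible length is lowered in place, something still has to remember how much was really allocated.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy vector header. The field names echo R's, but this is NOT R's layout. */
typedef struct {
    size_t length;      /* length reported to users */
    size_t truelength;  /* allocated length, meaningful only when growable */
    int growable;       /* plays the role of the growable bit */
    double *data;
} ToyVec;

/* Allocate n zeroed doubles; returns 0 on success, -1 on failure. */
static int toy_alloc(ToyVec *v, size_t n)
{
    v->data = calloc(n ? n : 1, sizeof *v->data);
    if (!v->data) return -1;
    v->length = n;
    v->truelength = 0;
    v->growable = 0;
    return 0;
}

/* Shrink in place, in the same order as the three calls above:
   mark growable, record the allocated length, then lower the length. */
static int toy_shrink(ToyVec *v, size_t new_len)
{
    size_t allocated = v->growable ? v->truelength : v->length;
    if (new_len > allocated) return -1;  /* never claim more than was allocated */
    v->growable = 1;
    v->truelength = allocated;
    v->length = new_len;
    return 0;
}

/* What a memory manager would need to know when releasing the vector. */
static size_t toy_allocated_length(const ToyVec *v)
{
    return v->growable ? v->truelength : v->length;
}

/* Release the buffer and reset the header. */
static void toy_free(ToyVec *v)
{
    free(v->data);
    v->data = NULL;
    v->length = v->truelength = 0;
    v->growable = 0;
}
```

In this model, shrinking from 100 to 40 elements leaves the data pointer unchanged (no second allocation), while toy_allocated_length still reports 100.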

On Wed, Jan 15, 2025, 10:30 Merlise Clyde, Ph.D. <clyde using duke.edu> wrote:

> Thanks for the added explanation Iris and Tomas!
>
> So looking at the code for xlengthgets, it does appear that I may take a
> memory hit for multiple large objects due to the second allocation before
> the old objects are possibly garbage collected. There are about 12 such
> instances per function that are returned (I do use a counter to keep
> track of the number of PROTECT calls and to UNPROTECT for bookkeeping :-).
> For memory-limited machines, the alloc/copy was a problem for memory usage,
> and if I recall correctly it was one of the reasons I switched to SETLENGTH
> in 2008, which doesn't seem to do an allocation (?). If there is going to be
> an absolute ban on SETLENGTH in packages, I'll probably need to address
> memory management differently for those cases.
>
> I did see a note before the function def'n of xlengthgets:
>
> /* (if it is vectorizable). We could probably be fairly */
> /* clever with memory here if we wanted to. */
>
> It would seem that memcpy would be more efficient for at least some of the
> types (REALSXP, INTSXP) unless I am missing something - but is there any
> way to be more clever with VECSXP?
>
> best,
> Merlise
>
>
>
> Merlise Clyde (she/her/hers)
> Professor of Statistical Science and Director of Graduate Studies
> Duke University
>
> ________________________________________
> From: Iris Simmons <ikwsimmo using gmail.com>
> Sent: Wednesday, January 15, 2025 1:00 AM
> To: Merlise Clyde, Ph.D. <clyde using duke.edu>
> Cc: r-package-devel using r-project.org <r-package-devel using r-project.org>
> Subject: Re: [R-pkg-devel] Replacement for SETLENGTH
>
> Hi Merlise!
>
>
> Referring to here:
>
>
> https://github.com/wch/r-source/blob/bb5a829466f77a3e1d03541747d149d65e900f2b/src/main/builtin.c#L834
>
> It seems as though the object is only re-used if the new length is
> equal to the old length.
>
> If you use Rf_lengthgets, you will need to protect the return value.
> The code you wrote that uses protect indexes looks correct, and the
> reprotect is good because you no longer need the old object.
>
> 2 is the correct amount to unprotect. PROTECT and PROTECT_WITH_INDEX
> (as far as I know) are the only functions that increase the size of
> the protect stack, and so the only calls that need to be unprotected.
> Typically, people define `int nprotect = 0;` at the start of their
> functions, add `nprotect++;` after each PROTECT and PROTECT_WITH_INDEX
> call, and add `UNPROTECT(nprotect);` immediately before each return or
> function end. That makes it easier to keep track.
>
> I typically use R_PreserveObject and R_ReleaseObject to protect
> objects without needing to bind them somewhere in my package's
> namespace. For example, .onLoad() uses R_PreserveObject to
> protect some objects, and .onUnload() uses R_ReleaseObject to release
> them. I probably would not use that for what you're
> describing.
>
>
> Regards,
>     Iris
>
> On Tue, Jan 14, 2025 at 11:26 PM Merlise Clyde, Ph.D. <clyde using duke.edu>
> wrote:
> >
> > I am trying to determine the best way to eliminate the use of SETLENGTH
> > to truncate over-allocated vectors in my package BAS, to eliminate the
> > NOTEs about non-API calls in anticipation of R 4.5.0.
> >
> > From WRE: "At times it can be useful to allocate a larger initial
> > result vector and resize it to a shorter length if that is sufficient. The
> > functions Rf_lengthgets and Rf_xlengthgets accomplish this; they are
> > analogous to using length(x) <- n in R. Typically these functions return a
> > freshly allocated object, but in some cases they may re-use the supplied
> > object."
> >
> > it looks like using
> >
> >     x = Rf_lengthgets(x, newsize);
> >     SET_VECTOR_ELT(result, 0, x);
> >
> > before returning works to resize without the performance hit incurred by
> > a copy. (Will this always re-use the supplied object if newsize < old
> > size?)
> >
> > There is no mention in section 5.9.2 about the need for re-protection of
> > the object, but it seems to be mentioned in some packages as well as in a
> > really old thread about SET_LENGTH, which looks like a non-API macro for
> > lengthgets.
> >
> > Indeed, if I call gc() and then rerun my test, I have had some
> > non-reproducible aborts in RStudio on my M3 Mac (caught once in R -d lldb).
> >
> > Do I need to do something more like
> >
> > PROTECT_INDEX ipx0;
> > PROTECT_WITH_INDEX(x0 = allocVector(REALSXP, old_size), &ipx0);
> >
> > PROTECT_INDEX ipx1;
> > PROTECT_WITH_INDEX(x1 = allocVector(REALSXP, old_size), &ipx1);
> >
> > /* fill in values in x0 and x1 up to new_size (random) < old_size */
> > ...
> > REPROTECT(x0 = Rf_lengthgets(x0, new_size), ipx0);
> > REPROTECT(x1 = Rf_lengthgets(x1, new_size), ipx1);
> >
> > SET_VECTOR_ELT(result, 0, x0);
> > SET_VECTOR_ELT(result, 1, x1);
> > ...
> > UNPROTECT(2);   /* or is this 4? */
> > return(result);
> >
> >
> > There is also a mention in WRE of R_PreserveObject and R_ReleaseObject.
> >
> > I am looking for advice on whether this is needed, or which approach is
> > better/more stable for replacing SETLENGTH. (I have many, many instances
> > that need to be updated, so I am trying to get some clarity here before
> > updating and running code through valgrind or other sanitizers to catch
> > any memory issues before submitting an update to CRAN.)
> >
> > best,
> > Merlise
> >
