[Rd] [ALTREP] What is the meaning of the return value of Is_sorted and No_NA function?

Wang Jiefei @zwj|08 @end|ng |rom gm@||@com
Wed Sep 11 20:49:13 CEST 2019


Hi Gabriel,

Thanks for your answer and future update plan. Somehow this email has been
delayed for a week, so there might be a wired reply from me saying that I
have found the answer from the R source code, it was sent from me last
week. Hopefully, this reply will not cost another week to post:)

As a side note, I like the idea that defining a macro for sortedness, and I
can see why we can only have a binary answer for NO_NA (since the return
value is actually bool). For making the code more readable, and for
possibly working with the future R release, is it possible to define a
macro for NO_NA function in RInternal.h? So if there is any change in NO_NA
function, there is no need to modify the code. Also, the code can be more
readable by doing that.

Best,
Jiefei

On Wed, Sep 11, 2019 at 1:58 PM Gabriel Becker <gabembecker using gmail.com>
wrote:

> Hi Jiefei,
>
> The meanings of the return values for sortedness can be found in
> RInternals.h, and are as follows:
>
> /* ALTREP sorting support */
> enum {SORTED_DECR_NA_1ST = -2,
>       SORTED_DECR = -1,
>       UNKNOWN_SORTEDNESS = INT_MIN, /*INT_MIN is NA_INTEGER! */
>       SORTED_INCR = 1,
>       SORTED_INCR_NA_1ST = 2,
>       KNOWN_UNSORTED = 0};
>
> The default value there is NA_INTEGER (ie INT_MIN), indicating that there
> is no sortedness information.
>
> Currently, *_NO_NA  effectively return a boolean, (even though the actual
> return value is int). This can be seen in the method we provide for compact
> sequences in altclasses.c:
>
>
> static int compact_intseq_No_NA(SEXP x)
> {
> #ifdef COMPACT_INTSEQ_MUTABLE
>     /* If the vector has been expanded it may have been modified. */
>     if (COMPACT_SEQ_EXPANDED(x) != R_NilValue)
> return FALSE;
> #endif
>     return TRUE;
> }
>
> (FALSE is a macro for 0, TRUE is a macro for 1).
>
> Think of the meaning of the return value to No_NA methods as the object's
> answer to the following question
>
> "Are you sure there are zero NAs in your data?"
>
> When it is sure of that, it  says "yes" (returning 1, ie TRUE). When it
> either is sure there are NAs *OR* doesn't have any information about
> whether there are NAs, it says "no" (returning 0, ie FALSE).
>
> Also please note, it is possible there may be another API point in the
> future which asks the object *how many NAs it has.∫ˆ* If that
> materializes, No_NA would just  consume the answer to thatto get the
> binarized version, but again there is nothing like that in there now.
>
> Hope that helps.
>
> Best,
> ~G
>
> On Wed, Sep 11, 2019 at 12:04 AM Wang Jiefei <szwjf08 using gmail.com> wrote:
>
>> Hi,
>>
>>
>>
>> I would like to figure out the meaning of the return value of these two
>> functions. Here are the default definitions I find from R source code:
>>
>>
>>
>> static int altreal_Is_sorted_default(SEXP x) { return UNKNOWN_SORTEDNESS;
>> }
>>
>> static int altreal_No_NA_default(SEXP x) { return 0; }
>>
>> I guess the macro *UNKNOWN_SORTEDNESS *in *Is_sorted* and 0 in *No_NA
>> *simply means
>> unknown sorted/NA status of the vector, so R will loop over the vector and
>> find the answer. However, what should we return in these functions to
>> indicate whether the vector has been sorted/ contains NA? My initial guess
>> is 0/1 but since *NA_NA *uses 0 as its default value so it will be
>> ambiguous. Are there any macros to define yes/no return values for these
>> functions? I would appreciate any thought here.
>>
>>
>>
>> Best,
>>
>> Jiefei
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

	[[alternative HTML version deleted]]



More information about the R-devel mailing list