[R] any and all
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Fri Apr 12 23:59:33 CEST 2024
On 12/04/2024 3:52 p.m., avi.e.gross using gmail.com wrote:
> Base R has generic functions called any() and all() that I am having trouble
> using.
>
> It works fine when I play with it in a base R context as in:
>
> > all(any(TRUE, TRUE), any(TRUE, FALSE))
> [1] TRUE
> > all(any(TRUE, TRUE), any(FALSE, FALSE))
> [1] FALSE
>
> But in a tidyverse/dplyr environment, it returns wrong answers.
>
> Consider this example. I have joined data together so that one pair of
> columns represents a first generation and several other pairs represent
> additional generations. I want to count a pair as a success when at least
> one of its two values is not NA. But in order to keep the entire row, I want
> all three pairs to have some valid data. This seems like a fairly common and
> reasonable requirement when evaluating data.
>
> So to make it very general, I chose to do something a bit like this:
We can't really help you without a reproducible example. It's not
enough to show us something that doesn't run but is a bit like the real
code.
Duncan Murdoch
>
> result <- filter(mydata,
>                  all(
>                    any(!is.na(first.a), !is.na(first.b)),
>                    any(!is.na(second.a), !is.na(second.b)),
>                    any(!is.na(third.a), !is.na(third.b))))
>
> I apologize if the formatting is not seen properly. The above logically
> should work. And it should be extendable to scenarios where you want at
> least one of M columns to contain data as a group with N such groups of any
> size.
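
One likely explanation, offered as a sketch rather than a diagnosis of the
real code: any() and all() pool every element of every argument into a single
TRUE or FALSE, so inside filter() they produce one answer for the whole table
instead of one per row. The data frame below is a hypothetical stand-in that
only borrows the column names from the post; it uses base R alone.

```r
# Hypothetical stand-in for the poster's data: three pairs of columns.
mydata <- data.frame(
  first.a  = c(1, NA, NA),
  first.b  = c(NA, 2, NA),
  second.a = c(1, 2, 3),
  second.b = c(NA, NA, NA),
  third.a  = c(NA, 5, 6),
  third.b  = c(4, NA, NA)
)

# any() and all() reduce everything they are given to one logical value.
collapsed <- with(mydata, all(
  any(!is.na(first.a),  !is.na(first.b)),
  any(!is.na(second.a), !is.na(second.b)),
  any(!is.na(third.a),  !is.na(third.b))
))
length(collapsed)  # 1: a single answer for the whole table

# The element-wise operators | and & keep one value per row.
per_row <- with(mydata,
  (!is.na(first.a)  | !is.na(first.b)) &
  (!is.na(second.a) | !is.na(second.b)) &
  (!is.na(third.a)  | !is.na(third.b))
)
length(per_row)  # 3: one answer per row
```

In this stand-in the third row has no first-generation data, yet the collapsed
condition is TRUE, so a filter built on it would keep every row.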
>
> But since it did not work, I tried a plan that did work and feels silly. I
> used mutate() to make new columns such as:
>
> result <-
>   mydata |>
>   mutate(
>     usable.1 = (!is.na(first.a) | !is.na(first.b)),
>     usable.2 = (!is.na(second.a) | !is.na(second.b)),
>     usable.3 = (!is.na(third.a) | !is.na(third.b)),
>     usable = (usable.1 & usable.2 & usable.3)
>   ) |>
>   filter(usable == TRUE)
>
> The above wastes time and effort making new columns so I can check the
> calculations, then combines those columns into a Boolean that is used to
> filter the result.
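
The helper columns are probably not needed. A minimal sketch in base R,
assuming a made-up data frame with the same column names: any_present() is a
hypothetical helper (it is not part of base R or dplyr) that behaves like an
element-wise any(), returning one value per row.

```r
# Hypothetical helper: TRUE for each row where at least one of the
# supplied vectors is non-missing.
any_present <- function(...) {
  Reduce(`|`, lapply(list(...), function(x) !is.na(x)))
}

# Made-up data using the column names from the post.
mydata <- data.frame(
  first.a  = c(1, NA, NA),
  first.b  = c(NA, 2, NA),
  second.a = c(1, 2, 3),
  second.b = c(NA, NA, NA),
  third.a  = c(NA, 5, 6),
  third.b  = c(4, NA, NA)
)

# One logical value per row, built without adding any columns.
keep <- with(mydata,
  any_present(first.a, first.b) &
  any_present(second.a, second.b) &
  any_present(third.a, third.b)
)

# Base R row subsetting; drop = FALSE keeps the result a data frame.
result <- mydata[keep, , drop = FALSE]
```

Because the expression yields one value per row, the same condition should
also be usable directly as the argument to dplyr's filter().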
>
> I know this is not the place to discuss dplyr. I want to check first if I am
> doing anything wrong in how I use any/all. One guess is that the generic is
> altered or masked by dplyr or other packages I have loaded.
>
> And, of course, some aspects of delayed evaluation can interfere in subtle
> ways.
>
> I note I have had other problems with these base R functions before and
> generally solved them by not using them, as shown above. I would much rather
> use them, or something similar.
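
For the general case of N groups with any number of columns each, one possible
approach is to name the groups and let rowSums() do the per-row "any". This is
only a sketch: the groups list and the data frame are assumptions made up for
illustration.

```r
# Made-up data using the column names from the post.
mydata <- data.frame(
  first.a  = c(1, NA, NA),
  first.b  = c(NA, 2, NA),
  second.a = c(1, 2, 3),
  second.b = c(NA, NA, NA),
  third.a  = c(NA, 5, 6),
  third.b  = c(4, NA, NA)
)

# Hypothetical grouping: each element lists the columns of one group.
groups <- list(
  c("first.a", "first.b"),
  c("second.a", "second.b"),
  c("third.a", "third.b")
)

# Per group: TRUE where the row has at least one non-NA value.
group_ok <- lapply(groups, function(cols) {
  rowSums(!is.na(mydata[cols])) > 0
})

# Per row: TRUE only when every group passed.
keep <- Reduce(`&`, group_ok)
result <- mydata[keep, , drop = FALSE]
```

Adding a group or widening one then only means editing the groups list.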
>
>
> Avi
>
>
>