[R] assumptions about how things are done

Hi Avi,
Definitely a learning moment. I may consider writing an ifElse() for
my own use and sharing it if anyone wants it.

Jim

>
> This is supposed to be a forum for help so general and philosophical
> discussions belong elsewhere, or nowhere.
>
>
>
> Having said that, I want to make a brief point. Both new and experienced
> people make implicit assumptions about the code they use. Often nobody looks
> at how the sausage is made. The recent discussion of ifelse() made me take a
> look and I was not thrilled.
>
>
>
> My NAÏVE view was that ifelse() was implemented as a sort of loop construct.
> I mean if I have a vector of length N and perhaps a few other vectors of the
> same length, I might say:
>
>
>
> result <- ifelse(condition-on-vector-A, result-if-true-using-vectors,
> result-if-false-using-vectors)
>
>
>
> So say I want to take a vector of integers from 1 to N and make an output a
> second vector where you have either a prime number or NA. If I have a
> function called is.prime() that checks a single number and returns
> TRUE/FALSE, it might look like this:
>
>
>
> primed <- ifelse(is.prime(A), A, NA)
>
>
>
> So A[2] will be mapped to 2 and A[3] to 3, but A[1] (1 is not prime) and
> A[4], being composite, become NA, and so on.
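
A runnable sketch of the example above. Base R has no is.prime(), so the helper below is a hypothetical one written with simple trial division, and it is wrapped in Vectorize() because ifelse() needs a logical vector rather than a single TRUE/FALSE. It assumes the usual convention that 1 is not prime, so A[1] comes out as NA here.

```r
# Hypothetical helper: TRUE if a single whole number n is prime (trial division).
is.prime.one <- function(n) {
  if (n < 2) return(FALSE)          # 0 and 1 are not prime
  if (n < 4) return(TRUE)           # 2 and 3 are prime
  if (n %% 2 == 0) return(FALSE)    # other even numbers are composite
  d <- 3
  while (d * d <= n) {
    if (n %% d == 0) return(FALSE)
    d <- d + 2
  }
  TRUE
}

# ifelse() wants a logical vector, so vectorize the scalar test.
is.prime <- Vectorize(is.prime.one)

A <- 1:10
primed <- ifelse(is.prime(A), A, NA)
print(primed)
```

The result keeps 2, 3, 5 and 7 in place and has NA everywhere else.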
>
>
>
> If you wrote the above using loops, you would range over the indices from 1
> to N and apply the test to each element in turn. There are many complications,
> as R allows the vectors to differ in length and recycles the shorter ones as
> needed.
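
A minimal sketch of that loop-style view, for comparison. The name loop_ifelse is hypothetical, and the recycling is handled in the simplest way I could think of, with rep(); this is not how base R does it.

```r
# Naive element-by-element version of ifelse(), written as an explicit loop.
# loop_ifelse is a hypothetical name used only for illustration.
loop_ifelse <- function(test, yes, no) {
  n <- length(test)
  yes <- rep(yes, length.out = n)   # recycle shorter inputs to length n
  no  <- rep(no,  length.out = n)
  result <- rep(NA, n)              # R coerces the type on first assignment
  for (i in seq_len(n)) {
    if (is.na(test[i])) next        # leave NA where the test itself is NA
    result[i] <- if (test[i]) yes[i] else no[i]
  }
  result
}

A <- 1:6
looped <- loop_ifelse(A %% 2 == 0, A, 0)
print(looped)
```

On this input the loop keeps the even numbers and replaces the odd ones with 0, which is what ifelse(A %% 2 == 0, A, 0) returns as well.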
>
>
>
> What I found ifelse(), as implemented, to do is roughly this:
>
>
>
> Make a vector of the right length for the results, initially empty.
>
>
>
> Make a vector evaluating the condition so it is effectively a Boolean
> result.
>
> Calculate which indices are TRUE. Secondarily, calculate another set of
> indices that are false.
>
>
>
> Calculate ALL the THEN values and likewise ALL the ELSE values.
>
>
>
> Now copy into the result all the THEN values indexed by the TRUE above and
> then all the ELSE values indexed by the FALSE above.
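
The steps above can be sketched in a few lines of R. This is a simplified illustration under my own assumptions, not the actual source of base::ifelse(), which handles attributes and other cases that are omitted here; my_ifelse is a hypothetical name.

```r
# Simplified sketch of the steps described above (not the base R source).
my_ifelse <- function(test, yes, no) {
  test <- as.logical(test)               # the condition as a logical vector
  n <- length(test)
  result <- rep(NA, n)                   # result vector of the right length
  true_idx  <- which(test)               # indices where the test is TRUE
  false_idx <- which(!test)              # indices where the test is FALSE
  yes_all <- rep(yes, length.out = n)    # ALL the "then" values
  no_all  <- rep(no,  length.out = n)    # ALL the "else" values
  result[true_idx]  <- yes_all[true_idx]
  result[false_idx] <- no_all[false_idx]
  result
}

x <- c(4, -1, 9, NA)
labels <- my_ifelse(x > 0, "positive", "not positive")
print(labels)
```

Positions where the test is NA are in neither index set, so they stay NA in the result.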
>
>
>
> In plain English, make a result from two other results based on picking
>
>
>
> That is not a bad algorithm and, in a vectorized language like R, maybe even
> quite effective and efficient. It does, however, do a lot of extra work, since
> by definition it throws at least half of the computed values away.
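
A small demonstration of that extra work, using sqrt() on a vector with one negative value as an assumed example. The whole THEN vector is computed before anything is picked, so sqrt() sees the negative element and warns, even though that value is then discarded.

```r
x <- c(4, -1, 9)
caught <- character(0)   # collects any warning messages raised below

# sqrt(x) is evaluated for every element, including -1, before ifelse() picks.
res <- withCallingHandlers(
  ifelse(x >= 0, sqrt(x), NA),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))
    invokeRestart("muffleWarning")   # record the warning and carry on
  }
)

print(res)
print(caught)
```

The result itself is fine; the warning comes purely from the value that was computed and thrown away.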
>
>
>
> I suspect the implementation could be made much faster by doing some of the
> work internally in a language like C.
>
>
>
> But now that I know what this implementation did, I might have some qualms
> at using it in some situations. The original complaint led to other
> observations and needs and perhaps blindly using a supplied function like
> ifelse() may not be a decent solution for some needs.
>
>
>
> I note how I had to reorient my work elsewhere using a group of packages
> called the tidyverse when they added a function to allow rowwise
> manipulation of the data as compared to an ifelse-like method using all
> columns at once. There is room for many approaches and if a function may not
> be doing quite what you want, something else may better meet your needs OR
> you may want to see if you can copy the existing function and modify it for
>
>
>
> In the case we mentioned, the goal was to avoid printing selected warnings.
> Since the function is readable, it can easily be modified in a copy to find
> what is causing the warnings and either rewrite a bit to avoid them or start
> over with perhaps your own function that tests before doing things and
> avoids tripping the condition (generating a NaN) entirely.
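
A minimal sketch of the "test before doing things" approach, again assuming sqrt() of a negative number as the source of the NaN. The function name safe_sqrt is hypothetical and is not from the earlier discussion.

```r
# Hypothetical helper: compute sqrt() only where it is valid.
safe_sqrt <- function(x) {
  result <- rep(NA_real_, length(x))
  ok <- !is.na(x) & x >= 0       # test first
  result[ok] <- sqrt(x[ok])      # sqrt() never sees a negative value
  result
}

roots <- safe_sqrt(c(4, -1, 9, NA))
print(roots)
```

Because only the valid elements are ever passed to sqrt(), no NaN is generated and there is no warning to suppress.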
>
>
>
> Like many languages, R is a bit too rich. You can piggyback on the work of
> others but with some caution as they did not necessarily have you in mind
> with what they created.
