[R] I need to create new variables based on two numeric variables and one dichotomize conditional category
Jorgen Harmse
JHarmse using roku.com
Mon Nov 6 17:52:14 CET 2023
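Avi: Thank you for checking. I think the optimization is limited. If test is all TRUE or all FALSE then at most one of the yes and no vectors is evaluated; otherwise both are evaluated in full, even for elements that end up unused. Anything beyond that would be very complicated. (Inspect the two expressions and verify that both specify elementwise computations. Then use indexing to shrink the inputs properly. Take into account all recycling rules for binary operations.) In the first call below, log(-1:0) is evaluated in full, hence the warning, although only its second element is used; in the second call it is never evaluated.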
> ifelse(0:1, log(-1:0), 1:2)
Warning in log(-1:0) : NaNs produced
[1] 1 -Inf
> ifelse(c(FALSE,FALSE), log(-1:0), 1:2)
[1] 1 2
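A rough sketch of doing that indexing by hand, for the simple case where both branches are known to be elementwise and test has the same length as the data and contains no NA. The variables x, test, and out are made up for illustration; this is not how ifelse() itself works.

```r
# Made-up data: log() is only valid for the positive entries.
x <- -1:2
test <- x > 0

# Allocate the result, then evaluate each branch only where it is needed.
out <- numeric(length(x))
out[test] <- log(x[test])   # log() never sees -1 or 0, so no warning
out[!test] <- x[!test]      # the "no" branch just copies the input

# Same values as ifelse(test, log(x), x), which would warn about NaNs.
out
```

With recycling or NA values in test, the bookkeeping gets harder, which is the complication described above.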
I agree that nested ifelse is cumbersome. I wrote a function to address that:
#' Nested conditional element selection
#'
#' \code{ifelses(test1,yes1,test2,yes2,....,no)} is shorthand for
#' \code{ifelse(test1,yes1,ifelse(test2,yes2,....,no....))}. The inputs should
#' not be named.
#'
#' @param test1 usually \code{test} for the outer call to \code{\link{ifelse}}
#' @param yes1 \code{yes} for the outer call to \code{ifelse}
#' @param ... usually the \code{(test,yes)} pairs for nested calls, followed by
#' \code{no} for the innermost call to \code{ifelse}
#'
#' @note There must be an odd number of inputs. If there is exactly one input then it is
#' returned (unless it is named \code{yes1}): this supports the recursive implementation.
#'
#' @return a vector with entries from \code{yes1} where \code{test1} is \code{TRUE}, else from
#' \code{yes2} where \code{test2} is \code{TRUE}, ..., and from \code{no} where none of
#' the conditions holds
#'
#' @export
ifelses <- function(test1,yes1,...)
{ if (missing(test1))
  { if (!missing(yes1) || length(L <- list(...)) != 1L)
      stop("Wrong number of arguments or confusing argument names.")
    return(L[[1L]])
  }
  if (missing(yes1))
  { if (length(L <- list(...)) != 0L)
      stop("Wrong number of arguments or confusing argument names.")
    return(test1)
  }
  return( ifelse(test1, yes1, ifelses(...)) )
}
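A small usage sketch comparing the flat call with the equivalent nested ifelse(). The function is repeated in condensed form only so that the snippet runs on its own; the data and labels are made up for illustration.

```r
# Condensed copy of ifelses() from above, so this snippet is self-contained.
ifelses <- function(test1, yes1, ...) {
  if (missing(test1)) {
    if (!missing(yes1) || length(L <- list(...)) != 1L)
      stop("Wrong number of arguments or confusing argument names.")
    return(L[[1L]])
  }
  if (missing(yes1)) {
    if (length(L <- list(...)) != 0L)
      stop("Wrong number of arguments or confusing argument names.")
    return(test1)
  }
  ifelse(test1, yes1, ifelses(...))
}

x <- c(-2, 0, 3, 10)   # made-up data

# Conditions are tried in order; the last unpaired argument is the default.
flat <- ifelses(x < 0, "negative", x == 0, "zero", x < 5, "small", "large")

# The nested call that the flat form stands for.
nested <- ifelse(x < 0, "negative",
                 ifelse(x == 0, "zero",
                        ifelse(x < 5, "small", "large")))

identical(flat, nested)
```

As with plain ifelse(), every yes vector whose test has at least one TRUE is evaluated in full, so the flat form changes readability rather than the amount of computation.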
Regards,
Jorgen Harmse (not Jordan).
------------------------------
Message: 10
Date: Sat, 4 Nov 2023 01:08:03 -0400
From: <avi.e.gross using gmail.com>
To: "'Jorgen Harmse'" <JHarmse using roku.com>
Cc: <r-help using r-project.org>
Subject: Re: [R] [EXTERNAL] RE: I need to create new variables based
on two numeric variables and one dichotomize conditional category
variables.
Message-ID: <019a01da0edc$e41c39e0$ac54ada0$@gmail.com>
Content-Type: text/plain; charset="utf-8"
To be fair, Jordan, I think R has some optimizations so that the arguments
in some cases are NOT evaluated until needed. So only one or the other
choice ever gets evaluated for each row. My suggestion mainly has
typographic implications, plus some gain in clarity and slightly less
memory use and parsing.
But ifelse() is currently implemented somewhat too complexly for my taste.
Just type "ifelse" at the prompt and you will see many lines of code that
handle various scenarios.
If you later want to add categories such as "transgender" with a value of 61 or have other numbers for groups like "Hispanic male", you can amend the instructions as long as you put your conditions in an order so that they are tried until one of them matches, or it takes the default. Yes, in a sense the above is doable using a deeply nested ifelse(), but it is easier for me to read, write, and evaluate. It may or may not be more efficient, as some of dplyr is compiled code.