[R] a better method than a long expression with many OR clauses

Steve Lianoglou lianoglou.steve at gene.com
Tue Dec 17 21:13:36 CET 2013


Hi Chris,

(extra compelled to answer a Q from my undergrad alma mater :-)

see below:

On Tue, Dec 17, 2013 at 11:13 AM, Christopher W Ryan
<cryan at binghamton.edu> wrote:
> dd <- data.frame(longVariableName1=sample(1:4, 10, replace=TRUE),
> longVariableName2=sample(1:4, 10, replace=TRUE))
> dd
> # define who is a case and who is not
> transform(dd, case=(longVariableName1==3 | longVariableName2==3))
>
> But in reality I have 9 of those longVariableName variables,
> all of this pattern: alphaCauseX, where X is an integer 1:9.
> For any given observation, if any of them == 3, then case=TRUE
> Is there a shorter or more elegant way of doing this than
> typing out that long string of 9 OR clauses?
>
> I read about any(), but couldn't quite make that do what I want. Maybe
> I was using it wrong.

There are many ways to approach this; here is but one. The general idea is to:

(1) Create a logical matrix from the appropriate columns in `dd`
(2) Check to see which rows have any values == 3

Let's say columns 3:11 hold the variables you want to check. The code
below returns a logical vector with one element per row of `dd`, which
is TRUE if any value in that row == 3:

R> is.case <- rowSums(as.matrix(dd[, 3:11]) == 3) > 0

Unwind that one-liner into its individual parts to see who is doing
what there.
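
A rough sketch of that unwinding, in case it helps. The data frame here is
made up: it assumes nine columns named alphaCause1..alphaCause9 (the
pattern from your question) and picks them by name with grep() rather
than by position. The intermediate names (cols, is3, n3) are only for
illustration.

```r
# Hypothetical data: 10 rows, 9 columns named alphaCause1..alphaCause9
set.seed(1)
dd <- as.data.frame(matrix(sample(1:4, 90, replace = TRUE), ncol = 9))
names(dd) <- paste0("alphaCause", 1:9)

# Pick the columns by name instead of by position
cols <- grep("^alphaCause[1-9]$", names(dd))

# (1) Logical matrix: TRUE wherever a cell equals 3
is3 <- as.matrix(dd[, cols]) == 3

# (2a) Count the TRUEs in each row (TRUE counts as 1, FALSE as 0)
n3 <- rowSums(is3)

# (2b) A row is a case if it holds at least one 3
dd$case <- n3 > 0

# The same answer with any(), applied row by row
case.any <- apply(is3, 1, any)
stopifnot(identical(unname(dd$case), unname(case.any)))
```

If the columns can contain NA, rowSums(is3, na.rm=TRUE) keeps a missing
value from turning the whole row into NA.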

HTH,
-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech


