# [R] x %>% y as an alternative to which( x > y)

Timothy Bates timothy.c.bates at gmail.com
Tue Sep 13 23:17:37 CEST 2011

```Dear Duncan and Hadley,

I stumbled across the NA behavior of subset a little while ago and thought it might do the trick. But my common use case is not getting a subset sans NAs, but setting values in the whole dataframe.

So I need a TRUE/FALSE value at each row, not just the list of rows that match...

How would you do this with subset?

data[data$YOB < 1908 & !is.na(data$YOB), "Age"] = NA
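
# A minimal sketch of why the !is.na() guard above is needed. The toy data
# frame `d` is invented for illustration and is not from the original data.
d <- data.frame(YOB = c(1900, NA, 1950), Age = c(111, 40, 61))
d$YOB < 1908    # the comparison propagates the NA in row 2
# Without the guard, the assignment fails, because data frames do not allow
# missing values in subscripted assignments:
# d[d$YOB < 1908, "Age"] <- NA
d[d$YOB < 1908 & !is.na(d$YOB), "Age"] <- NA    # with the guard, only row 1 is changed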

My %<% idea extends the vocabulary established by %in%, and works in the same grammatical situation.

Here's a real example:

# Fix missing T2 sex for same sex pairs...

twinData[twinData$Age %<% 12, "flynnEffect"] = FALSE # only set flynn F for people under 12, not inc NAs

Addressing Duncan's point about returning a logical array... the %<% function should be:

"%<%" <- function(table, x){
    lessThan = table < x
    lessThan[is.na(lessThan)] = FALSE   # treat NA comparisons as non-matches
    return(lessThan)
}

This also works for matrices as it should

> x = matrix(c(1:10,NA,12:20),nrow=2)
> x %<% 6
     [,1] [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
[1,] TRUE TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[2,] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
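
# A sketch of the greater-than companion named in the subject line, written by
# analogy with %<% above. This version is an assumption, not code from the thread,
# and the vector `age` is a made-up example.
"%>%" <- function(table, x) {
    greaterThan <- table > x
    greaterThan[is.na(greaterThan)] <- FALSE   # treat NA comparisons as non-matches
    greaterThan
}
age <- c(5, NA, 20)
age > 12          # the bare comparison keeps the NA in position 2
age %>% 12        # NA becomes FALSE, so the result is safe as a logical index
which(age > 12)   # which() also skips NAs, but returns positions rather than T/F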

On Sep 13, 2011, at 8:40 PM, Hadley Wickham wrote:

>> Because in coding, I often end up with big chunks looking like this:
>>
>> ((mydataframeName$myvariableName > 2 & !is.na(mydataframeName$myvariableName)) & (mydataframeName$myotherVariableName == "male" & !is.na(mydataframeName$myotherVariableName)))
>>
>> Which is much less readable/maintainable/editable than
>>
>> mydataframeName$myvariableName > 2 & mydataframeName$myotherVariableName == "male"
>
> Use subset:
>
> subset(mydataframeName, myvariableName > 2 & myotherVariableName == "male")
>
> (subset automatically treats NAs as false)
>
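
# A minimal sketch of the NA handling described in the quoted reply. The toy
# data frame `d2` and its column names are invented for illustration.
d2 <- data.frame(score = c(3, NA, 1, 5),
                 sex   = c("male", "male", "male", NA))
kept <- subset(d2, score > 2 & sex == "male")
kept    # only row 1 survives; rows where the condition evaluates to NA are dropped
# Note that subset() returns the matching rows themselves, not a TRUE/FALSE
# value per row, so it cannot be used directly as an index for assignment.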