[R] data-management: Rowwise NA

Marc Schwartz marc_schwartz at me.com
Thu Jun 3 21:45:58 CEST 2010


On Jun 3, 2010, at 2:20 PM, moleps wrote:

> Dear R'ers,
> 
> In this mock dataset, how can I generate a logical variable based on whether test or test3 (and only those two) is NA in each row?
> 
> test<-sample(c("A",NA,"B"),100,replace=T)
> test2<-sample(c("A",NA,"B"),100,replace=T)
> test3<-sample(c("A",NA,"B"),100,replace=T)
> 
> tes<-cbind(test,test2,test3)
> 
> sam<-c("test","test3")
> apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))
> 
> However, this just tests whether each variable is missing or not per row. I'd like an -or- function here that would provide one TRUE/FALSE per row based on whether test or test3 is NA. I guess it would be easy to do by subsetting in this example, but I figure there is a more elegant way of doing it when -sam- contains 50 variables...


How about this:

set.seed(1)
test <- sample(c("A", NA, "B"), 100, replace = TRUE)
test2 <- sample(c("A", NA, "B"), 100, replace = TRUE)
test3 <- sample(c("A", NA, "B"), 100, replace = TRUE)

tes <- cbind(test, test2, test3)

> str(tes)
 chr [1:100, 1:3] "A" NA NA "B" "A" "B" "B" NA NA ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "test" "test2" "test3"

> head(tes)
     test test2 test3
[1,] "A"  NA    "A"  
[2,] NA   NA    "A"  
[3,] NA   "A"   NA   
[4,] "B"  "B"   "A"  
[5,] "A"  NA    "A"  
[6,] "B"  "A"   NA   


sam <- c("test","test3")

> rowSums(is.na(subset(tes, select = sam))) > 0
  [1] FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
 [12] FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE
 [23]  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [34]  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE
 [45]  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE
 [56] FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
 [67]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE
 [78]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
 [89] FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE
[100]  TRUE


Since is.na() and rowSums() each operate on the whole matrix at once, this avoids the row-by-row looping involved in calling apply().
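
For comparison, here is a minimal sketch of the rowSums() approach next to an apply()-based equivalent, so you can check that they agree. The object names any_na_vec, any_na_loop and all_na_vec are made up for illustration, and I have used tes[, sam, drop = FALSE] in place of subset(), which should select the same columns of the matrix:

```r
set.seed(1)
test  <- sample(c("A", NA, "B"), 100, replace = TRUE)
test2 <- sample(c("A", NA, "B"), 100, replace = TRUE)
test3 <- sample(c("A", NA, "B"), 100, replace = TRUE)

tes <- cbind(test, test2, test3)
sam <- c("test", "test3")

# Vectorized: count the NAs per row across the selected columns,
# then flag rows with at least one NA
any_na_vec <- rowSums(is.na(tes[, sam, drop = FALSE])) > 0

# Loop-based equivalent: apply() calls the function once per row
any_na_loop <- apply(tes[, sam, drop = FALSE], 1,
                     function(x) any(is.na(x)))

# Both approaches should give the same answer for every row
stopifnot(all(any_na_vec == any_na_loop))

# Variant: TRUE only where every selected column is NA
all_na_vec <- rowSums(is.na(tes[, sam, drop = FALSE])) == length(sam)
```

The drop = FALSE keeps the result a matrix even if sam names a single column, so rowSums() still works in that case.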

HTH,

Marc Schwartz


