[R] data-management: Rowwise NA
Marc Schwartz
marc_schwartz at me.com
Thu Jun 3 21:45:58 CEST 2010
On Jun 3, 2010, at 2:20 PM, moleps wrote:
> Dear R'ers,
>
> In this mock dataset, how can I generate a logical variable indicating whether test or test3 is NA in each row?
>
> test<-sample(c("A",NA,"B"),100,replace=T)
> test2<-sample(c("A",NA,"B"),100,replace=T)
> test3<-sample(c("A",NA,"B"),100,replace=T)
>
> tes<-cbind(test,test2,test3)
>
> sam<-c("test","test3")
> apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))
>
> However, this just tests whether each variable is missing or not per row. I'd like an -or- function in here that would provide one TRUE/FALSE per row based on whether test or test3 is NA. I guess it would be easy to do by subsetting in the example, but I figure there is a more elegant way of doing it when -sam- contains 50 variables...
How about this:
set.seed(1)
test <- sample(c("A", NA, "B"), 100, replace = TRUE)
test2 <- sample(c("A", NA, "B"), 100, replace = TRUE)
test3 <- sample(c("A", NA, "B"), 100, replace = TRUE)
tes <- cbind(test, test2, test3)
> str(tes)
 chr [1:100, 1:3] "A" NA NA "B" "A" "B" "B" NA NA ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "test" "test2" "test3"
> head(tes)
     test test2 test3
[1,] "A"  NA    "A"
[2,] NA   NA    "A"
[3,] NA   "A"   NA
[4,] "B"  "B"   "A"
[5,] "A"  NA    "A"
[6,] "B"  "A"   NA
sam <- c("test","test3")
> rowSums(is.na(subset(tes, select = sam))) > 0
  [1] FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
 [12] FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE
 [23]  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [34]  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE
 [45]  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE
 [56] FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
 [67]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE
 [78]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
 [89] FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE
[100]  TRUE
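This avoids the looping involved in calling apply(): is.na() returns a logical matrix for the selected columns, and rowSums() counts the TRUE values (the NAs) in each row, so a count greater than 0 means at least one of the columns in -sam- is NA for that row.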
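For comparison, here is a minimal sketch that puts the rowSums() approach next to an apply()-based version using any(), and checks that the two agree. The object names any_na_fast and any_na_loop are made up for illustration, and plain matrix indexing with drop = FALSE is used in place of subset(), which should select the same columns. Note that the exact TRUE/FALSE values depend on the random number generator, so they may not match the output shown above on a different R version.

```r
# Rebuild the mock data from the post
set.seed(1)
test  <- sample(c("A", NA, "B"), 100, replace = TRUE)
test2 <- sample(c("A", NA, "B"), 100, replace = TRUE)
test3 <- sample(c("A", NA, "B"), 100, replace = TRUE)
tes <- cbind(test, test2, test3)

# Columns to check; this could just as well hold 50 names
sam <- c("test", "test3")

# Vectorized version: count NAs per row over the selected columns.
# drop = FALSE keeps a matrix even if sam has only one name.
any_na_fast <- rowSums(is.na(tes[, sam, drop = FALSE])) > 0

# Row-by-row version: any() collapses each row's is.na() result
# to a single TRUE/FALSE (hypothetical alternative, not from the post)
any_na_loop <- apply(tes[, sam, drop = FALSE], 1,
                     function(x) any(is.na(x)))

# Both give one logical per row, and they should agree
stopifnot(length(any_na_fast) == nrow(tes),
          all(any_na_fast == any_na_loop))
```

If the test should instead be whether all of the selected columns are NA in a row, the comparison can be changed from > 0 to == length(sam).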
HTH,
Marc Schwartz