[R] Adding two or more columns of a data frame for each row when NAs are present.

Ian Strang hamamelis at ntlworld.com
Sun Nov 20 21:38:11 CET 2011


I am fairly new to R and would like help with the problem below. I am 
trying to sum and count several columns in the data frame yy below. All 
works well in Example 1. When I try to add the columns with an NA in Q21 
(Example 2), I get NA as mySum. I would like NA to be treated as 0, or 
ignored. I wrote a function, NifAvail (Example 3), to try to count an NA 
element as 0. It runs with a few warnings (Example 4), but still gives NA 
instead of the sum when there is an NA in an element (Example 5).

In Examples 6 and 7, I tried using sum(), but I think it just sums the 
whole data frame.

How do I add together several columns, giving the result for each row in 
mySum? NA should be treated as 0. Please note that I do not want to sum all 
the columns, as I think rowSums would do, just the selected ones.

Thanks for your help.
Ian,

 > yy <- read.table( header = T, sep=",", text =     ## to create a data frame
+ "Q20, Q21, Q22, Q23, Q24
+  0,1, 2,3,4
+  1,NA,2,3,4
+  2,1, 2,3,4")
+  yy
   Q20 Q21 Q22 Q23 Q24
1   0   1    2   3   4
2   1  NA   2   3   4
3   2   1    2   3   4

 > x <- transform( yy,     ############## Example 1
+   mySum = as.numeric(Q20) + as.numeric(Q22) + as.numeric(Q24),
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
+ x
   Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1    2   3   4     6       3
2   1  NA   2   3   4     7       2
3   2   1    2   3   4     8       3
 >
+ x <- transform( yy,     ################ Example 2
+   mySum = as.numeric(Q20) + as.numeric(Q21) + as.numeric(Q24),
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
+ x
   Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1    2   3   4     5       3
2   1  NA   2   3   4    NA       2
3   2   1    2   3   4     7       3

 > NifAvail <- function(x) { if (is.na(x)) x<-0 else x <- x   ############### Example 3
+   return(as.numeric(x))
+ } #end function
+ NifAvail(5)
[1] 5
+ NifAvail(NA)
[1] 0

 > x <- transform( yy,
+   mySum = NifAvail(Q20) + NifAvail(Q22) + NifAvail(Q24),    ############### Example 4
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
Warning messages:
1: In if (is.na(x)) x <- 0 else x <- x :
   the condition has length > 1 and only the first element will be used
2: In if (is.na(x)) x <- 0 else x <- x :
   the condition has length > 1 and only the first element will be used
3: In if (is.na(x)) x <- 0 else x <- x :
   the condition has length > 1 and only the first element will be used
 > x
   Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1    2   3   4     6       3
2   1  NA   2   3   4     7       2
3   2   1    2   3   4     8       3
 > x <- transform( yy,
+   mySum = NifAvail(Q20) + NifAvail(Q21) + NifAvail(Q24),     ################ Example 5
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
Warning messages:
1: In if (is.na(x)) x <- 0 else x <- x :
   the condition has length > 1 and only the first element will be used
2: In if (is.na(x)) x <- 0 else x <- x :
   the condition has length > 1 and only the first element will be used
3: In if (is.na(x)) x <- 0 else x <- x :
   the condition has length > 1 and only the first element will be used
 > x
   Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1    2   3   4     5       3
2   1  NA   2   3   4    NA       2
3   2   1    2   3   4     7       3
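
The warnings in Examples 4 and 5 appear because if () tests a single 
condition, so only the first element of each column is checked; the NA in 
the second row of Q21 is never replaced. A vectorized replacement such as 
ifelse() checks every element. What follows is only a minimal sketch: the 
helper name na_to_zero is hypothetical (not from the post), and the data 
frame is rebuilt with data.frame() so the block stands alone.

```r
# Same values as the yy data frame in the post.
yy <- data.frame(Q20 = c(0, 1, 2),
                 Q21 = c(1, NA, 1),
                 Q22 = c(2, 2, 2),
                 Q23 = c(3, 3, 3),
                 Q24 = c(4, 4, 4))

# Hypothetical helper: replace every NA in a vector with 0.
# ifelse() is vectorized, so each element is tested, not just the first.
na_to_zero <- function(x) {
  x <- as.numeric(x)
  ifelse(is.na(x), 0, x)
}

x <- transform(yy,
  mySum = na_to_zero(Q20) + na_to_zero(Q21) + na_to_zero(Q24)
)
x$mySum   # 5 5 7
```

With this version, row 2 gives 1 + 0 + 4 = 5 rather than NA.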


 > x <- transform( yy,                                        ############ Example 6
+   mySum = sum(as.numeric(Q20), as.numeric(Q21), as.numeric(Q23), na.rm=T),
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
+ x
   Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1    2   3   4    14       3
2   1  NA   2   3   4    14       2
3   2   1    2   3   4    14       3

 > x <- transform( yy,                                       ############# Example 7
+   mySum = sum(as.numeric(Q20), as.numeric(Q22), as.numeric(Q23), na.rm=T),
+   myCount = as.numeric(!is.na(Q20))+as.numeric(!is.na(Q21))+as.numeric(!is.na(Q24))
+ )
+ x
   Q20 Q21 Q22 Q23 Q24 mySum myCount
1   0   1    2   3   4    18       3
2   1  NA   2   3   4    18       2
3   2   1    2   3   4    18       3
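
On the question of summing only selected columns: sum() collapses 
everything it is given into one number, which transform() then repeats down 
the column. That would explain the 14 and 18 above (for example, 14 is the 
total of all non-NA values in Q20, Q21 and Q23). rowSums() does not have to 
be given the whole data frame; it can be applied to a subset of columns. 
One possible approach, sketched under the assumption that the data match 
yy (rebuilt here with data.frame() so the block stands alone); the names 
cols and x are illustrative only.

```r
# Same values as the yy data frame in the post.
yy <- data.frame(Q20 = c(0, 1, 2),
                 Q21 = c(1, NA, 1),
                 Q22 = c(2, 2, 2),
                 Q23 = c(3, 3, 3),
                 Q24 = c(4, 4, 4))

# Columns to include in the per-row sum and count.
cols <- c("Q20", "Q21", "Q24")

x <- yy
# Per-row sum over the selected columns only; na.rm = TRUE skips NA,
# which has the same effect on the sum as treating NA as 0.
x$mySum <- rowSums(yy[, cols], na.rm = TRUE)
# Per-row count of non-missing values in the same columns.
x$myCount <- rowSums(!is.na(yy[, cols]))

x$mySum     # 5 5 7
x$myCount   # 3 2 3
```

Changing cols to other column names changes which columns take part, 
without touching the rest of the data frame.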


