[R] sum() with na.rm=TRUE, again
Richards, Tom
richards at upci.pitt.edu
Thu Apr 25 17:25:32 CEST 2002
Hi:
I remember a post several days ago by Jon Baron, concerning the
behavior of sum() when one sets na.rm=TRUE:
the result will be a zero sum for a vector of all NA's, as here, for the
second row:
> ss<- data.frame(x=c(1,NA,3,4),y=c(2,NA,4,NA))
> ss
   x  y
1  1  2
2 NA NA
3  3  4
4  4 NA
> apply(ss,1,sum,na.rm=TRUE)
1 2 3 4
3 0 7 4
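As far as I can tell, the zero does not come from apply() itself. A small sketch of what seems to happen for the second row (the names row2 and kept are just for illustration): na.rm=TRUE drops the NA's first, which leaves an empty vector, and sum() of an empty vector is 0.

```r
# Second row of ss: every value is missing
row2 <- c(NA_real_, NA_real_)

# Dropping the NA's by hand leaves a zero-length numeric vector
kept <- row2[!is.na(row2)]
length(kept)

# sum() over nothing is 0, and that is also what na.rm=TRUE gives
sum(kept)
sum(row2, na.rm = TRUE)
```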
I am rather alarmed by that zero, because I was just about to place the sum
function into an apply() on a rather large data management project, where
about 5% of my matrix rows have two missing values. Is there a "safe" way
to use sum(), so that such zeroes are not created? A safe.sum() that takes
arguments just as general as sum()? I mean, I think I could get around this
little problem like this,
apply(ss,1,function(x){ifelse(all(is.na(x)),NA,sum(!is.na(x))*mean(x,na.rm=TRUE))})
 1  2  3  4
 3 NA  7  4
but is there a safer way to write a sum() function? Or, do these zeroes
serve some purpose that I am missing?
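To make the question concrete, here is a rough sketch of the kind of wrapper I have in mind. The name safe.sum is my own and hypothetical, not something in base R, and the rule it applies (return NA only when at least one value was supplied and all of them are missing) is an assumption about what "safe" should mean here.

```r
# Hypothetical helper, not part of base R.
# Takes the same kind of arguments as sum(), but returns NA when
# every supplied value is missing instead of returning 0.
safe.sum <- function(..., na.rm = TRUE) {
  x <- unlist(list(...))            # flatten all arguments into one vector
  if (length(x) > 0 && all(is.na(x))) {
    return(NA)                      # nothing observed: report missing
  }
  sum(x, na.rm = na.rm)             # otherwise defer to sum()
}

ss <- data.frame(x = c(1, NA, 3, 4), y = c(2, NA, 4, NA))
res <- apply(ss, 1, safe.sum)
res
```

On the ss example this gives 3, NA, 7, 4 for the four rows. A genuinely empty input still sums to 0, as with sum().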
Thanks in advance...
Tom
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._