[R] 2 small problems: integer division and the nature of NA
Gabor Grothendieck
ggrothendieck at myway.com
Fri Feb 4 20:48:44 CET 2005
Denis Chabot <chabotd <at> globetrotter.net> writes:
: The sum of a vector having at least one NA but also valid data gives NA
: if we do not specify na.rm=T. But with na.rm=T, we are telling sum to
: give the sum of valid data, ignoring NAs that do not tell us anything
: about the value of a variable. I found out while getting the sum of
: small subsets of my data (such as when subsetting by several
: variables), sometimes a "cell" only contained NAs for my response
: variable. I would have expected the sum to be NA in such cases, as I do
: not have a single data point telling me the value of my response here.
: But R tells me the sum was zero in that cell! Was this behavior
: considered "desirable" when sum was built? If not, any hope it will be
: fixed?
Think of it this way: if u and v are index vectors, then it's desirable that
sum(x[u]) + sum(x[v]) == sum(x[c(u,v)])
hold for zero-length index vectors too, in which case
sum(numeric()) should be zero, not NA.
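
To make the identity concrete, here is a small check. The values of x, u and v are made up for illustration; they are not from the original question.

```r
# Illustrative data (assumed, not from the original post)
x <- c(2, 4, 6, 8)
u <- c(1, 3)
v <- integer(0)            # zero-length index vector

lhs <- sum(x[u]) + sum(x[v])
rhs <- sum(x[c(u, v)])
lhs == rhs                 # TRUE, because sum(x[v]) is 0

# If the empty sum were NA, the left-hand side would become NA
# and the identity would no longer hold:
sum(x[u]) + NA             # NA
```
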
If you want a short expression that gives NA for zero-length x, try this:
sum(x) + if (length(x)) 0 else NA
or define your own function, sum0, say.
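
A minimal sketch of what such a sum0 could look like. The na.rm argument and the choice to drop NAs before testing the length are my assumptions, added so that a cell containing only NAs also yields NA, which is the case raised in the question; adjust to taste.

```r
# sum0: like sum(), but returns NA when there is nothing left to add up.
# Hypothetical helper; the na.rm handling is an assumption, not base R behavior.
sum0 <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]          # drop NAs first, as sum(na.rm = TRUE) would
  if (length(x) == 0) return(NA_real_)  # no data points at all: report NA
  sum(x)
}

sum0(numeric())                   # NA
sum0(c(1, 2, NA), na.rm = TRUE)   # 3
sum0(c(NA, NA), na.rm = TRUE)     # NA, where sum() would give 0
```

Note that with this definition the additivity property above no longer holds for empty index vectors, so it is probably best kept for reporting rather than for intermediate calculations.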