[R] Sum function and missing values --- need to mimic SAS sum function

Allen Bingham aebingham2 at gmail.com
Mon Jan 26 22:49:12 CET 2015


Don,

The default for the sum function is to NOT remove NA before summing (i.e.,
option na.rm=FALSE). Here are the results with na.rm=TRUE:

> sum(NA,na.rm=TRUE)
[1] 0
> sum(c(NA,NA),na.rm=TRUE)
[1] 0
> sum(rep(NA,10),na.rm=TRUE)
[1] 0
> sum(as.numeric(letters[1:4]),na.rm=TRUE)
[1] 0
Warning message:
NAs introduced by coercion 

Hope that explains it a bit better.
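
The 0 comes from the empty sum: with na.rm=TRUE every NA is dropped first,
and the sum of a zero-length vector is 0. A minimal check in base R (the
variable names x and remaining are just for illustration):

```r
# With na.rm=TRUE, sum() drops every NA before adding
x <- c(NA_real_, NA_real_)
remaining <- x[!is.na(x)]   # what is left once the NAs are removed

length(remaining)           # 0: nothing left to add up
sum(remaining)              # 0: the sum of an empty vector
sum(x, na.rm = TRUE)        # 0: same result
```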

Others have replied with suggested solutions to my 'problem', and the one by
John Fox is what I need (an actual function that I can use in an apply
statement), although the suggested code by Sven Templer is appealing in its
simplicity.
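
For the archive, here is a rough sketch of the kind of function I mean. It
is not necessarily the code John Fox posted; the name sas_sum and the small
data frame are made up for illustration. It returns NA when every element is
NA and otherwise sums the non-missing elements:

```r
# Hypothetical helper: NA only when every element is NA,
# otherwise the sum of the non-missing elements
sas_sum <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

# Made-up data for illustration only
df <- data.frame(Variable.1 = c(1, NA, NA),
                 Variable.2 = c(2, 5, NA))

# Row-wise sums; the last row has no non-missing values
df$SumValue <- apply(df[, c("Variable.1", "Variable.2")], 1, sas_sum)
df$SumValue   # values: 3 5 NA
```

Because it is an ordinary function of one vector, it can be passed to apply
directly, with no need to subset out the all-NA rows first.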

Allen
-----Original Message-----
From: MacQueen, Don [mailto:macqueen1 at llnl.gov] 
Sent: Monday, January 26, 2015 1:03 PM
To: Allen Bingham; r-help at r-project.org
Subject: Re: [R] Sum function and missing values --- need to mimic SAS sum
function

I'm a little puzzled by the assertion that the result is 0.0 when all the
elements are NA:

> sum(NA)
[1] NA

> sum(c(NA,NA))
[1] NA

> sum(rep(NA, 10))
[1] NA

> sum(as.numeric(letters[1:4]))
[1] NA
Warning message:
NAs introduced by coercion


Considering that the example snippet of code has several other aspects
besides using sum(), among them subsetting rows of a data frame when there
are apparently NAs in some of its variables ... I wonder if the reason for
the failure of that snippet has been misunderstood?


--
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/25/15, 3:21 PM, "Allen Bingham" <aebingham2 at gmail.com> wrote:

>I understand that in order to get the sum function to ignore missing 
>values I need to supply the argument na.rm=TRUE. However, when summing 
>numeric values in which ALL components are "NA" ... the result is 0.0 
>... instead of NA (which is what I would get from SAS, where it is 
>shown as ".").
>
>Accordingly, I've had to go to 'extreme' measures to get the sum 
>function to result in NA if all arguments are missing (otherwise give 
>me a sum of all non-NA elements).
>
>So for example here's a snippet of code that ALMOST does what I want:
>
>SumValue <- apply(subset(InputDataFrame,
>                         !is.na(Variable.1) | !is.na(Variable.2),
>                         select = c(Variable.1, Variable.2)),
>                  1, sum, na.rm = TRUE)
>
>In reality this does NOT give me records with NA for SumValue. Instead, 
>it drops any records in which both Variable.1 and Variable.2 are NA, 
>which is "good enough" for my purposes.
>
>I'm guessing with a little more work I could come up with a way to 
>adapt the code above so that I could get it to work like SAS's sum 
>function ...
>
>... but before I go that extra mile I thought I'd ask others if they 
>know of functions in either base R ... or in a package that will better 
>mimic the SAS sum function.
>
>Any suggestions?
>
>Thanks.
>______________________________________
>Allen Bingham
>aebingham2 at gmail.com
>


