[R] [FORGED] Handling "NA" in summation

Rolf Turner r.turner at auckland.ac.nz
Mon Sep 7 01:16:05 CEST 2015


On 07/09/15 10:22, Olu Ola via R-help wrote:
> Hello, I am currently working with a data frame which has some missing
> values represented by "NA". Whenever I add two columns in which at
> least one value of the pair for an observation is "NA", the sum returns
> zero. That is, for the same observation, if
>
> dataframe$A = 20
> dataframe$B = NA
>
> dataframe$A + dataframe$B  returns zero.

No it does not.  It returns NA.  As it should.
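
A minimal sketch of the point above, using a hypothetical two-row data frame 
(the name "df" and its values are made up for illustration, not taken from the 
original poster's data):

```r
# Hypothetical data frame mirroring the example in the question.
df <- data.frame(A = c(20, 5), B = c(NA, 3))

# Element-wise addition propagates missing values:
# any pair containing an NA yields NA, not zero.
s <- df$A + df$B

print(s)        # first element is NA, second is 8
is.na(s[1])     # TRUE
```

If a zero really does appear in such a sum, it is probably coming from some 
other step in the code, not from "+" itself.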

> I do not want to delete the observations with the NA's. How do I go
> about carrying out the necessary operations without deleting the
> observations with the NA's?

Your question seems to demonstrate a substantial amount of confusion.

Amongst other things you probably want to deal with vectors (or perhaps 
matrices) rather than data frames.

To sum a numeric vector, ignoring missing values, you can use the sum() 
function, setting the argument "na.rm" to TRUE.  E.g.

    v <- c(1,NA,2,NA,3,NA,4,NA)
    sum(v,na.rm=TRUE) # Gives 10.
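
If what is actually wanted is a per-observation total across two columns, one 
possible approach (an assumption about the intent, since the question does not 
say) is rowSums(), which also accepts "na.rm".  The data frame "df" below is 
hypothetical:

```r
# Hypothetical data; the third row is missing in both columns.
df <- data.frame(A = c(20, 5, NA), B = c(NA, 3, NA))

# Column total, ignoring missing values.
col_total <- sum(df$A, na.rm = TRUE)

# Per-row totals over the two columns, ignoring missing values.
row_total <- rowSums(df[, c("A", "B")], na.rm = TRUE)

print(col_total)
print(row_total)
```

Note that with na.rm=TRUE a row in which *every* value is missing sums to 0, 
so such rows may need to be checked for separately, e.g. with is.na().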

Ignore other advice that you were given, to replace NAs in your data 
frame (???) by zeroes.  That is very dangerous, misleading and 
confusing.  "Missing" and "zero" are *VERY* different concepts.
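
A small sketch of why the two are not interchangeable, reusing the vector from 
the example above (the names "v0", "m_ignore" and "m_zero" are just 
illustrative):

```r
v <- c(1, NA, 2, NA, 3, NA, 4, NA)

# Mean of the observed values only: 10 / 4.
m_ignore <- mean(v, na.rm = TRUE)

# Replace the NAs by zeroes, then take the mean: 10 / 8.
v0 <- v
v0[is.na(v0)] <- 0
m_zero <- mean(v0)

print(m_ignore)  # 2.5
print(m_zero)    # 1.25
```

The sum happens to agree in both cases, but the mean (and anything else that 
depends on the number of observations) does not.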

cheers,

Rolf Turner


-- 
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


