[R] [FORGED] Handling "NA" in summation
Rolf Turner
r.turner at auckland.ac.nz
Mon Sep 7 01:16:05 CEST 2015
On 07/09/15 10:22, Olu Ola via R-help wrote:
> Hello, I am currently working with a dataframe which has some missing
> values represented by "NA". Whenever I add two columns in which at
> least one of the pair of an observation is "NA", the sum returns
> zero. That is for the same observation, if
>
> dataframe$A = 20
> dataframe$B = NA
>
> dataframe$A + dataframe$B returns zero.
No it does not. It returns NA. As it should.
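A quick check of this, as a minimal sketch: the data frame `d` and its values are made up for illustration and are not the poster's data.

```r
# Hypothetical two-column data frame; names and values are illustrative only.
d <- data.frame(A = c(20, 5), B = c(NA, 3))

# Element-wise addition propagates missingness: 20 + NA is NA, not 0.
s <- d$A + d$B
s          # NA  8
is.na(s)   # TRUE FALSE
```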
> I do not want to delete the observations with the NA's. How do I go
> about carrying out the necessary operations without deleting the
> observations with the NA's.
Your question seems to demonstrate a substantial amount of confusion.
Amongst other things you probably want to deal with vectors (or perhaps
matrices) rather than data frames.
To sum a numeric vector, ignoring missing values, you can use the sum()
function, setting the argument "na.rm" to TRUE. E.g.
v <- c(1, NA, 2, NA, 3, NA, 4, NA)
sum(v, na.rm = TRUE)  # Gives 10.
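If what is wanted is instead a per-observation sum across two columns, one possibility is rowSums(), which also takes "na.rm". This is only a sketch under the assumption that the columns are numeric; the data frame `d` and the column name `total` are hypothetical.

```r
# Hypothetical data; the third observation is missing in both columns.
d <- data.frame(A = c(20, 5, NA), B = c(NA, 3, NA))

# Sum across columns A and B within each row, skipping NAs.
d$total <- rowSums(d[, c("A", "B")], na.rm = TRUE)
d$total   # 20 8 0
```

Note that a row in which every value is missing sums to 0 under na.rm = TRUE, which may or may not be what you want; no observations are deleted either way.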
Ignore other advice that you were given, to replace NAs in your data
frame (???) by zeroes. That is very dangerous, misleading and
confusing. "Missing" and "zero" are *VERY* different concepts.
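A small illustration of the difference, using the same example vector (the name `v0` is introduced here only for the comparison):

```r
v <- c(1, NA, 2, NA, 3, NA, 4, NA)

# Treat NAs as missing: the mean is taken over the 4 observed values.
m_missing <- mean(v, na.rm = TRUE)   # 2.5

# Replace NAs by zero: the mean is now taken over 8 values, 4 of them fake.
v0 <- v
v0[is.na(v0)] <- 0
m_zero <- mean(v0)                   # 1.25
```

The sums happen to agree (both are 10), but almost any other summary does not.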
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276