[R] Why does R replace all row values with NAs

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Fri Feb 27 15:04:29 CET 2015


I know how to get the output I need, but I would benefit from an
explanation of why R behaves the way it does.

# I have a data frame x:
x = data.frame(a=1:10,b=2:11,c=c(1,NA,3,NA,5,NA,7,NA,NA,10))
x
# I want to drop the rows of x where c >= 6, but I don't want to drop
# the rows where c is NA.

subset(x,c<6) # Works, but drops the rows where c is NA; I understand why
x[which(x$c<6),] # Works, but drops the rows where c is NA; I understand why
x[-which(x$c>=6),] # output I need
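
A caveat on the negative which() form, as a minimal sketch: it gives the wanted output here only because at least one row matches. The threshold of 100 below is a hypothetical value chosen so that no row matches, and the names drop, empty and safe are illustrative only.

```r
# Same data frame as in the post.
x <- data.frame(a = 1:10, b = 2:11, c = c(1, NA, 3, NA, 5, NA, 7, NA, NA, 10))

# Hypothetical threshold: no value of c reaches 100, so which() is empty.
drop <- which(x$c >= 100)

# -integer(0) is still integer(0), so this selects zero rows
# instead of keeping all ten.
empty <- x[-drop, ]

# A logical index that treats NA as "do not drop" keeps every row here.
safe <- x[!(x$c >= 100 & !is.na(x$c)), ]
```

With the original threshold of 6 the logical form drops rows 7 and 10, the same rows that x[-which(x$c>=6),] drops.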

# Here is my question: why does the following line replace the values
# of all rows that contain an NA in x$c with NAs?

x[x$c<6,]  # Keeps the rows where c is NA, but turns every value in those rows into NA. Why?
x[(x$c<6) | is.na(x$c),] # output I need - I have to be super-explicit
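
The behavior in question can be reproduced step by step. This is a minimal sketch, not an authoritative account of R's internals; the names idx, na_rows and keep are introduced here for illustration. It relies on two documented facts: comparing NA with a number yields NA, and an NA in a logical row index selects an "unknown" row, which R returns as a row filled with NAs.

```r
# Same data frame as in the post.
x <- data.frame(a = 1:10, b = 2:11, c = c(1, NA, 3, NA, 5, NA, 7, NA, NA, 10))

# The comparison has three possible results per row: TRUE, FALSE or NA.
idx <- x$c < 6

# TRUE keeps the row, FALSE drops it, NA yields a row of NAs
# (R cannot tell which row was meant, so a, b and c all become NA).
na_rows <- x[idx, ]

# which() returns only the positions that are TRUE, so NA positions
# are dropped silently; that is why which() and subset() lose those rows.
which(idx)

# Turning NA into TRUE in the index keeps the original rows intact.
keep <- x[idx | is.na(idx), ]
```

Here na_rows has eight rows, five of them all-NA, while keep has the same eight positions with the values of a and b preserved.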

Thank you very much!

-- 
Dimitri Liakhovitski


