[R] subsetting with NA's

David Kane <David Kane a296180 at mica.fmr.com
Mon Apr 8 12:38:53 CEST 2002


Hi,

I often have large dataframes with many variables and many NA's, from which I
would like to subset out some rows. Here is a toy example:

> x <- data.frame(a = c("x", "y", "z"), b = c(1, NA, 5))
> x
  a  b
1 x  1
2 y NA
3 z  5

I realize that, if I know the values in x$b that I want to subset, things are easy:

> x[x$b %in% c(1),]
  a b
1 x 1

However, if I only know the *range*, then the NA's will flummox me.

> x[x$b < 3,]
    a  b
1   x  1
NA NA NA
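
A minimal sketch of where that all-NA row comes from, using the same toy data. The names idx and res are only for illustration. The comparison yields NA for the missing value, and an NA in a logical row index selects a row of NAs rather than dropping the row.

```r
# Toy data from the example above
x <- data.frame(a = c("x", "y", "z"), b = c(1, NA, 5))

# The comparison propagates NA: the second element is neither TRUE nor FALSE
idx <- x$b < 3
print(idx)

# An NA in a logical row index selects a row filled with NAs,
# so the result has two rows instead of one
res <- x[idx, ]
print(nrow(res))
```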

Of course, I can explicitly avoid the NA's by doing something like:

> x[x$b < 3 & ! is.na(x$b),]
  a b
1 x 1

My problem is that this sort of syntax can become quite annoying when there are
many variables in the subsetting expression. That is, I want to write
something like:

x[x$b < 3 & x$c > 5 & x$d > 100,]

without having to write:

x[! is.na(x$b) & ! is.na(x$c) & ! is.na(x$d) & x$b < 3 & x$c > 5 & x$d > 100,]

Is there a trick for achieving this, for ignoring all NA's during subsetting?
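
Two candidates that may be worth trying, sketched here under assumptions: the columns c and d and all of their values are made up for illustration, and r1 and r2 are hypothetical names. which() returns only the positions where the condition is TRUE, and subset() is documented to treat missing values in the condition as false, so in both cases rows with an NA in the test should be dropped without any explicit is.na() calls.

```r
# Hypothetical data: b, c and d each contain one NA
x <- data.frame(b = c(1, NA, 2, 5),
                c = c(6, 7, NA, 9),
                d = c(200, 300, 400, NA))

# which() keeps only the positions that are TRUE; NA and FALSE are both dropped
r1 <- x[which(x$b < 3 & x$c > 5 & x$d > 100), ]

# subset() treats NA in the condition as FALSE and avoids repeating x$
r2 <- subset(x, b < 3 & c > 5 & d > 100)

print(r1)
print(r2)
```

Only the first row satisfies all three conditions with no missing values, so both calls should return that single row.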

To the extent that it matters: 

> version
         _                   
platform sparc-sun-solaris2.6
arch     sparc               
os       solaris2.6          
system   sparc, solaris2.6   
status   Patched             
major    1                   
minor    4.0                 
year     2002                
month    01                  
day      13                  
language R                   
> 

Thanks,

Dave Kane