[R] sometimes removing NAs from code
Sarah Goslee
sarah.goslee at gmail.com
Wed Oct 26 17:50:06 CEST 2011
Hi,
On Wed, Oct 26, 2011 at 11:25 AM, Schatzi <adele_thompson at cargill.com> wrote:
> Sometimes I have NA values within specific columns of a dataframe (in this
> example, the first two columns can have NAs). If there are NA values, I
> would like them to be removed.
>
> I have been using the code:
>
> y<-c(NA,5,4,2,5,6,NA)
> z<-c(NA,3,4,NA,1,3,7)
> x<-1:7
> adata<-data.frame(y,z,x)
> adata<-adata[-which(apply(adata[,1:2],1,function(x)any(is.na(x)))),]
>
> This works well if there are NA values, but when a dataset doesn't have NA
> values, this code messes up the dataframe. I was trying to pick apart this
> code and could not understand why it didn't work when there were no NA
> values.
Thanks for the example. The problem is caused by the which() call.
If there are NA values, which() returns the row numbers where the NAs are:
> which(apply(adata[,1:2],1,function(x)any(is.na(x))))
[1] 1 4 7
> bdata <- data.frame(1:7, 1:7, 1:7)
> which(apply(bdata[,1:2],1,function(x)any(is.na(x))))
integer(0)
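But if there aren't any, which() returns integer(0), a zero-length
vector, and negating it still gives integer(0). How does R subset on a
zero-length row index?
Unhelpfully: it selects no rows at all, so you get an empty data frame.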
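To see the failure concretely, here is a minimal sketch. The data frame cdata and the names idx and dropped are made up for illustration; they are not from the original post.

```r
# cdata is a hypothetical NA-free data frame, used only for illustration
cdata <- data.frame(y = c(5, 4, 2), z = c(3, 4, 1), x = 1:3)

# No row contains an NA, so which() returns a zero-length integer vector
idx <- which(apply(cdata[, 1:2], 1, function(r) any(is.na(r))))
length(idx)     # 0

# Negating a zero-length vector still gives a zero-length vector,
# so the subset selects no rows instead of dropping none
dropped <- cdata[-idx, ]
nrow(dropped)   # 0
```

The columns survive but every row is gone, which is presumably the "messed up" data frame described in the question.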
Fortunately you don't need the which() at all: the logical vector
returned by your apply statement is entirely sufficient (with added
negation):
> adata[apply(adata[,1:2],1,function(x)!any(is.na(x))), ]
  y z x
2 5 3 2
3 4 4 3
5 5 1 5
6 6 3 6
> bdata[apply(bdata[,1:2],1,function(x)!any(is.na(x))), ]
  X1.7 X1.7.1 X1.7.2
1    1      1      1
2    2      2      2
3    3      3      3
4    4      4      4
5    5      5      5
6    6      6      6
7    7      7      7
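As an aside that is not part of the approach above, base R's complete.cases() returns the same kind of logical vector without the apply() call. A minimal sketch, reusing the adata and bdata objects from this thread (keep_a, clean_a and clean_b are names chosen here for illustration):

```r
# Rebuild the two example data frames from the thread
y <- c(NA, 5, 4, 2, 5, 6, NA)
z <- c(NA, 3, 4, NA, 1, 3, 7)
x <- 1:7
adata <- data.frame(y, z, x)
bdata <- data.frame(1:7, 1:7, 1:7)

# TRUE for rows with no NA in the first two columns
keep_a <- complete.cases(adata[, 1:2])

# Logical indexing keeps every row when nothing is missing
clean_a <- adata[keep_a, ]
clean_b <- bdata[complete.cases(bdata[, 1:2]), ]

nrow(clean_a)   # 4
nrow(clean_b)   # 7
```

Because the index is logical rather than a vector of row numbers, the NA-free case needs no special handling.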
Sarah
>
> If there are no NA values and I run just the part:
> apply(adata[,1:2],1,function(x)any(is.na(x)))
> it results in:
> 2 3 5 6
> FALSE FALSE FALSE FALSE
>
> I was thinking that I can put in an if statement, but I think there has to
> be a better way.
>
> Any ideas/help? Thank you.
>
--
Sarah Goslee
http://www.functionaldiversity.org