[R] column-wise deletion in data-frames

Mon Jul 18 14:50:56 CEST 2005

jhainm at fas.harvard.edu wrote:
> Hi,
> 
> I have a huge data frame and would like to delete all variables that contain
> NAs. The deletion should be done column-wise, not row-wise as na.omit would do
> it, because some variables are NA in every row, so with na.omit I would end up
> with no observations. Is there a convenient way to do this in R?
> 
> To make the question more explicit, imagine a dataset that looks something like
> this (but much bigger):
> 
> X1 <- rnorm(1000)
> X2 <- c(rep(NA,1000))
> X3 <- rnorm(1000)
> X4 <- c(rep(NA,499),1,44,rep(NA,499))
> X5 <- rnorm(1000)
> 
> data <- as.data.frame(cbind(X1,X2,X3,X4,X5))
> 
> So only X1, X3 and X5 are variables without any NAs, and some variables that
> have NAs (X2 and X4) are stacked in between. Now, how can I extract the former
> into a new dataset, or remove the latter, without losing a single row?
> ...

   Someone else will probably suggest something more elegant, but how 
about this:

# Keep only the columns that contain no NA at all (drops X2 and X4)
newdata <- data[, !apply(data, 2, function(x) any(is.na(x))), drop = FALSE]
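
As a sketch of one alternative, colSums(is.na(data)) counts the NAs per column, which makes it easy to tell columns that contain any NA apart from columns that are NA in every row. The 10-row data frame below is a small stand-in for the one in the question, and the names na_count, complete_cols and not_all_na are only illustrative:

```r
# Small stand-in for the data frame in the question (10 rows is an assumption)
set.seed(1)
data <- data.frame(
  X1 = rnorm(10),
  X2 = rep(NA, 10),                        # NA in every row
  X3 = rnorm(10),
  X4 = c(rep(NA, 4), 1, 44, rep(NA, 4)),   # NA in some rows only
  X5 = rnorm(10)
)

# is.na() on a data frame returns a logical matrix, so colSums()
# gives the number of NAs in each column
na_count <- colSums(is.na(data))

# Drop every column that has at least one NA
complete_cols <- data[, na_count == 0, drop = FALSE]

# Drop only the columns that are NA in every row
not_all_na <- data[, na_count < nrow(data), drop = FALSE]

names(complete_cols)  # "X1" "X3" "X5"
names(not_all_na)     # "X1" "X3" "X4" "X5"
```

Logical indexing is used here rather than -which(...) because, if no column matched the condition, -which(...) would reduce to an empty index and select no columns at all. drop = FALSE keeps the result a data frame even when only one column survives.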

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894