[R] drop columns whose rows are all 0
Rolf Turner
rolf.turner at xtra.co.nz
Tue Jan 24 22:01:13 CET 2012
On 25/01/12 05:14, Francisco wrote:
> Hello,
> I have a dataset with 40 variables, some of them are always 0 (each
> row). I would like to make a subset containing only the columns which
> values are not all 0, but I don't know how to do it.
>
> I tried:
>
> for(cut_column in 1:40) {
>
> if(sum(dataset[,cut_column])!=0) {
> columns_useful<-c(columns_useful,dataset[cut_column])
>
> }
> }
>
> sorted_dataset<-subset(dataset, select=columns_useful)
>
> But it doesn't work.
Try:
good_dataset <- dataset[,sapply(dataset,function(x){!all(x==0)})]
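This works, modulo possible gotchas induced by floating point arithmetic:
a column whose entries were computed, and are zero only up to rounding
error, fails the exact test x==0 and so is kept.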
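To make that gotcha concrete, here is a small sketch. The data frame "d" and its column names are invented for illustration. Column "b" is zero in exact arithmetic, but its first entry is computed as 0.1 + 0.2 - 0.3, which comes out as a tiny nonzero number in double precision, so the exact test keeps it.

```r
# Hypothetical toy data: 'b' ought to be all zero, 'z' really is.
d <- data.frame(a = c(1, 2, 3),
                b = c(0.1 + 0.2 - 0.3, 0, 0),
                z = c(0, 0, 0))

# TRUE marks the columns the exact test would keep.
keep <- sapply(d, function(x) !all(x == 0))
print(keep)

# drop = FALSE is an addition here: it keeps the result a data frame
# even if only one column survives.
good <- d[, keep, drop = FALSE]
print(names(good))
```

Only "z" is removed; "b" survives because of the rounding error in its first entry.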
Another possibility:
tol <- sqrt(.Machine$double.eps)
good_dataset <-
    dataset[,sapply(dataset,function(x){!all(abs(x)<=tol)})]
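A quick sketch of the tolerance version in action, again on invented data (the names "d", "keep_tol" and "good_tol" are not from the original question). Here the column that is zero only up to rounding error is dropped along with the exactly-zero one.

```r
# Hypothetical toy data: 'b' is zero up to rounding error, 'z' exactly zero.
d <- data.frame(a = c(1, 2, 3),
                b = c(0.1 + 0.2 - 0.3, 0, 0),
                z = c(0, 0, 0))

tol <- sqrt(.Machine$double.eps)   # roughly 1.5e-8

# Keep a column unless every entry is within 'tol' of zero.
keep_tol <- sapply(d, function(x) !all(abs(x) <= tol))

# drop = FALSE (an addition) stops a single surviving column
# from collapsing to a plain vector.
good_tol <- d[, keep_tol, drop = FALSE]
print(names(good_tol))
```

Whether sqrt(.Machine$double.eps) is a sensible threshold depends on the scale of the data. If columns may contain NA, all() can return NA rather than TRUE or FALSE; passing na.rm = TRUE to all() is one way to deal with that, depending on what should count as "all zero".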
Or:
good_dataset <-
    dataset[,sapply(dataset,function(x){!isTRUE(all.equal(x,rep(0,length(x))))})]
The all.equal() version could trip up if some columns of "dataset" have
extra attributes tagging along. E.g. the column could actually be a
numeric matrix of zeroes, in which case it wouldn't get dropped.
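A small sketch of the matrix-column caveat, on invented data (the helper names "is_zero_eq" and "is_zero_tol" are made up for this illustration). The all.equal() test also compares attributes, and a matrix carries a "dim" attribute that the plain vector of zeroes lacks, so the comparison does not return TRUE and the column is kept. The tolerance test looks only at the values.

```r
# Hypothetical toy data with a matrix-valued column of zeroes.
d <- data.frame(a = c(1, 2, 3))
d$m <- matrix(0, nrow = 3, ncol = 2)   # a single column of 'd' holding a 3 x 2 matrix

# all.equal()-based test: sensitive to attributes such as 'dim'.
is_zero_eq <- function(x) isTRUE(all.equal(x, rep(0, length(x))))

# Tolerance-based test: looks at the values only.
tol <- sqrt(.Machine$double.eps)
is_zero_tol <- function(x) all(abs(x) <= tol)

keep_eq  <- sapply(d, function(x) !is_zero_eq(x))
keep_tol <- sapply(d, function(x) !is_zero_tol(x))

print(keep_eq)    # 'm' is kept by the all.equal() version
print(keep_tol)   # 'm' is dropped by the tolerance version
```

Wrapping x in as.vector() before calling all.equal() would be one way to strip the "dim" attribute, if dropping such columns is what is wanted.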
cheers,
Rolf Turner