[R] Excluding all the columns from a data frame if the standard deviation of that column is zero (0).
R. Michael Weylandt
michael.weylandt at gmail.com
Tue Oct 16 12:24:41 CEST 2012
On Tue, Oct 16, 2012 at 9:08 AM, siddu479 <onlyfordigitalstuff at gmail.com> wrote:
> Hi All,
>
> I have a data frame with nearly 10K columns of data, most of which
> have a standard deviation (over all rows) of zero.
> I want to exclude all of those columns from the data frame and proceed to
> further processing.
>
> I tried the code below.
> data <- read.csv("data.CSV", header=T)
>
> for(i in 2:ncol(data))
> if(sd(data[,i])==0){
> df[,i] <-NULL
> }
>
> where I have the data columns from 2:ncol, but getting the error "Error in
> df[, i] <- NULL : object of type 'closure' is not subsettable"
>
> Can anyone suggest the right method to accomplish this?
>
A perfect example of why "df" is a bad variable name. You never
created a variable called "df" (your data frame is called "data"), so
R finds the function ( = closure, more or less) df, the density
function of the F distribution, instead. Since a function can't be
subsetted, you get the error.
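To see the lookup described above in a fresh session, here is a small
sketch. It assumes no object called "df" has been created by the user;
the variable name "msg" is only for illustration.

```r
# With no user-defined `df`, the name resolves to stats::df,
# the density function of the F distribution.
is.function(df)

# Trying to assign into a "column" of that function fails;
# capture the error message instead of stopping the script.
msg <- tryCatch({
  df[, 1] <- NULL
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

The printed message should be the same "not subsettable" complaint
quoted in the original post.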
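In fact, I think you really just want this one-liner:
!(apply(data, 2, sd) == 0)
which returns a logical vector (TRUE for every column whose standard
deviation is non-zero) that can be used to subset the columns. Note
that apply() converts the data frame to a matrix first, so this
assumes every column is numeric; if your first column is not, apply
it to the numeric columns only, e.g. data[, 2:ncol(data)].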
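As a rough sketch of how such a logical vector can be used, here is a
toy example. The data frame "dat" and its column names are made up for
illustration, and I am assuming (as in the original post) that the
first column is an identifier and columns 2:ncol hold the numeric data.
Dropping columns whose standard deviation is NA is also my assumption,
not something the original post asked for.

```r
# Hypothetical toy data: an id column plus three numeric columns,
# two of which (x and z) are constant.
dat <- data.frame(
  id = c("a", "b", "c"),
  x  = c(1, 1, 1),
  y  = c(1, 2, 3),
  z  = c(5, 5, 5)
)

# Positions of the numeric data columns (everything but the first).
num_cols <- 2:ncol(dat)

# Standard deviation of each data column; vapply() works column by
# column, so the data frame is never converted to a matrix.
sds <- vapply(dat[num_cols], sd, numeric(1), na.rm = TRUE)

# Keep columns whose sd is defined and non-zero.
keep <- !is.na(sds) & sds != 0

# Retain the id column plus the non-constant data columns.
dat_reduced <- dat[, c(1, num_cols[keep]), drop = FALSE]

names(dat_reduced)
```

Using a name like "dat" rather than "data" or "df" also avoids masking
the built-in functions of those names.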
In the same vein as the df problem, "data" is also a bad variable name
(it's also a pre-defined function used for loading, surprise
surprise!, data sets), but R is smart enough to keep them straight in
this simple example. In your real script, however, I'd strongly
suggest you change it.
Cheers,
Michael