[R] Remove columns from dataframe based on their statistics
J Toll
jctoll at gmail.com
Thu May 31 15:52:29 CEST 2012
On Thu, May 31, 2012 at 8:27 AM, Johannes Radinger <JRadinger at gmx.at> wrote:
> Hi,
>
> I have a dataframe and want to remove the columns that
> contain the same value in every row (i.e. the variation
> of that column is 0). Is there an easier way than
> calculating the statistics and then removing the columns
> by hand?
>
> A <- runif(100)
> B <- rep(1,100)
> C <- rep(2.42,100)
> D <- runif(100)
> df <- data.frame(A,B,C,D)
>
> # I want to conditionally remove columns B and C, as they show no variation
You could try something like:
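# go from the last column to the first, so that deleting a column
# does not shift the indices that are still to be visited
for (i in seq(ncol(df), 1)) {
  if (length(unique(df[, i])) == 1) {
    df[, i] <- NULL
  }
}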
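
The same test can also be written without an explicit loop. This is only a sketch, assuming every column is an ordinary atomic vector; is_constant and df_reduced are names made up for illustration, and the sample data from the question are rebuilt here with a fixed seed:

```r
set.seed(1)  # fixed seed so the example is reproducible
A <- runif(100)
B <- rep(1, 100)
C <- rep(2.42, 100)
D <- runif(100)
df <- data.frame(A, B, C, D)

# TRUE for each column that holds exactly one distinct value
is_constant <- vapply(df, function(x) length(unique(x)) == 1, logical(1))

# keep only the columns that vary; drop = FALSE keeps the result a data frame
df_reduced <- df[, !is_constant, drop = FALSE]
names(df_reduced)
```

Note that unique() counts NA as a value of its own, so a column that is constant apart from some NAs would be kept by this test.
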
Or, for numeric columns only:
# a column is constant if every value equals the column mean
for (i in seq(ncol(df), 1)) {
  if (all(mean(df[, i]) == df[, i])) {
    df[, i] <- NULL
  }
}
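
One caveat with the numeric version: the comparison with == is exact, so a column whose values differ only by floating-point rounding is kept, and a column containing NA makes the condition NA, which stops if() with an error. A possible variant, as a sketch only, assuming all columns are numeric; the tolerance tol is an arbitrary choice, and near_constant and df_reduced are names made up for illustration:

```r
df <- data.frame(A = c(0.1, 0.5, 0.9),
                 B = c(0.3, 0.1 + 0.2, 0.3),  # equal only up to rounding
                 C = c(2.42, NA, 2.42))       # constant apart from an NA

tol <- 1e-8  # assumed tolerance; adjust to the scale of the data

near_constant <- vapply(df, function(x) {
  x <- x[!is.na(x)]  # ignore missing values
  # an all-NA column is treated as constant here
  length(x) == 0 || diff(range(x)) < tol
}, logical(1))

df_reduced <- df[, !near_constant, drop = FALSE]
names(df_reduced)
```

Whether NAs should be ignored, and whether an all-NA column counts as constant, depends on the data; both choices above are assumptions.
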
HTH,
James