[R] Removing constants from a data frame
Martin Maechler
maechler at stat.math.ethz.ch
Mon Sep 20 13:19:41 CEST 2004
>>>>> "AndyL" == Liaw, Andy <andy_liaw at merck.com>
>>>>> on Mon, 20 Sep 2004 06:22:33 -0400 writes:
>> From: Kjetil Brinchmann Halvorsen
>>
>> David Forrest wrote:
>>
>> >Suppose I have
>> >
>> >x<-data.frame(v1=1:4, v2=c(2,4,NA,7), v3=rep(1,4),
>> > v4=LETTERS[1:4],v5=rep('Z',4))
>> >
>> >or a much larger frame, and I wish to test for and remove the constant
>> >numeric columns.
>> >
>> >I made:
>> >
>> > is.constant<-function(x){identical(min(x),max(x))}
>> >
>> >and
>> > apply(x,2,is.constant) # Works for numerics
>> > x[,-which(apply(x,2,is.constant))]
>> >
>> >I'd really like to be able to delete the constant columns without losing
>> >my non-numerics. Ignoring the character columns would be OK.
>> >
>> >Any suggestions?
>> >
>> >Dave
>> >
>> >
>> what about defining is.constant as
>> is.constant <- function(x) {
>> if (is.numeric(x)) identical(min(x), max(x)) else FALSE }
AndyL> identical() is probably not the safest thing to use:
>> x <- c(1, 2, NA)
>> is.constant(x)
AndyL> [1] TRUE
AndyL> For data such as c(1, 1, 1, NA), I should think the
AndyL> safest answer should be NA, because one really
AndyL> doesn't know whether that last number is 1 or not.
Yes.
Also note that is.numeric() is not what you want for data.frame
columns, since it is FALSE for factors, and you may want to
remove constant factor columns as well.
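
As a rough illustration of the two points above, here is a minimal sketch
(not code from this thread) of an is.constant() that returns NA when missing
values leave the answer undecided and that also handles factors. Two things
are assumptions of the sketch: a column with two distinct observed values is
reported as FALSE even if it also contains NAs, and columns whose status is
NA are kept rather than dropped.

```r
# Sketch only: returns TRUE, FALSE, or NA for a single column.
is.constant <- function(x) {
  if (is.factor(x)) x <- as.character(x)   # compare factor values as strings
  obs <- x[!is.na(x)]                      # observed (non-missing) values
  if (length(unique(obs)) > 1L) return(FALSE)  # two distinct values: not constant
  if (anyNA(x)) return(NA)                 # a missing value might differ
  TRUE
}

x <- data.frame(v1 = 1:4, v2 = c(2, 4, NA, 7), v3 = rep(1, 4),
                v4 = LETTERS[1:4], v5 = rep("Z", 4))

# vapply() visits the columns one at a time; apply() would first coerce
# the whole data frame to a character matrix.
const <- vapply(x, is.constant, logical(1))

# Drop only the columns known to be constant; NA (undecided) columns stay.
x[, !(const %in% TRUE), drop = FALSE]
```

Logical indexing is used here instead of x[, -which(...)] because the latter
selects no columns at all when which() finds nothing to remove.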
Martin Maechler