[R] strange behaviour of median

Karl Ove Hufthammer karl at huftis.org
Thu Feb 4 12:20:43 CET 2010


On Thu, 4 Feb 2010 12:04:32 +0100 Karl Ove Hufthammer <karl at huftis.org> 
wrote:
> It's not exactly a bug, since 'median' is not documented to work on data 
> frames (use 'sapply' or 'apply' for that),

Note that this is slightly more complicated than it appears at first 
sight. Both 'sapply' and 'apply' work fine on the original example,

mat <- matrix(1:16, 4,4)
df1 <- data.frame(mat)

but not if we change the class of one of the variables/columns:

df1[,2]=as.Date(df1[,2], origin="2000-01-02")

Here 'sapply(df1,median)' returns a numeric vector, not a data frame,

> sapply(df1,median)
     X1      X2      X3      X4 
    2.5 10964.5    10.5    14.5 

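so the second 'median' value is a number, not a date ('sapply' 
simplifies its result to a plain vector, which drops the 'Date' class). 
And 'apply(df1,2,median)' fails with an ugly and confusing error 
message, since 'apply' first converts the data frame to a matrix, which 
here becomes a character matrix.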
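
To make the difference concrete, here is a minimal sketch in base R that 
rebuilds the example and inspects what each approach works with. The 
name 'meds' is only illustrative, and the use of 'lapply' is my 
suggestion for comparison, not something from the original report.

```r
# Rebuild the example data frame with one Date column
mat <- matrix(1:16, 4, 4)
df1 <- data.frame(mat)
df1[, 2] <- as.Date(df1[, 2], origin = "2000-01-02")

# lapply() returns a list, so every median keeps its own class
meds <- lapply(df1, median)
class(meds$X2)                  # "Date"

# sapply() simplifies the same list to a plain numeric vector
class(sapply(df1, median))      # "numeric"

# apply() calls as.matrix() first; with a Date column present the
# data frame becomes a character matrix, which median() cannot handle
is.character(as.matrix(df1))    # TRUE
```

So 'lapply' is a workable alternative when a list of per-column results 
is acceptable.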

Here's my proposed 'median for data frames', which should work fine 
everywhere that 'median' does:

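# 'median' method for data frames: one median per column, returned as a
# one-row data frame so that each column keeps its own class.
median.data.frame <- function(df, ...)
{
  # Use the first row as a template with the right column classes;
  # drop = FALSE keeps a data frame even when 'df' has a single column.
  res <- df[1, , drop = FALSE]
  for (i in seq_along(df))
    res[1, i] <- median(df[[i]], ...)
  res
}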
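
A short usage sketch on the example data, assuming the method is defined 
in the workspace so that 'median' dispatches to it for data frames. The 
function is restated here (lightly tidied) only so that the block runs 
on its own; 'res' is an illustrative name.

```r
# The proposed method, restated so this block is self-contained
median.data.frame <- function(df, ...)
{
  res <- df[1, , drop = FALSE]
  for (i in seq_along(df))
    res[1, i] <- median(df[[i]], ...)
  res
}

# Example data: four columns, the second converted to Date
mat <- matrix(1:16, 4, 4)
df1 <- data.frame(mat)
df1[, 2] <- as.Date(df1[, 2], origin = "2000-01-02")

# median() now dispatches on the data frame and returns a one-row
# data frame; the second column is still a Date
res <- median(df1)
is.data.frame(res)   # TRUE
class(res$X2)        # "Date"
```

Extra arguments such as 'na.rm = TRUE' are passed through to the 
per-column 'median' calls via '...'.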

-- 
Karl Ove Hufthammer


