[R] hiccup in apply?

Gavin Simpson gavin.simpson at ucl.ac.uk
Fri Jan 19 18:22:44 CET 2007


On Fri, 2007-01-19 at 11:36 -0500, bogdan romocea wrote:
> Hello, I don't understand the behavior of apply() on the data frame below.
> 
> test <-
> structure(list(Date = structure(c(13361, 13361, 13361, 13361,
> 13361, 13361, 13361, 13361, 13362, 13362, 13362, 13362, 13362,
> 13362, 13362, 13362, 13363, 13363, 13363, 13363, 13363, 13363,
> 13363, 13363, 13364, 13364, 13364, 13364, 13364, 13364, 13364,
> 13364, 13365, 13365, 13365, 13365, 13365, 13365, 13365, 13365,
> 13366, 13366, 13366, 13366, 13366, 13366, 13366, 13366, 13367,
> 13367), class = "Date"), RANK = as.integer(c(19, 7, 5, 4, 6,
> 3, 3, 4, 18, 7, 6, 4, 6, 3, 3, 4, 19, 7, 6, 4, 6, 3, 3, 4, 18,
> 6, 7, 4, 6, 3, 3, 4, 18, 6, 7, 4, 6, 3, 3, 4, 18, 6, 7, 4, 6,
> 3, 3, 4, 18, 6))), .Names = c("Date", "RANK"), row.names = c("1",
> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
> "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
> "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
> "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46",
> "47", "48", "49", "50"), class = "data.frame")
> 
> #---fine
> > summary(test)
>       Date                 RANK
>  Min.   :2006-08-01   Min.   : 3.00
>  1st Qu.:2006-08-02   1st Qu.: 4.00
>  Median :2006-08-04   Median : 5.50
>  Mean   :2006-08-03   Mean   : 6.62
>  3rd Qu.:2006-08-05   3rd Qu.: 6.75
>  Max.   :2006-08-07   Max.   :19.00
> 
> #---isn't this supposed to work?
> > apply(test,2,mean)
> Date RANK
>   NA   NA
> Warning messages:
> 1: argument is not numeric or logical: returning NA in:
> mean.default(newX[, i], ...)
> 2: argument is not numeric or logical: returning NA in:
> mean.default(newX[, i], ...)

Look at the Details section of ?apply.

The argument X of apply() is supposed to be an array. Details says:

     If 'X' is not an array but has a dimension attribute, 'apply'
     attempts to coerce it to an array via 'as.matrix' if it is
     two-dimensional (e.g., data frames) or via 'as.array'.

So you should look at what is happening with as.matrix():

str(as.matrix(test))
 chr [1:50, 1:2] "2006-08-01" "2006-08-01" "2006-08-01" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:50] "1" "2" "3" "4" ...
  ..$ : chr [1:2] "Date" "RANK"

Notice that this is now a character matrix, not the mixed Date/integer
object you thought it was. Looking at ?as.matrix, we see:

     'as.matrix' is a generic function. The method for data frames will
     convert any non-numeric/complex column into a character vector
     using 'format' and so return a character matrix, except that
     all-logical data frames will be coerced to a logical matrix.  When
     coercing a vector, it produces a one-column matrix, and promotes
     the names (if any) of the vector to the rownames of the matrix.

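This explains what is happening: the Date column is not numeric, so
as.matrix() converts the whole data frame to a character matrix, and
mean() then returns NA (with a warning) for each character column.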
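
To see the coercion in isolation, here is a minimal sketch. The data
frame 'df' and its values are made up for illustration; it only mirrors
the column types (Date and integer) of 'test' from the original post.

```r
# Hypothetical stand-in for 'test': one Date column, one integer column.
df <- data.frame(
  Date = as.Date(c("2006-08-01", "2006-08-02", "2006-08-03")),
  RANK = c(19L, 7L, 5L)
)

# apply() coerces a data frame with as.matrix(); the Date column
# forces the whole result to character.
m <- as.matrix(df)
is.character(m)

# mean() on character data gives NA (warnings suppressed here).
res <- suppressWarnings(apply(df, 2, mean))
is.na(res)

# With only the numeric column, as.matrix() stays numeric and
# apply() over columns behaves as expected.
num_only <- df[, "RANK", drop = FALSE]
is.numeric(as.matrix(num_only))
rank_mean <- apply(num_only, 2, mean)
rank_mean
```

So apply() itself is fine; the surprise comes from mixing column types
before the data frame is turned into a matrix.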

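Workaround:

lapply(test, mean)
sapply(test, mean)

Both work, because they operate on the columns of the data frame
directly instead of coercing it to a matrix first.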
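
A short sketch of how the two differ, again on a made-up data frame
'df' rather than the original 'test'. The colMeans() step at the end is
just one possible alternative if only the numeric columns matter; it is
not something the original question asked for.

```r
# Hypothetical stand-in for 'test': one Date column, one integer column.
df <- data.frame(
  Date = as.Date(c("2006-08-01", "2006-08-02", "2006-08-03")),
  RANK = c(19L, 7L, 5L)
)

# lapply() returns a list, so each result keeps its own class.
col_means <- lapply(df, mean)
class(col_means$Date)
col_means$RANK

# sapply() simplifies the list to a plain vector, which drops the
# Date class: the Date mean appears as days since 1970-01-01.
s <- sapply(df, mean)
s

# If only numeric columns are of interest, select them first.
is_num <- sapply(df, is.numeric)
num_means <- colMeans(df[, is_num, drop = FALSE])
num_means
```

If you want the mean date printed as a date, lapply() is the safer of
the two.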

HTH,

G

> Thank you,
> b.
> 
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          4.0
> year           2006
> month          10
> day            03
> svn rev        39566
> language       R
> version.string R version 2.4.0 (2006-10-03)
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


