[R] Odd behaviour of mean() with a numeric column in a tibble

Chris Evans chrishold at psyctc.org
Tue Dec 6 22:26:29 CET 2016


I hope I am obeying the list rules here. I am using the plain R IDE for this and running R 3.3.2 (2016-10-31) on x86_64-w64-mingw32/x64 (64-bit).

Here is a reproducible example, code only first:

require(tibble)
tmpTibble <- tibble(ID=letters,num=1:26)
min(tmpTibble[,2]) # fine
max(tmpTibble[,2]) # fine
median(tmpTibble[,2])  # not fine
mean(tmpTibble[,2])    # not fine
newMeanFun <- function(x) {mean(as.numeric(unlist(x)))}
newMeanFun(tmpTibble[,2]) # solved problem but surely shouldn't be necessary?!
newMedianFun <- function(x) {median(as.numeric(unlist(x)))}
newMedianFun(tmpTibble[,2]) # ditto
str(tmpTibble[,2])

### then I tried this to make sure it wasn't about having fed in integers

tmpTibble2 <- tibble(ID=letters,num=1:26,num2=(1:26)/10)
tmpTibble2
mean(tmpTibble2[,3]) # not fine, not about integers!


### before I just created tmpTibble2 I found myself trying to add a column to tmpTibble
tmpTibble$newNum <- tmpTibble[,2]/10  # NO!
tmpTibble[["newNum"]] <- tmpTibble[,2]/10 # NO!
### and oddly enough ...
add_column(tmpTibble,newNum = tmpTibble[,2]/10) # NO!

Now here it is with the output:

> require(tibble)
Loading required package: tibble
> tmpTibble <- tibble(ID=letters,num=1:26)
> min(tmpTibble[,2]) # fine
[1] 1
> max(tmpTibble[,2]) # fine
[1] 26
> median(tmpTibble[,2])  # not fine
Error in median.default(tmpTibble[, 2]) : need numeric data
> mean(tmpTibble[,2])    # not fine
[1] NA
Warning message:
In mean.default(tmpTibble[, 2]) :
  argument is not numeric or logical: returning NA
> newMeanFun <- function(x) {mean(as.numeric(unlist(x)))}
> newMeanFun(tmpTibble[,2]) # solved problem but surely shouldn't be necessary?!
[1] 13.5
> newMedianFun <- function(x) {median(as.numeric(unlist(x)))}
> newMedianFun(tmpTibble[,2]) # ditto
[1] 13.5
> str(tmpTibble[,2])
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':       26 obs. of  1 variable:
 $ num: int  1 2 3 4 5 6 7 8 9 10 ...
> 
> ### then I tried this to make sure it wasn't about having fed in integers
> 
> tmpTibble2 <- tibble(ID=letters,num=1:26,num2=(1:26)/10)
> tmpTibble2
# A tibble: 26 × 3
      ID   num  num2
   <chr> <int> <dbl>
1      a     1   0.1
2      b     2   0.2
3      c     3   0.3
4      d     4   0.4
5      e     5   0.5
6      f     6   0.6
7      g     7   0.7
8      h     8   0.8
9      i     9   0.9
10     j    10   1.0
# ... with 16 more rows
> mean(tmpTibble2[,3]) # not fine, not about integers!
[1] NA
Warning message:
In mean.default(tmpTibble2[, 3]) :
  argument is not numeric or logical: returning NA
> 
> 
> ### before I just created tmpTibble2 I found myself trying to add a column to tmpTibble
> tmpTibble$newNum <- tmpTibble[,2]/10  # NO!
> tmpTibble[["newNum"]] <- tmpTibble[,2]/10 # NO!
> ### and oddly enough ...
> add_column(tmpTibble,newNum = tmpTibble[,2]/10) # NO!
Error: Each variable must be a 1d atomic vector or list.
Problem variables: 'newNum'
> 
> 

I discovered this when I hit odd behaviour after using read_spss() from the haven package for the first time, as it seemed to offer a step forward over good old read.spss() from the excellent foreign package.  I am reporting it here rather than directly to Prof. Wickham because the issues seem rather general, though I'm guessing the fix belongs in tibble.  Or perhaps I've completely missed something.
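
A minimal sketch of what may be going on, offered as a reading of tibble's documented subsetting behaviour rather than a confirmed diagnosis: with a tibble, `[` keeps the tibble wrapper even when a single column is selected, whereas `[[` and `$` return the underlying vector. That would explain why mean() and median() complain about non-numeric data, and why the attempts to add a column fail. The names oneCol and numVec are illustrative only.

```r
library(tibble)

tmpTibble <- tibble(ID = letters, num = 1:26)

# `[` with a column index returns a one-column tibble, not a vector
oneCol <- tmpTibble[, 2]
is.data.frame(oneCol)

# `[[` (or `$`) extracts the column as a plain integer vector
numVec <- tmpTibble[[2]]
is.numeric(numVec)

# mean() and median() then receive numeric data
mean(numVec)
median(tmpTibble$num)

# a derived column built from the extracted vector is itself a vector
tmpTibble$newNum <- tmpTibble$num / 10
```

If this reading is right, the as.numeric(unlist(x)) wrappers above work because unlist() flattens the one-column tibble into a vector, which `[[` does more directly.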

TIA,

Chris


