[R] Unexpected behavior of "apply" when FUN=sample

(Ted Harding) Ted.Harding at wlandres.net
Tue May 14 12:07:50 CEST 2013


On 14-May-2013 09:46:32 Duncan Murdoch wrote:
> On 13-05-14 4:52 AM, Luca Nanetti wrote:
>> Dear experts,
>>
>> I wanted to signal a peculiar, unexpected behaviour of 'apply'.
>> It is not a bug, it is per spec, but it is so counterintuitive
>> that I thought it could be interesting.
>>
>> I have an array, let's say "test", dim=c(7,5).
>>
>>> test <- array(1:35, dim=c(7, 5))
>>> test
>>
>>       [,1] [,2] [,3] [,4] [,5]
>> [1,]    1    8   15   22   29
>> [2,]    2    9   16   23   30
>> [3,]    3   10   17   24   31
>> [4,]    4   11   18   25   32
>> [5,]    5   12   19   26   33
>> [6,]    6   13   20   27   34
>> [7,]    7   14   21   28   35
>>
>> I want a new array where the content of each row (or each column) is
>> permuted, differently per row (per column).
>>
>> Let's start with the columns, i.e. the second MARGIN of the array:
>>> test.m2 <- apply(test, 2, sample)
>>> test.m2
>>
>>       [,1] [,2] [,3] [,4] [,5]
>> [1,]    1   10   18   23   32
>> [2,]    7    9   16   25   30
>> [3,]    6   14   17   22   33
>> [4,]    4   11   15   24   34
>> [5,]    2   12   21   28   31
>> [6,]    5    8   20   26   29
>> [7,]    3   13   19   27   35
>>
>> Perfect. That was exactly what I wanted: the content of each column is
>> shuffled, and differently for each column.
>> However, if I do the same with the rows (MARGIN = 1), the output is
>> transposed!
>>
>>> test.m1 <- apply(test, 1, sample)
>>> test.m1
>>
>>       [,1] [,2] [,3] [,4] [,5] [,6] [,7]
>> [1,]    1    2    3    4    5   13   21
>> [2,]   22   30   17   18   19   20   35
>> [3,]   15   23   24   32   26   27   14
>> [4,]   29   16   31   25   33   34   28
>> [5,]    8    9   10   11   12    6    7
>>
>> In other words, I wanted to permute the content of the rows of "test", and
>> I expected to see in the output, well, the shuffled rows as rows, not as
>> columns!
>>
>> I would respectfully suggest making this behavior more explicit in the
>> documentation.
> 
> It is already very explicit:  "If each call to FUN returns a vector of 
> length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) 
> if n > 1."  In your first case, sample is applied to columns, and 
> returns length 7 results, so the shape of the final result is c(7, 5). 
> In the second case it is applied to rows, and returns length 5 results, 
> so the shape is c(5, 7).
> 
> Duncan Murdoch
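
The rule Duncan quotes can be checked directly on the example array. This
is only a small sketch; the names by_col and by_row are mine, not from the
earlier messages.

```r
test <- array(1:35, dim = c(7, 5))

# MARGIN = 2: sample() receives each column (length 7); there are 5 columns,
# so the result has dimension c(7, 5).
by_col <- apply(test, 2, sample)
dim(by_col)   # 7 5

# MARGIN = 1: sample() receives each row (length 5); there are 7 rows,
# so each permuted row comes back as a COLUMN of a c(5, 7) result.
by_row <- apply(test, 1, sample)
dim(by_row)   # 5 7
```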

And the (quite simple) practical implication of what Duncan points out is:

  test <- array(1:35, dim=c(7, 5))
  test
  #      [,1] [,2] [,3] [,4] [,5]
  # [1,]    1    8   15   22   29
  # [2,]    2    9   16   23   30
  # [3,]    3   10   17   24   31
  # [4,]    4   11   18   25   32
  # [5,]    5   12   19   26   33
  # [6,]    6   13   20   27   34
  # [7,]    7   14   21   28   35

  # To permute the contents of each row (rows stay in place):
  t(apply(t(test), 2, sample))
  #      [,1] [,2] [,3] [,4] [,5]
  # [1,]   22   29    8   15    1
  # [2,]   30   16   23    2    9
  # [3,]   10   31   24    3   17
  # [4,]   11    4   25   32   18
  # [5,]   26    5   12   33   19
  # [6,]   27   34   20   13    6
  # [7,]   35   28   14    7   21

which looks right!
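
Since the columns of t(test) are just the rows of test, the same thing can
be written a little more briefly by applying over MARGIN = 1 and transposing
the result. The sketch below also checks that each output row holds exactly
the values of the corresponding input row. The seed and the name perm are
arbitrary choices of mine, used only for illustration.

```r
set.seed(1)  # arbitrary seed, only so that a run can be repeated
test <- array(1:35, dim = c(7, 5))

# apply() returns each permuted row as a column, so transpose back to 7 x 5.
perm <- t(apply(test, 1, sample))
dim(perm)   # 7 5

# Sorting each row of perm and of test should give identical values
# if every row was only shuffled within itself.
all(apply(perm, 1, sort) == apply(test, 1, sort))   # TRUE
```

The outer t() is the only step needed to undo the "transposed" shape that
apply() gives for MARGIN = 1.
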
Ted.

-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 14-May-2013  Time: 11:07:46
This message was sent by XFMail
