[R] Randomly remove condition-selected rows from a matrix

Stavros Macrakis macrakis at alum.mit.edu
Fri Jan 2 20:54:38 CET 2009


There is another undocumented glitch in sample:

     sample(2^31-1, 1) => OK
     sample(2^31, 1)   => Error

I suppose you could interpret "sampling takes place from '1:x' " to
mean that 1:x is actually generated, but that doesn't work as an
explanation either; on my 32-bit Windows box, 1:(2^29) gives an error,
but sample(2^29,1) works fine.
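
The failing value sits exactly one past the largest number R's integer
type can hold. Whether that is really the reason sample() refuses it is
an assumption here, not something confirmed from the C source. A minimal
sketch of the boundary, using a made-up helper name (fits_in_integer is
hypothetical, not part of R):

```r
# Largest value representable by R's 32-bit integer type.
.Machine$integer.max            # 2147483647, i.e. 2^31 - 1

# Hypothetical helper: TRUE if a number survives conversion to integer.
fits_in_integer <- function(n) {
  # as.integer() yields NA (with a warning) when n is out of integer range
  !is.na(suppressWarnings(as.integer(n)))
}

fits_in_integer(2^31 - 1)   # TRUE
fits_in_integer(2^31)       # FALSE
```

If the assumption holds, the limit belongs in the documentation of 'x'
next to the "positive integer" wording.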

          -s

On Fri, Jan 2, 2009 at 2:18 PM, Wacek Kusnierczyk
<Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
> xxx wrote:
>> On Fri, Jan 2, 2009 at 10:07 AM, Wacek Kusnierczyk
>> <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>>
>>> ...     'sample' takes a sample of the specified size from the elements of
>>>     'x' using either with or without replacement.
>>>
>>>       x: Either a (numeric, complex, character or logical) vector of
>>>          more than one element from which to choose, or a positive
>>>          integer.
>>>
>>>    If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
>>>     'x >= 1', sampling takes place from '1:x'.  _Note_ that this
>>>     convenience feature may lead to undesired behaviour when 'x' is of
>>>     varying length 'sample(x)'.  See the 'resample()' example below.
>>> ...
>>> yet the following works, even though x has length 1 and is *not* numeric:...
>>> is this a bug in the code, or a bug in the documentation?
>>>
>>
>> I would guess it's a bug in the documentation.
>>
>>
>
> possibly.  looking at the r code for sample, it's clear why
> sample("foo") works:
>
> function (x, size, replace = FALSE, prob = NULL)
> {
>    if (length(x) == 1 && is.numeric(x) && x >= 1) {
>        if (missing(size))
>            size <- x
>        .Internal(sample(x, size, replace, prob))
>    }
>    else {
>        if (missing(size))
>            size <- length(x)
>        x[.Internal(sample(length(x), size, replace, prob))]
>    }
> }
>
> what is also clear from the code is that the function has another,
> supposedly buggy behaviour due to the smart behaviour of the : operator:
>
> sample(1.1)
> # 1, not 1.1
>
> this is consistent with
>
> "
>     If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
>     'x >= 1', sampling takes place from '1:x'.
> "
>
> due to the downcast performed by the colon operator, but not with
>
> "
>       x: Either a (numeric, complex, character or logical) vector of
>          more than one element from which to choose, or a positive
>          integer.
> "
>
> both from ?sample.  tfm is seemingly wrong wrt. the implementation, and
> i find sample(1.1) returning 1 a design flaw.  (i guess the note "_Note_
> that this convenience feature may lead to undesired behaviour when 'x'
> is of varying length 'sample(x)'." is supposed to explain away such cases.)
>
> vQ
>
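The length-1 cases discussed in the quoted message can be sidestepped by
always sampling indices and subsetting, which is roughly what the
'resample()' example mentioned in ?sample does. A minimal sketch, assuming
a version of R that provides sample.int(); the name safe_sample is
hypothetical and not part of base R:

```r
# Hypothetical wrapper: always treat x as the population itself,
# never as shorthand for 1:x.
safe_sample <- function(x, size = length(x), replace = FALSE) {
  # Sample positions 1..length(x), then index into x.
  x[sample.int(length(x), size = size, replace = replace)]
}

safe_sample(1.1)     # 1.1, whereas sample(1.1) gives 1
safe_sample("foo")   # "foo"
safe_sample(10)      # 10, whereas sample(10) permutes 1:10
```

The wrapper gives up the 'sample(n)' shorthand in exchange for behaviour
that does not change when 'x' happens to have length 1.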



