[R] Randomly remove condition-selected rows from a matrix
Wacek Kusnierczyk
Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Fri Jan 2 20:18:33 CET 2009
xxx wrote:
> On Fri, Jan 2, 2009 at 10:07 AM, Wacek Kusnierczyk
> <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>
>> ... 'sample' takes a sample of the specified size from the elements of
>> 'x' using either with or without replacement.
>>
>> x: Either a (numeric, complex, character or logical) vector of
>> more than one element from which to choose, or a positive
>> integer.
>>
>> If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
>> 'x >= 1', sampling takes place from '1:x'. _Note_ that this
>> convenience feature may lead to undesired behaviour when 'x' is of
>> varying length 'sample(x)'. See the 'resample()' example below.
>> ...
>> yet the following works, even though x has length 1 and is *not* numeric:...
>> is this a bug in the code, or a bug in the documentation?
>>
>
> I would guess it's a bug in the documentation.
>
>
possibly. looking at the r code for sample, it's clear why
sample("foo") works:
    function (x, size, replace = FALSE, prob = NULL)
    {
        if (length(x) == 1 && is.numeric(x) && x >= 1) {
            if (missing(size))
                size <- x
            .Internal(sample(x, size, replace, prob))
        }
        else {
            if (missing(size))
                size <- length(x)
            x[.Internal(sample(length(x), size, replace, prob))]
        }
    }
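
a minimal sketch of the branch selection, for illustration only: the variable name takes_numeric_branch is made up here, and the condition is copied from the source of sample shown above. because && short-circuits, x >= 1 is never evaluated for a character x, so sample("foo") ends up in the else branch and simply indexes x.

```r
x <- "foo"

# same test as in sample(); is.numeric("foo") is FALSE, so the
# comparison x >= 1 is never reached
takes_numeric_branch <- length(x) == 1 && is.numeric(x) && x >= 1
takes_numeric_branch    # FALSE

# the else branch permutes the indices of a length-1 vector,
# so the only possible result is the element itself
sample("foo")           # "foo"
```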
what is also clear from the code is that the function has another,
arguably buggy behaviour, due to the 'smart' behaviour of the : operator:
sample(1.1)
# 1, not 1.1
this is consistent with
"
If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
'x >= 1', sampling takes place from '1:x'.
"
due to the downcast performed by the colon operator, but not with
"
x: Either a (numeric, complex, character or logical) vector of
more than one element from which to choose, or a positive
integer.
"
both from ?sample. tfm is seemingly wrong wrt. the implementation: it
promises "a positive integer", while the code accepts any length-1 numeric
x with x >= 1. i find sample(1.1) returning 1 a design flaw. (i guess the
note "_Note_ that this convenience feature may lead to undesired behaviour
when 'x' is of varying length 'sample(x)'." is supposed to explain away
such cases.)
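
for what it's worth, a sketch of a workaround in the spirit of the resample() example that ?sample refers to. the helper below is an assumption, not the code from the help page, and it relies on sample.int being available in your version of r. it always samples from the elements of x, so a length-1 numeric is never expanded to 1:x.

```r
# the colon operator drops the fractional part of the upper bound
r <- 1:1.1
length(r)        # 1
r[1] == 1        # TRUE

# hypothetical helper: index x by a random permutation of its positions
resample <- function(x, ...) x[sample.int(length(x), ...)]

resample(1.1)    # 1.1
resample("foo")  # "foo"
resample(10)     # 10

# for comparison, sample() treats a single number n as 1:n
length(sample(10))   # 10
```

the price is that the shorthand sample(n) for "a permutation of 1:n" is lost; with the helper one has to write resample(1:n) explicitly.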
vQ