[R] help sample from large dataset - misleading error?
David Winsemius
dwinsemius at comcast.net
Sat Nov 14 03:30:40 CET 2009
On Nov 13, 2009, at 7:46 PM, Jorge Ivan Velez wrote:
> Hi Rachel,
>
> Here is a suggestion:
>
> index <- sample(100)
> mysample <- gly[index, ]
> mysample
>
I doubt that is what she was hoping to get: sample(100) returns a
permutation of 1:100, so the code above only reorders the first 100
rows of gly rather than drawing a subsample from all of its rows.
Try:
samp <- gly[ sample(nrow(gly), 100), ]
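
A minimal sketch of the difference between the two indexing approaches. It uses a small made-up data frame, toy, as a stand-in for gly, which is not available here; the column names are hypothetical.

```r
# toy is a hypothetical stand-in for gly: 1000 rows, 3 columns
set.seed(1)
toy <- data.frame(id = 1:1000, x = rnorm(1000), y = runif(1000))

# sample(100) is a permutation of 1:100, so only the first
# 100 rows of toy can ever be selected (in shuffled order)
first100 <- toy[sample(100), ]
range(first100$id)

# sample(nrow(toy), 100) draws 100 distinct row numbers
# from 1:nrow(toy), i.e. a simple random sample of rows
samp <- toy[sample(nrow(toy), 100), ]
dim(samp)
```

With the first form, rows beyond the 100th have no chance of being drawn; with the second, every row is equally likely.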
--
David
> See ?sample for more information.
>
> HTH,
> Jorge
>
>
> On Fri, Nov 13, 2009 at 5:20 PM, Hayes, Rachel M <> wrote:
>
>> Hi All,
>>
>>
>>
>> I want to take a simple random sample from a large dataset, gly,
>> but I'm
>> getting an error message. Any help?
>>
>>
>>
>> dim(gly)
>>
>> [1] 112371 37
>>
>>> s1 <- sample(gly,100)
>>
>> Error in `[.data.frame`(x, .Internal(sample(length(x), size,
>> replace, :
>>
>>
>> cannot take a sample larger than the population when 'replace =
>> FALSE'
It is not a misleading error once you consider that the length of a
data.frame, which is what you handed to sample(), is the number of
columns rather than the number of rows. Data.frames are lists, so
sample(gly, 100) tried to draw 100 of the 37 columns without
replacement.
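
The point about length can be checked on a small made-up data frame (toy and its columns are hypothetical, not the original gly). This is only a sketch of the behaviour described above:

```r
# toy is a hypothetical example: 10 rows, 3 columns
toy <- data.frame(a = 1:10, b = letters[1:10], c = runif(10))

length(toy)    # counts columns, because a data.frame is a list
nrow(toy)      # counts rows
is.list(toy)

# sample() on a data.frame therefore samples columns
cols <- sample(toy, 2)
dim(cols)

# asking for more columns than exist fails without replacement;
# tryCatch() captures the error message instead of stopping
res <- tryCatch(sample(toy, 5), error = function(e) conditionMessage(e))
res
```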
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT