[R] help sample from large dataset - misleading error?

David Winsemius dwinsemius at comcast.net
Sat Nov 14 03:30:40 CET 2009


On Nov 13, 2009, at 7:46 PM, Jorge Ivan Velez wrote:

> Hi Rachel,
>
> Here is a suggestion:
>
> index <- sample(100)
> mysample <- gly[index, ]
> mysample
>

I doubt that is what Rachel was hoping to get: sample(100) returns a
permutation of 1:100, so this only shuffles the first 100 rows of gly
rather than drawing a subsample from all of its rows.

Try:
samp <- gly[ sample(nrow(gly), 100), ]
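
To make the row-sampling idiom concrete, here is a minimal sketch. The data frame below is a made-up stand-in for gly (1000 rows instead of 112371, random contents), and set.seed() is there only so the sketch is reproducible; neither comes from the original post.

```r
# Hypothetical stand-in for gly: 1000 rows, 37 columns of made-up numbers
set.seed(1)                      # only to make the sketch reproducible
gly <- as.data.frame(matrix(rnorm(1000 * 37), ncol = 37))

idx  <- sample(nrow(gly), 100)   # 100 distinct row numbers drawn from 1:nrow(gly)
samp <- gly[idx, ]               # keep those rows, all columns

dim(samp)                        # 100 rows, 37 columns
range(idx)                       # row numbers can come from anywhere in 1:1000
```

Because sample() defaults to replace = FALSE, no row is picked twice.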

-- 
David

> See ?sample for more information.
>
> HTH,
> Jorge
>
>
> On Fri, Nov 13, 2009 at 5:20 PM, Hayes, Rachel M <> wrote:
>
>> Hi All,
>>
>>
>>
>> I want to take a simple random sample from a large dataset, gly,  
>> but I'm
>> getting an error message.  Any help?
>>
>>
>>
>> dim(gly)
>>
>> [1] 112371     37
>>
>>> s1 <- sample(gly,100)
>>
>> Error in `[.data.frame`(x, .Internal(sample(length(x), size,  
>> replace,  :
>>
>>
>> cannot take a sample larger than the population when 'replace =  
>> FALSE'

It is not a misleading error once you consider that the length of a
data.frame, which is what you handed to sample(), is the number of
columns rather than the number of rows. Data frames are lists of
columns, so sample(gly, 100) tried to draw 100 items from only 37.
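
A small sketch of that point, using a hypothetical 5-row, 3-column data frame (df and its column names are made up for illustration):

```r
# Hypothetical small data frame: 5 rows, 3 columns
df <- data.frame(a = 1:5, b = letters[1:5], c = runif(5))

is.list(df)    # a data frame is a list of columns
length(df)     # number of columns (3 here)
nrow(df)       # number of rows (5 here)

# sample() on a data frame therefore draws columns, not rows
two_cols <- sample(df, 2)
names(two_cols)            # two of "a", "b", "c"

# Asking for more items than there are columns triggers the same error
res <- try(sample(df, 4), silent = TRUE)
inherits(res, "try-error")
```

The last call fails for the same reason as sample(gly, 100): the "population" being sampled is the set of columns.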

-- 

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



