[R] Sampling problems
David Winsemius
dwinsemius at comcast.net
Wed Mar 7 21:24:52 CET 2012
On Mar 7, 2012, at 11:41 AM, Oritteropus wrote:
> Hi,
> I need to sample randomly my dataset for 1000 times. The sample need to be
> the 80%. I know how to do that, my problem is that not only I need the 80%,
> but I also need the corresponding 20% each time. Is there any way to do
> that?
> Alternatively, I was thinking to something like setdiff () function to
> compare my 80% sample to the original dataset and obtain the corresponding
> 20%, unfortunately setdiff works just for vectors, do you know a similar
> function for dataframes?
Create an index vector with runif or sample, then use it to get
your sample and use negative indexing to get the remainder.
idx <- sample(1:1000, 800)
x[ idx, ] # 80%
x[ -idx, ] # the other 20%
(Numeric indices select rows by position, so this works whatever the
rownames are, but it does presume x has exactly 1000 rows.)
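For the 1000 repetitions asked about, the same idea can be wrapped in a small function and repeated. This is only a sketch: the data frame `x` below is made-up example data, and `split_once`, `n_keep` and `splits` are hypothetical names, not anything from the original question.

```r
# Hypothetical example data standing in for the poster's data frame.
set.seed(42)                        # only to make the sketch reproducible
x <- data.frame(id = 1:50, y = rnorm(50))

n_keep <- floor(0.8 * nrow(x))      # size of the 80% sample

# One split: sample row positions, then use negative indexing for the rest.
# (Assumes size > 0; df[-integer(0), ] would select no rows at all.)
split_once <- function(df, size) {
  idx <- sample(nrow(df), size)
  list(sample = df[idx, , drop = FALSE],
       rest   = df[-idx, , drop = FALSE])
}

# Repeat 1000 times; each list element holds one 80%/20% pair.
splits <- replicate(1000, split_once(x, n_keep), simplify = FALSE)
```

Here splits[[i]]$sample is the i-th 80% sample and splits[[i]]$rest is its matching 20%. Using nrow(x) instead of a fixed 1:1000 avoids assuming a particular number of rows.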
--
David Winsemius, MD
West Hartford, CT