[R] How to do multi-factor stratified sampling in R
David Winsemius
dwinsemius at comcast.net
Sat Mar 8 21:54:17 CET 2008
"Robert A. LaBudde" <ral at lcfltd.com> wrote in
news:0JXF00LSO864ATE0 at vms040.mailsrvcs.net:
> Given a set of data with a number of variables plus a response, I'd
> like to obtain a randomized subset of the rows such that the
> marginal proportions of each variable in the subset closely match
> those of the full dataset, and possibly so that the two-factor
> interaction marginal proportions are maintained as well for some pairs.
>
> This must be a common problem in data mining, but I don't seem to be
> able to locate the proper library or function for doing this in R.
>
> Thanks for any help.
Have you looked at the "sampling" package? I have never used it, but its
strata() function appears to be able to draw stratified samples over one
or more stratification variables.
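
For illustration only, here is a minimal base-R sketch of the underlying idea. It does not use strata() from "sampling". It draws the same fraction of rows from every cell of a two-factor cross-classification, which keeps both the one-factor margins and the two-factor margins of the subset close to those of the full data. The data frame dat, its columns a, b and y, and the fraction frac are all made up for the example and are not from the original question.

```r
# Hypothetical example data: two factors plus a response (all names assumed).
set.seed(1)
n <- 1000
dat <- data.frame(
  a = factor(sample(c("a1", "a2"), n, replace = TRUE, prob = c(0.7, 0.3))),
  b = factor(sample(c("b1", "b2", "b3"), n, replace = TRUE)),
  y = rnorm(n)
)

# Sampling fraction to apply within every a x b cell (assumed value).
frac <- 0.1

# Row indices grouped by cell of the a x b cross-classification.
cells <- split(seq_len(nrow(dat)), interaction(dat$a, dat$b, drop = TRUE))

# Within each cell, draw round(frac * cell size) rows without replacement.
# sample.int() is used so that a cell with a single row is handled correctly.
picked <- unlist(lapply(cells, function(idx) {
  k <- round(frac * length(idx))
  idx[sample.int(length(idx), k)]
}), use.names = FALSE)

sub <- dat[picked, ]

# Compare marginal proportions in the full data and in the subset.
print(round(prop.table(table(dat$a)), 3))
print(round(prop.table(table(sub$a)), 3))
print(round(prop.table(table(dat$a, dat$b)), 3))
print(round(prop.table(table(sub$a, sub$b)), 3))
```

With many factors the full cross-classification gets sparse, and small cells round to zero rows. In that case one would stratify only on the variables or pairs whose proportions matter most.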
--
David Winsemius