[R] : how to select rows at random

Gavin Simpson gavin.simpson at ucl.ac.uk
Fri Mar 27 20:21:13 CET 2009


On Fri, 2009-03-27 at 15:11 -0400, Laura Rodriguez Murillo wrote:
> Hi dear list,
> 
> I have a list of around 2000 identifiers arranged in a dataframe in one
> column and I would like to choose a random subset of these. I wonder
> if somebody can tell me if I could do this with R...

Not sure what you mean by identifiers, but to select a subset of the
2000 cells in that column, you could use sample(). See ?sample for
details, but here is an example.

## choose a random subset of 500 out of 2000 entries
## dummy data
dat <- data.frame(identifiers = sample(2000, 2000), X = rnorm(2000))
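## set seed so the draw from sample() below is reproducible
## (the dummy data above were generated before the seed was set, so the
## identifiers you see printed will differ from mine)
## comment this out if you want a different subset each time you run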
set.seed(1234)
## random subset of 500
want <- sample(2000, 500)
## select out that subset
## head to show only first n of the selected
head(dat$identifiers[want])

Gives:

> head(dat$identifiers[want])
[1] 1327  587  835  430 1422 1687
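
If you want whole rows rather than just the identifier column, the same
index vector can be used on the data frame itself. A minimal sketch,
assuming the dummy data from above (the name `sub` is only for
illustration, and the seed is set before the data are created so the whole
thing is repeatable within one R version):

```r
## dummy data, seed set first so the data themselves are reproducible
set.seed(1234)
dat <- data.frame(identifiers = sample(2000, 2000), X = rnorm(2000))
## draw 500 row indices without replacement (the default for sample())
## nrow(dat) avoids hard-coding the number of rows
want <- sample(nrow(dat), 500)
## index the rows, keeping all columns
sub <- dat[want, ]
## 500 rows, 2 columns
dim(sub)
```

Using nrow(dat) means the code keeps working if the number of identifiers
changes.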

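This assumes the identifiers are unique. sample() picks row positions, so
if an identifier occurs in more than one row, the subset can contain it
more than once.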
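
If you are not sure the identifiers are unique, it is easy to check, and
you can sample from the distinct values instead of from the rows. A rough
sketch with made-up data in which every identifier appears twice (`dat2`,
`ids`, `want_ids` and `sub2` are hypothetical names, not anything from your
data):

```r
## made-up data: 1000 distinct identifiers, each appearing in two rows
set.seed(1234)
dat2 <- data.frame(identifiers = rep(1:1000, each = 2), X = rnorm(2000))
## anyDuplicated() returns 0 when there are no repeats
anyDuplicated(dat2$identifiers) > 0
## sample 500 of the distinct identifiers rather than 500 rows
ids <- unique(dat2$identifiers)
want_ids <- sample(ids, 500)
## keep every row belonging to a chosen identifier
sub2 <- dat2[dat2$identifiers %in% want_ids, ]
nrow(sub2)
```

Here each chosen identifier brings both of its rows along, so sub2 has 1000
rows even though only 500 identifiers were drawn.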

HTH

G

> 
> Thank you so much!
> 
> Laura RM
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



