[R] sampling random groups with all observations in the group

Chuck Cleland ccleland at optonline.net
Fri Mar 2 22:26:11 CET 2007


Wadud, Zia wrote:
> Hi
> I have a panel dataset with a large number of groups and a differing number
> of observations for each group. I want to randomly select, say, 20% of
> the groups or 200 groups, along with all observations from the
> selected groups (with the corresponding data).
> I guess it is possible to generate a random sample from the group ids
> and then match that with the entire dataset to get the intended
> dataset, but that sounds cumbersome, and possibly there is an easier way to
> do this? I checked the package 'sampling' and the command 'sample', but they
> can't do exactly this.
> I was wondering if someone on this list would be able to share his/her
> knowledge?

  How about something like this?

df <- data.frame(GROUP = rep(1:5, c(2,3,4,2,2)), Y = runif(13))

# Sample Two of the Five Groups

subset(df, GROUP %in% with(df, sample(unique(GROUP), 2)))
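
  The same idea can be stretched to "20% of the groups" or "200 groups". What follows is only a sketch under some assumptions: sample_groups() is a hypothetical helper written for illustration (it is not part of base R or of any package), the group column is passed by name, and the fraction is rounded to a whole number of groups. It indexes into the vector of ids with sample.int() rather than calling sample() on the ids directly, because sample(x, n) treats a single number x as 1:x, which matters if only one numeric group id is present.

```r
# Toy panel: 5 groups with differing numbers of rows
set.seed(42)  # only so the sketch is reproducible
df <- data.frame(GROUP = rep(1:5, c(2, 3, 4, 2, 2)), Y = runif(13))

# Hypothetical helper (an assumption, not an existing function):
# keep every row belonging to a random subset of the groups.
sample_groups <- function(data, group_col, frac = NULL, n = NULL) {
  ids <- unique(data[[group_col]])
  # If no count is given, turn the fraction into a count (at least one group)
  if (is.null(n)) n <- max(1, round(frac * length(ids)))
  # Sample positions, then look up the ids at those positions
  chosen <- ids[sample.int(length(ids), n)]
  data[data[[group_col]] %in% chosen, , drop = FALSE]
}

sub40 <- sample_groups(df, "GROUP", frac = 0.4)  # 40% of the 5 groups
sub3  <- sample_groups(df, "GROUP", n = 3)       # a fixed number of groups
```

  For the original question that would be something like sample_groups(mydata, "id", frac = 0.2) or sample_groups(mydata, "id", n = 200), where mydata and id stand in for the real data frame and group column.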

> Thanks in advance,
> Zia
> **********************************************************
> Zia Wadud
> PhD Student
> Centre for Transport Studies
> Department of Civil and Environmental Engineering
> Imperial College London
> London SW7 2AZ
> Tel +44 (0) 207 594 6055

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


