[R] sampling random groups with all observations in the group

Chuck Cleland ccleland at optonline.net
Fri Mar 2 22:46:35 CET 2007


Chuck Cleland wrote:
> Wadud, Zia wrote:
>> Hi
>> I have a panel dataset with a large number of groups and a differing number
>> of observations for each group. I want to randomly select, say, 20% of
>> the groups or 200 groups, along with all observations from the
>> selected groups (with the corresponding data).
>> I guess it is possible to generate a random sample from the group ids
>> and then match that with the entire dataset to get the intended
>> dataset, but that sounds cumbersome; is there possibly an easier way to
>> do this? I checked the package 'sampling' and the command 'sample', but they
>> can't do exactly this.
>> I was wondering if someone on this list would be able to share his/her
>> knowledge?
> 
>   How about something like this?
> 
> df <- data.frame(GROUP = rep(1:5, c(2,3,4,2,2)), Y = runif(13))
> 
> # Sample Two of the Five Groups
> 
> subset(df, GROUP %in% with(df, sample(unique(GROUP), 2)))

  The with() part can be dropped too, because subset() already evaluates
its condition inside df.

subset(df, GROUP %in% sample(unique(GROUP), 2))
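
  For the "20% of the groups" case in the original question, something
along these lines might work. It is only a sketch: the seed, the 20%
fraction rounded up, and the names ids, n_groups, picked and df_sub are
assumed for illustration. It indexes into the vector of ids instead of
calling sample() on it directly, because sample(x, n) treats a single
number x as 1:x, which could bite if only one group id were present.

```r
set.seed(42)  # assumed seed, only so the draw can be repeated

df <- data.frame(GROUP = rep(1:5, c(2, 3, 4, 2, 2)), Y = runif(13))

ids <- unique(df$GROUP)                   # one entry per group
n_groups <- ceiling(0.2 * length(ids))    # 20% of the groups, rounded up

# Draw positions, then look up the ids at those positions
picked <- ids[sample.int(length(ids), n_groups)]

# Keep every row belonging to a picked group
df_sub <- df[df$GROUP %in% picked, ]
```

  With the real data, GROUP would be replaced by the actual id column,
and n_groups could simply be set to 200 instead.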

>> Thanks in advance,
>> Zia
>> **********************************************************
>> Zia Wadud
>> PhD Student
>> Centre for Transport Studies
>> Department of Civil and Environmental Engineering
>> Imperial College London
>> London SW7 2AZ
>> Tel +44 (0) 207 594 6055

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


