[R] sampling random groups with all observations in the group
Chuck Cleland
ccleland at optonline.net
Fri Mar 2 22:46:35 CET 2007
Chuck Cleland wrote:
> Wadud, Zia wrote:
>> Hi
>> I have a panel dataset with a large number of groups and a differing number
>> of observations for each group. I want to randomly select, say, 20% of
>> the groups or 200 groups, along with all observations from the
>> selected groups (with the corresponding data).
>> I guess it is possible to generate a random sample of the group ids
>> and then match that against the entire dataset to get the intended
>> dataset, but that sounds cumbersome. Is there an easier way to
>> do this? I checked the package 'sampling' and the command 'sample', but they
>> can't do exactly this.
>> I was wondering if someone on this list would be able to share their
>> knowledge?
>
> How about something like this?
>
> df <- data.frame(GROUP = rep(1:5, c(2,3,4,2,2)), Y = runif(13))
>
> # Sample Two of the Five Groups
>
> subset(df, GROUP %in% with(df, sample(unique(GROUP), 2)))
The with() part can be dropped too, because subset() already evaluates
its condition inside df:
subset(df, GROUP %in% sample(unique(GROUP), 2))
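
To sample a fraction of the groups rather than a fixed count, something
along these lines might work. This is only a sketch: the 20% fraction
comes from the original question, while the seed and the names ids,
n_groups, keep and sampled are made up for illustration.

```r
set.seed(42)  # assumption: fixed seed so the draw can be repeated

# Toy panel data with the same layout as the example above
df <- data.frame(GROUP = rep(1:5, c(2, 3, 4, 2, 2)), Y = runif(13))

# Distinct group ids, and how many of them make up 20% (at least one)
ids <- unique(df$GROUP)
n_groups <- max(1, round(0.20 * length(ids)))

# Sample positions rather than values: sample(x, ...) treats a single
# number x as 1:x, which would go wrong if only one group id existed
keep <- ids[sample.int(length(ids), n_groups)]

# Keep every row that belongs to one of the sampled groups
sampled <- df[df$GROUP %in% keep, ]
```

For a fixed count such as 200 groups, n_groups could simply be set to
min(200, length(ids)) instead.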
>> Thanks in advance,
>> Zia
>> **********************************************************
>> Zia Wadud
>> PhD Student
>> Centre for Transport Studies
>> Department of Civil and Environmental Engineering
>> Imperial College London
>> London SW7 2AZ
>> Tel +44 (0) 207 594 6055
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894