[R] Select
Jeff Newmiller
jdnewmil using dcn.davis.ca.us
Tue Feb 12 01:26:17 CET 2019
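N <- 8 # however many times you want to do this
ans <- lapply( seq.int( N )
             , function( n ) {
                 # shuffle the rows, then keep rows up to and including the
                 # first one where the running total of count reaches 40
                 idx <- sample( nrow( mydat ) )
                 mydat[ idx[ seq.int( which( 40 <= cumsum( mydat[ idx, "count" ] ) )[ 1 ] ) ], ]
               }
             )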
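
The snippet above expects mydat to exist already. What follows is a minimal self-contained sketch of the same shuffle-and-cumsum idea, using the sample data from the original post. The helper name select_groups, its target argument, and the seed are assumptions made for illustration and do not come from the thread. The comparison is >= so that a running total of exactly 40 (for example G1 plus G2) ends the selection.

```r
# Sample data from the original post
mydat <- read.table( text = 'group count
G1 25
G2 15
G3 12
G4 31
G5 10', header = TRUE, as.is = TRUE )

# Hypothetical helper (not from the thread): shuffle the rows, then keep
# rows up to and including the first one where the running total of
# count reaches target.
select_groups <- function( dat, target = 40 ) {
  # guard: if all groups together fall short, no selection can reach target
  stopifnot( sum( dat$count ) >= target )
  idx <- sample( nrow( dat ) )
  running <- cumsum( dat$count[ idx ] )
  dat[ idx[ seq_len( which( running >= target )[ 1 ] ) ], ]
}

set.seed( 42 )  # arbitrary seed, only so the draws are repeatable
ans <- lapply( seq_len( 8 ), function( n ) select_groups( mydat ) )

# Total count in each of the 8 selections
totals <- sapply( ans, function( d ) sum( d$count ) )
print( totals )
```

Every element of totals should be at least 40, and each selection should fall below 40 once its last group is dropped, because sampling stops as soon as the threshold is reached.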
On Mon, 11 Feb 2019, Val wrote:
> Sorry Jeff and David for not being clear!
>
> The total sample size should be at least 40, but the selection should
> be based on group ID. A different combination of Group ID could give
> at least 40.
> If I select group G1 with a count of 25 and G2 with a count of 15,
> then I get a total of 40, which meets the minimum. So G1 and G2 are
> selected.
> G1 25
> G2 15
>
> In another scenario, if G2, G3 and G4 are selected then the total
> count will be 58 which is greater than 40. So G2 , G3 and G4 could
> be selected.
> G2 15
> G3 12
> G4 31
>
> So the restriction is to find group IDs that give a minimum of 40.
> Once I reach a minimum of 40, stop selecting groups and output
> the data.
>
> I hope this helps
>
>
>
>
> On Mon, Feb 11, 2019 at 5:09 PM Jeff Newmiller <jdnewmil using dcn.davis.ca.us> wrote:
>>
>> This constraint was not clear in your original sample data set. Can you expand the data set to clarify how this requirement REALLY works?
>>
>> On February 11, 2019 3:00:15 PM PST, Val <valkremk using gmail.com> wrote:
>>> Thank you David.
>>>
>>> However, this will not work for me. If a group ID is selected, then all
>>> of its observations should be included.
>>>
>>> On Mon, Feb 11, 2019 at 4:51 PM David L Carlson <dcarlson using tamu.edu>
>>> wrote:
>>>>
>>>> First expand your data frame into a vector where G1 is repeated 25
>>>> times, G2 is repeated 15 times, etc. Then draw random samples of 40
>>>> from that vector:
>>>>
>>>>> grp <- rep(mydat$group, mydat$count)
>>>>> grp.sam <- sample(grp, 40)
>>>>> table(grp.sam)
>>>> grp.sam
>>>> G1 G2 G3 G4 G5
>>>> 10 9 5 13 3
>>>>
>>>> ----------------------------------------
>>>> David L Carlson
>>>> Department of Anthropology
>>>> Texas A&M University
>>>> College Station, TX 77843-4352
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: R-help <r-help-bounces using r-project.org> On Behalf Of Val
>>>> Sent: Monday, February 11, 2019 4:36 PM
>>>> To: r-help using R-project.org (r-help using r-project.org)
>>> <r-help using r-project.org>
>>>> Subject: [R] Select
>>>>
>>>> Hi all,
>>>>
>>>> I have a data frame with two variables, group and its size.
>>>> mydat<- read.table( text='group count
>>>> G1 25
>>>> G2 15
>>>> G3 12
>>>> G4 31
>>>> G5 10' , header = TRUE, as.is = TRUE )
>>>>
>>>> I want to select group IDs randomly (without replacement) until the
>>>> sum of count reaches 40.
>>>> So, in the first case, the data frame could be
>>>> G4 31
>>>> G5 10
>>>>
>>>> In other case, it could be
>>>> G5 10
>>>> G2 15
>>>> G3 12
>>>>
>>>> How do I impose the restriction that the sum of the count variable is at least 40?
>>>>
>>>> Thank you in advance
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> I want to select group ids randomly until I reach the
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> --
>> Sent from my phone. Please excuse my brevity.
>
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil using dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k