[R] Subsampling-oversampling from a data frame
B77S
bps0002 at auburn.edu
Wed Nov 2 00:06:48 CET 2011
If no one has a better solution: split the data frame by class, take a
sample of the same size X from each part, and bind the two samples back together.
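
A minimal sketch of that idea in base R. The data frame `dat` below is made up to match the description in the question (1200 rows, 30% low, 70% high); the names `dat`, `idx`, `x`, `keep` and `balanced` are placeholders, and taking X as the size of the smaller class is my assumption, not something the question specifies.

```r
# Hypothetical data standing in for the real data set:
# 1200 rows, 360 "low" (0.3) and 840 "high" (0.7).
set.seed(1)
dat <- data.frame(
  age   = sample(10:20, 1200, replace = TRUE),
  sex   = sample(c("m", "f"), 1200, replace = TRUE),
  class = rep(c("low", "high"), times = c(360, 840)),
  stringsAsFactors = FALSE
)

# Split the row indices by class.
idx <- split(seq_len(nrow(dat)), dat$class)

# Assumption: X is the size of the smaller class, so only the
# larger class ("high") actually loses rows.
x <- min(lengths(idx))

# Draw x rows from each class without replacement.
# sample.int() on the length avoids the sample() surprise
# when a group has a single row.
keep <- unlist(lapply(idx, function(i) i[sample.int(length(i), x)]))

# Put the two samples back together.
balanced <- dat[keep, ]
prop.table(table(balanced$class))
```

To oversample the 'low' class instead, one option is to set `x <- max(lengths(idx))` and draw with `sample.int(length(i), x, replace = TRUE)`, which repeats rows from the smaller class.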
hgwelec wrote:
>
> Dear members,
>
> Consider the following data frame (first 4 rows shown)
>
>
> age  sex  class
> 15   m    low
> 20   f    high
> 15   f    low
> 10   m    low
>
> In my original data set I have 1200 rows and a class distribution of
> low=0.3 and high=0.7.
>
>
> My question: how can I create a new data frame like the one shown above, but
> with the 'high' class subsampled so that in the new data frame the class
> distribution is low=0.5 and high=0.5?
>
> I tried looking at the sample function and its prob option, but all the
> examples I have seen do not deal with an imbalanced class problem like the
> one shown above.
>
>
> Thank you in advance
>