[R] sample train and test data using dplyr
Ulrik Stervbo
ulrik.stervbo at gmail.com
Fri Dec 9 07:42:56 CET 2016
df <- data.frame(x = 1:12, y = rnorm(12))
If you use sample:
RowIndex <- sample(1:nrow(df), 5)
TrainSet <- df[RowIndex, ]
TestSet <- df[-RowIndex, ]
Or with dplyr:
library(dplyr)
TrainSet <- sample_n(df, 5)
# keep the rows of df that are not in TrainSet
TestSet <- anti_join(df, TrainSet)
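Since the original question asked about iris, here is a minimal sketch of the same idea on that data set. The 70/30 split, the seed, and the helper column row_id are assumptions made for illustration, not part of the question. The row id is there because iris contains duplicated rows, and an anti-join on the measured values alone would drop every copy of a sampled row from the test set.

```r
library(dplyr)

set.seed(42)  # assumed seed, only to make the split reproducible

# Hypothetical helper column: a unique id per row, so that the
# anti-join matches on row identity rather than on the values.
iris_id <- mutate(iris, row_id = row_number())

# Assumed split: 70% of the rows for training, the rest for testing.
train <- sample_frac(iris_id, 0.7)
test  <- anti_join(iris_id, train, by = "row_id")

# The two sets do not overlap and together cover all of iris.
nrow(train) + nrow(test) == nrow(iris)
```

Drop row_id afterwards with select(train, -row_id) if it should not end up in a model.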
HTH
Ulrik
On Fri, 9 Dec 2016, 06:56 Partha Sinha, <pnsinha68 at gmail.com> wrote:
> How do I get two non-overlapping sets of data?
> Regards
> Parth
>
> On 8 December 2016 at 23:23, Ulrik Stervbo <ulrik.stervbo at gmail.com>
> wrote:
>
> In addition to 'sample', and if you insist on dplyr, you can use
> 'sample_n'.
>
> Best,
> Ulrik
>
> On Thu, 8 Dec 2016 at 18:47 Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
> Usually we expect posters to do their homework by reading necessary R
> documentation and relevant subject matter resources (e.g. on
> clustering) and making a serious attempt to solve the problem by
> offering their code to us as part of a reproducible example of
> how it failed. You have done none of these things, and so you may not
> receive a helpful reply -- or maybe some kind soul will offer one.
>
> I am not such a kind soul. However I will tell you that ?sample is
> probably relevant and that you should read and follow the posting
> guide at the foot of this email to post a coherent query, which, IMO,
> yours is not.
>
> Cheers,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Thu, Dec 8, 2016 at 8:57 AM, Partha Sinha <pnsinha68 at gmail.com> wrote:
> > I want to create two files, train and test, using dplyr (by random
> > sampling). How can I do the same using, let's say, the iris data?
> > Regards
> > Parth
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>