[R] Extracting random rows from a dataset
jim holtman
jholtman at gmail.com
Sun Jan 18 20:21:33 CET 2009
Here is one way to do it:
> x <- matrix(1:100,10)
> x
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1   11   21   31   41   51   61   71   81    91
 [2,]    2   12   22   32   42   52   62   72   82    92
 [3,]    3   13   23   33   43   53   63   73   83    93
 [4,]    4   14   24   34   44   54   64   74   84    94
 [5,]    5   15   25   35   45   55   65   75   85    95
 [6,]    6   16   26   36   46   56   66   76   86    96
 [7,]    7   17   27   37   47   57   67   77   87    97
 [8,]    8   18   28   38   48   58   68   78   88    98
 [9,]    9   19   29   39   49   59   69   79   89    99
[10,]   10   20   30   40   50   60   70   80   90   100
> select <- sample(nrow(x), nrow(x) * .7)
> x[select,] # the randomly selected 70% of the rows
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    3   13   23   33   43   53   63   73   83    93
[2,]    2   12   22   32   42   52   62   72   82    92
[3,]    5   15   25   35   45   55   65   75   85    95
[4,]    9   19   29   39   49   59   69   79   89    99
[5,]    7   17   27   37   47   57   67   77   87    97
[6,]   10   20   30   40   50   60   70   80   90   100
[7,]    8   18   28   38   48   58   68   78   88    98
> x[-select,] # the remaining 30% of the rows (holdout)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1   11   21   31   41   51   61   71   81    91
[2,]    4   14   24   34   44   54   64   74   84    94
[3,]    6   16   26   36   46   56   66   76   86    96
>
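For the data described in the question, the same steps can be sketched as follows. This is only an illustration under assumptions: the data frame `enterprises` is a made-up stand-in with 928 rows and 12 columns, the names `n_select`, `set70`, and `set30` are hypothetical, and `set.seed()` and `floor()` are additions that do not appear in the console session. Since 0.7 * 928 = 649.6 is not a whole number, the sketch rounds down explicitly.

```r
# Hypothetical stand-in for the real data: 928 rows (enterprises), 12 columns.
set.seed(42)  # fix the random seed so the same split can be reproduced
enterprises <- as.data.frame(matrix(rnorm(928 * 12), nrow = 928, ncol = 12))

n <- nrow(enterprises)       # nrow() returns the number of rows: 928
n_select <- floor(0.7 * n)   # 70% of the rows, rounded down: 649

# sample(n, size) draws `size` distinct integers from 1..n.
# replace = FALSE is the default, so no row index is drawn twice.
select <- sample(n, n_select)

# Indexing as [rows, ] with the column position left empty keeps every
# column, so all 12 characteristics of a row stay together.
set70 <- enterprises[select, ]    # the rows whose index was drawn
set30 <- enterprises[-select, ]   # a negative index drops those rows

dim(set70)   # rows and columns of the 70% part
dim(set30)   # rows and columns of the 30% part
```

With a real dataset, the simulated `enterprises` would be replaced by the imported data frame (for example one read with `read.csv()`); the remaining lines should carry over unchanged.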
On Sun, Jan 18, 2009 at 12:35 PM, S.Putoto <rebelshop615 at gmail.com> wrote:
>
> Hello dear R Users,
>
> I am working on a dataset of 928 Enterprises, of which are observed 12
> different characters. I need to randomly sample, without repetition, 70% of
> the enterprises, to create a testing set, and let the other 30% of the
> enterprises be a validating set (holdout validation, I think that is). How
> do I do that? Of course all the characters of each row must remain together.
> Also, I am not very familiar with the R-Base language (it is the first time
> I use it) so if You could also explain to me what every function and
> argument means, it would be great help to then reiterate the procedure.
>
> Thank You very much,
>
> Sebastiano
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?