[R] Extracting random rows from a dataset

jim holtman jholtman at gmail.com
Sun Jan 18 20:21:33 CET 2009


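Here is one way to do it: draw 70% of the row numbers at random with
sample(), which samples without replacement by default, then index the
matrix with them. Negative indices give the remaining rows: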

> x <- matrix(1:100,10)
> x
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1   11   21   31   41   51   61   71   81    91
 [2,]    2   12   22   32   42   52   62   72   82    92
 [3,]    3   13   23   33   43   53   63   73   83    93
 [4,]    4   14   24   34   44   54   64   74   84    94
 [5,]    5   15   25   35   45   55   65   75   85    95
 [6,]    6   16   26   36   46   56   66   76   86    96
 [7,]    7   17   27   37   47   57   67   77   87    97
 [8,]    8   18   28   38   48   58   68   78   88    98
 [9,]    9   19   29   39   49   59   69   79   89    99
[10,]   10   20   30   40   50   60   70   80   90   100
> select <- sample(nrow(x), nrow(x) * .7)
> x[select,]  # the randomly selected 70% of the rows
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    3   13   23   33   43   53   63   73   83    93
[2,]    2   12   22   32   42   52   62   72   82    92
[3,]    5   15   25   35   45   55   65   75   85    95
[4,]    9   19   29   39   49   59   69   79   89    99
[5,]    7   17   27   37   47   57   67   77   87    97
[6,]   10   20   30   40   50   60   70   80   90   100
[7,]    8   18   28   38   48   58   68   78   88    98
> x[-select,]  # the remaining 30% of the rows
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1   11   21   31   41   51   61   71   81    91
[2,]    4   14   24   34   44   54   64   74   84    94
[3,]    6   16   26   36   46   56   66   76   86    96
>
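
The same idea should carry over to a data frame of the size described in the question. This is only a sketch: `dat` is a made-up stand-in for the real 928 x 12 dataset, and the names `n_sel`, `set70` and `set30` are hypothetical. `set.seed()` is added so the split can be reproduced, and `round()` makes the row count explicit rather than relying on how `sample()` handles a fractional size.

```r
# Hypothetical stand-in for the real data: 928 rows (enterprises), 12 columns.
set.seed(42)                       # fix the random seed so the split is reproducible
dat <- as.data.frame(matrix(rnorm(928 * 12), nrow = 928, ncol = 12))

n      <- nrow(dat)                # nrow() returns the number of rows (928)
n_sel  <- round(0.7 * n)           # 70% of the rows as a whole number (650)
select <- sample(n, n_sel)         # n_sel distinct row numbers drawn from 1..n;
                                   # replace = FALSE is the default (no repetition)

set70 <- dat[select, ]             # the sampled rows; the empty column slot keeps all columns
set30 <- dat[-select, ]            # negative indices drop those rows, leaving the other 30%

dim(set70)                         # 650 rows, 12 columns
dim(set30)                         # 278 rows, 12 columns
```

Because whole rows are indexed, all 12 columns of each enterprise stay together in both subsets.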


On Sun, Jan 18, 2009 at 12:35 PM, S.Putoto <rebelshop615 at gmail.com> wrote:
>
> Hello dear R Users,
>
> I am working on a dataset of 928 enterprises, for each of which 12
> different characteristics are observed. I need to randomly sample, without
> repetition, 70% of the enterprises to create a testing set, and let the
> other 30% be a validating set (holdout validation, I think that is called).
> How do I do that? Of course, all the characteristics in each row must remain
> together. Also, I am not very familiar with the R base language (this is the
> first time I have used it), so if you could also explain what every function
> and argument means, it would be a great help in repeating the procedure.
>
> Thank you very much,
>
> Sebastiano



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



