[R] Extracting random rows from a dataset

S.Putoto rebelshop615 at gmail.com
Mon Jan 19 01:23:40 CET 2009


Thank you everybody, problem solved! :)

David Winsemius wrote:
> 
> 
>  > df <- read.table(textConnection(gsub("\\(|\\)", "", var)))  # var from prior posting
>  > df
>    V1 V2
> 1 p1 10
> 2 p1  3
> 3 p1  4
> 4 p2 20
> 5 p2 30
> 6 p2 40
> 7 p3  4
> 8 p3  1
> 9 p1  2
> 
>  > ridxs <- sample(1:nrow(df), floor(0.7*nrow(df)))  # row IDs of the 70% sample
> 
>  > df[ridxs,]
>    V1 V2
> 5 p2 30
> 6 p2 40
> 2 p1  3
> 7 p3  4
> 4 p2 20
> 8 p3  1
> 
>  > df[-ridxs,]
>    V1 V2
> 1 p1 10
> 3 p1  4
> 9 p1  2
> 
> The terms to pay particular attention to in the introductory material  
> are row indexing, dataframe, and negative indexing of dataframes.
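
The same idea scaled up to the 928-row case from the original question can be sketched as below. This is only a minimal sketch: the data frame `enterprises`, its columns `id` and `x`, and the seed value are hypothetical stand-ins, not the poster's actual data.

```r
# Hypothetical stand-in for the 928-enterprise dataset (columns are invented).
set.seed(42)  # assumed seed, only so the split can be reproduced
enterprises <- data.frame(id = seq_len(928), x = rnorm(928))

n <- nrow(enterprises)

# sample() draws without replacement by default, so no row ID is repeated.
ridxs <- sample(seq_len(n), size = floor(0.7 * n))

sampled <- enterprises[ridxs, ]   # row indexing: keep the sampled 70% of rows
holdout <- enterprises[-ridxs, ]  # negative indexing: every row NOT sampled

# Whole rows are selected, so all columns of an enterprise stay together.
nrow(sampled)
nrow(holdout)
```

Because the comma in `df[i, ]` leaves the column position empty, every column is kept for each selected row, which is what keeps the characteristics of each enterprise together.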
> 
> 
> 
> On Jan 18, 2009, at 12:35 PM, S.Putoto wrote:
> 
>>
>> Hello dear R Users,
>>
>> I am working on a dataset of 928 enterprises, each observed on 12
>> different characteristics. I need to randomly sample, without repetition,
>> 70% of the enterprises to create a testing set, and let the other 30% of
>> the enterprises be a validating set (holdout validation, I think that
>> is). How do I do that? Of course, all the characteristics of each row
>> must remain together. Also, I am not very familiar with the base R
>> language (it is the first time I use it), so if you could also explain
>> to me what every function and argument means, it would be a great help
>> for repeating the procedure.
> 
> Really! Don't you think that is a bit much? There are many tutorials
> available online. The terms to pay particular attention to in the
> introductory material are indexing, dataframe, and negative indexing
> of dataframes.
> 
> --
> David Winsemius
> 
>>
>>
>> Thank You very much,
>>
>> Sebastiano
>> -- 
>> View this message in context:
>> http://www.nabble.com/Extracting-random-rows-from-a-dataset-tp21530539p21530539.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Extracting-random-rows-from-a-dataset-tp21530539p21535138.html
Sent from the R help mailing list archive at Nabble.com.



