[R] sampling

David Winsemius dwinsemius at comcast.net
Thu Feb 17 14:45:24 CET 2011


On Feb 16, 2011, at 11:35 PM, yf wrote:

>
> I want to sample from the ID. For each ID, i want to have 2 set of  
> data. I
> try the sample() function but it didn't work.

You don't say _how_ you used the sample function. When you report that
something "doesn't work", you should show the code you used.

sample() returns a vector of items drawn from an object for which
length() represents some sensible notion of size. It does not "sample"
rows of a complex object such as a dataframe. For a dataframe, length()
is the number of columns, which doesn't agree very well with most
people's notion of the cases from which to sample. To select rows of a
dataframe, you need to first create a vector of numeric indices and then
use that with "[":

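# A random split: draw half of the row numbers without replacement
idx <- sample(nrow(x), nrow(x) %/% 2)
x[  idx, ]   # the rows that were drawn
x[ -idx, ]   # the remaining rows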
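
If what you want is a split _within_ each id rather than across the whole dataframe, the same indexing idea can be applied group by group. What follows is only a minimal sketch: it assumes "2 sets of data" means a random split into roughly equal halves per id, and pick() is a helper name made up for this example. It indexes with sample.int() because sample() treats a single number n as 1:n, which would go wrong for an id that has only one row.

```r
# Example data from the original post
x <- data.frame(id = c(1,1,1,2,2,2,2,3,3,3,4,4), v1 = 1:12, V2 = 12:23)

set.seed(1)  # only so that the split is reproducible

# pick() is a hypothetical helper: given the row numbers belonging to one id,
# it returns a random half of them (rounded down).
pick <- function(rows) rows[sample.int(length(rows), length(rows) %/% 2)]

# Row numbers grouped by id, then a random half drawn within each group
idx <- unlist(lapply(split(seq_len(nrow(x)), x$id), pick), use.names = FALSE)

# A logical index stays correct even if idx turns out to be empty,
# whereas x[-idx, ] would then return zero rows.
in_set1 <- seq_len(nrow(x)) %in% idx
set1 <- x[  in_set1, ]
set2 <- x[ !in_set1, ]
```

With this data the ids have 3, 4, 3 and 2 rows, so set1 gets 1, 2, 1 and 1 rows respectively and set2 gets the rest. If you would rather have the larger half in set1, replace %/% 2 with ceiling(length(rows)/2).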

>
>> x<-data.frame(id=c(1,1,1,2,2,2,2,3,3,3,4,4), v1=c(1:12), V2=c(12:23))
>> x
>   id v1 V2
> 1   1  1 12
> 2   1  2 13
> 3   1  3 14
> 4   2  4 15
> 5   2  5 16
> 6   2  6 17
> 7   2  7 18
> 8   3  8 19
> 9   3  9 20
> 10  3 10 21
> 11  4 11 22
> 12  4 12 23

David Winsemius, MD
West Hartford, CT


