[R] vectorizing: selecting one record per group
Erik Iverson
eriki at ccbr.umn.edu
Wed Oct 13 22:17:16 CEST 2010
Hello,
There are probably many ways to do this, but I think
it's easier if you store your data in a data.frame rather than a matrix.
An easy solution that works directly on the matrix you provide
is escaping me at the moment.
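If the data already live in a matrix like yours, a conversion along these lines should be enough. This is only a sketch: the column names a, b and c are assumed for illustration, and A_df is a made-up name.

```r
# The matrix from the original post (third column is the group)
A <- cbind(rnorm(100), runif(100), rep(c(1, 2, 3, 4, 5), 20))

# Convert to a data.frame; the names a, b, c are assumed for illustration
A_df <- setNames(as.data.frame(A), c("a", "b", "c"))
str(A_df)
```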
One solution, using the plyr package:
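library(plyr)
# Example data: two numeric columns plus a grouping column c (5 groups of 20 rows)
A <- data.frame(a = rnorm(100), b = runif(100), c = rep(c(1, 2, 3, 4, 5), 20))
# Split A by c, draw one random row from each piece, and bind the results together
ddply(A, .(c), function(x) x[sample(nrow(x), 1), ])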
            a         b c
1  0.02995847 0.4763819 1
2  0.72035194 0.2948611 2
3  1.34963917 0.2057488 3
4 -1.99427160 0.1147923 4
5 -0.73612703 0.5889539 5
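A base R alternative that avoids the explicit loop and works on the original matrix could look like the following. Treat it as a sketch rather than the definitive way to do it: split() collects the row numbers of each group, and vapply() draws one row number per group. The names rows_by_group and pick_one are made up for illustration. Indexing with sample.int(length(i), 1) instead of calling sample(i, 1) guards against groups with a single row, because sample() treats a single number n as 1:n.

```r
set.seed(42)  # only so the example is reproducible
A <- cbind(rnorm(100), runif(100), rep(c(1, 2, 3, 4, 5), 20))

# Row numbers of A, grouped by the value in column 3
rows_by_group <- split(seq_len(nrow(A)), A[, 3])

# Draw one element from a vector of row numbers (hypothetical helper)
pick_one <- function(i) i[sample.int(length(i), 1)]

# One randomly chosen row number per group, then the matching rows
INDEX <- vapply(rows_by_group, pick_one, integer(1))
SEL <- A[INDEX, ]
SEL
```

Note that vapply() still loops internally, so this is tidier than the for loop rather than necessarily faster.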
Mauricio Romero wrote:
> Hi,
>
> I want to select a subsample from my data, choosing one record from each
> group. I know how to do this with a for loop.
>
> For example, let's say I have the data:
>
> A = cbind(rnorm(100), runif(100), rep(c(1, 2, 3, 4, 5), 20))
>
> where the third column is the group variable. Then what I want is to select
> 5 observations, one taken at random from each group.
>
> INDEX = NULL
> i = 1
> for (index_g in unique(A[, 3])) {
>   INDEX[i] = sample(which(A[, 3] == index_g), 1)
>   i = i + 1
> }
> SEL = A[INDEX, ]
>
> Is there a way to do this without a “for”? In other words, is there a way to
> “vectorize” this?
>
> Thank you,
>
> Mauricio Romero
> Quantil S.A.S.
> Bogotá, Colombia
> www.quantil.com.co
>
> "It is from the earth that we must find our substance; it is on the earth
> that we must find solutions to the problems that promise to destroy all life
> here"
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.