[R] Sorting and subsetting
Matthew Dowle
mdowle at mdowle.plus.com
Tue Sep 21 12:09:42 CEST 2010
All the solutions in this thread so far use the lapply(split(...)) paradigm,
either directly or indirectly. That paradigm doesn't scale. It is the likely
source of quite a few 'out of memory' errors and performance issues in R.
data.table doesn't do that internally, and its syntax is pretty easy.
> tmp <- data.table(index = gl(2,20), foo = rnorm(40))
> tmp[, .SD[head(order(-foo),5)], by=index]
      index index.1       foo
 [1,]     1       1 1.9677303
 [2,]     1       1 1.2731872
 [3,]     1       1 1.1100931
 [4,]     1       1 0.8194719
 [5,]     1       1 0.6674880
 [6,]     2       2 1.2236383
 [7,]     2       2 0.9606766
 [8,]     2       2 0.8654497
 [9,]     2       2 0.5404112
[10,]     2       2 0.3373457
>
As you can see, it currently repeats the group column (as index.1), which is
a shame (fixing that is on the to-do list).
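For comparison, the lapply(split(...)) paradigm mentioned above looks roughly
like the following in base R. This is only a sketch of the general pattern,
not any particular poster's solution; the names pieces and top5 and the seed
are made up for illustration.

```r
# Base R sketch of the split / lapply / rbind pattern (no data.table needed).
set.seed(1)  # arbitrary seed, only so the example is repeatable
tmp <- data.frame(index = gl(2, 20), foo = rnorm(40))

# split() builds one data.frame per group, lapply() keeps the rows with the
# five largest foo values in each, and rbind() stitches the pieces together.
pieces <- lapply(split(tmp, tmp$index), function(d) {
  d[head(order(-d$foo), 5), ]
})
top5 <- do.call(rbind, pieces)
```

Every group is held as a separate data.frame before the pieces are bound
back together, which is the step that becomes expensive as the number of
groups grows.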
Matthew
http://datatable.r-forge.r-project.org/