[R] comparing reshape's
ivo welch
ivowel at gmail.com
Fri Jun 11 22:36:01 CEST 2010
I thought I would share the following.
System: Mac Pro 2.26GHz, OSX, 8GB of memory (not a constraint), R
2.11.0, 64bit version.
Task: I have a long data set of 2.2 million observations (factor
xid, factor yid, variable zcontent), which I want to map into a sparse
matrix of 948 columns and 16,350 rows. There are two commonly used
functions to accomplish this:
library(stats)
outcome = reshape( subset(mydataframe, select=c(yid,xid,zcontent)),
timevar="yid", idvar="xid", direction="wide" )
takes about 9,600 seconds.
library(reshape)
melted = melt( subset(mydataframe, select=c(yid,xid,zcontent)),
id=c("xid", "yid") )
outcome = cast( melted, xid ~ yid )
takes about 875 seconds.
So, for large reshape jobs from long to wide, the reshape package is
much more efficient (roughly 11 times faster here). YMMV.
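
For anyone who wants to try the comparison without the original data, here is a minimal sketch on a small synthetic data set. The sizes (200 x 30), the 60% fill rate, and the names n_x, n_y, grid, keep, t_base and t_pkg are my assumptions for illustration only; they are not the real data, and timings at this scale will not show the gap reported above. The reshape-package part is skipped if that package is not installed.

```r
# Synthetic long data; sizes are arbitrary and far smaller than the real data set.
set.seed(1)
n_x <- 200   # distinct xid values (rows of the wide result)
n_y <- 30    # distinct yid values (columns of the wide result)

# All (xid, yid) pairs, then keep a random 60% so the wide result has gaps.
grid <- expand.grid(xid = seq_len(n_x), yid = seq_len(n_y))
keep <- sort(sample(nrow(grid), size = 0.6 * nrow(grid)))
mydataframe <- data.frame(
  xid      = factor(grid$xid[keep]),
  yid      = factor(grid$yid[keep]),
  zcontent = rnorm(length(keep))
)

# Base R: stats::reshape(), long to wide.
t_base <- system.time(
  wide_base <- reshape(mydataframe[, c("yid", "xid", "zcontent")],
                       timevar = "yid", idvar = "xid", direction = "wide")
)
print(t_base[["elapsed"]])

# reshape package: melt() then cast(); only run if the package is available.
if (requireNamespace("reshape", quietly = TRUE)) {
  library(reshape)
  t_pkg <- system.time({
    melted    <- melt(mydataframe, id.vars = c("xid", "yid"))
    wide_cast <- cast(melted, xid ~ yid)
  })
  print(c(base = t_base[["elapsed"]], reshape_pkg = t_pkg[["elapsed"]]))
}
```

Increasing n_x and n_y toward the real dimensions should make the difference between the two approaches visible, though the exact ratio will depend on the machine and the R version.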
/iaw
----
Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)