[R] Restructuring data

hadley wickham h.wickham at gmail.com
Mon Jul 16 07:48:20 CEST 2007


On 7/16/07, deepayan.sarkar at gmail.com <deepayan.sarkar at gmail.com> wrote:
> On 7/15/07, Daniel Malter <daniel at umd.edu> wrote:
> > Hi folks,
> >
> > I am new to the list and relatively new to R. I am trying to unstack data
> > "arraywise" and could not find a convenient solution yet. I tried to find a
> > solution for the problem on help archives. I also tried to use the reshape
> > command in R and the reshape package but could not get result. I will
> > illustrate the case below, but the real dataset is quite large so that I
> > would appreciate an easy solution if there is any.
> >
> > The current data structure (variable names):
> >
> > ID, TIME, BUY-A, BUY-B, SELL-A, SELL-B
> >
> > Achieved structure (with the reshape command or the reshape package)
> >
> > ID, TIME, BUY-A
> > ID, TIME, BUY-B
> > ID, TIME, SELL-A
> > ID, TIME, SELL-B
> >
> > This is regular unstacking with two identifier variables. Nothing special
> > though. What I am looking for and did not manage is the following structure:
> >
> > ID, TIME, BUY-A, SELL-A
> > ID, TIME, BUY-B, SELL-B
> >
> > I am quite sure it's pretty easy, but I could not find how to do this.
>
> This seems to work:
>
> > foo <- data.frame(ID = 1:4, TIME=1:4,
> +                   "BUY-A" = rnorm(4),
> +                   "BUY-B" = rnorm(4),
> +                   "SELL-A" = rnorm(4),
> +                   "SELL-B" = rnorm(4), check.names = FALSE)
> >
> >
> > foo
>   ID TIME       BUY-A      BUY-B     SELL-A      SELL-B
> 1  1    1  0.47022807 1.09573107  0.1977035 -0.08333043
> 2  2    2 -0.20672870 0.07397772  1.4959044 -0.98555020
> 3  3    3  0.05533779 0.25821758  1.3531913  0.16808307
> 4  4    4 -0.11471772 1.27798740 -0.1101390 -0.36937994
> >
> > reshape(foo, direction="long",
> +         varying = list(c("BUY-A", "BUY-B"), c("SELL-A", "SELL-B")),
> +         v.names=c("BUY", "SELL"), idvar="ID",
> +         times = c("A", "B"), timevar="which")
>     ID TIME which         BUY        SELL
> 1.A  1    1     A  0.47022807  0.19770349
> 2.A  2    2     A -0.20672870  1.49590443
> 3.A  3    3     A  0.05533779  1.35319133
> 4.A  4    4     A -0.11471772 -0.11013896
> 1.B  1    1     B  1.09573107 -0.08333043
> 2.B  2    2     B  0.07397772 -0.98555020
> 3.B  3    3     B  0.25821758  0.16808307
> 4.B  4    4     B  1.27798740 -0.36937994

It's a little more verbose with the reshape package, but I find it
easier to understand what's going on.

fm <- melt(foo, id=c("ID","TIME"))
fm <- cbind(fm, colsplit(fm$variable, "-", c("direction","type")))
fm$variable <- NULL

cast(fm, ... ~ direction)

There's an example like this in the introduction to reshape manual.

Hadley



More information about the R-help mailing list