[R] Restructuring data
hadley wickham
h.wickham at gmail.com
Mon Jul 16 07:48:20 CEST 2007
On 7/16/07, deepayan.sarkar at gmail.com <deepayan.sarkar at gmail.com> wrote:
> On 7/15/07, Daniel Malter <daniel at umd.edu> wrote:
> > Hi folks,
> >
> > I am new to the list and relatively new to R. I am trying to unstack data
> > "arraywise" and could not find a convenient solution yet. I tried to find a
> > solution for the problem on help archives. I also tried to use the reshape
> > command in R and the reshape package but could not get result. I will
> > illustrate the case below, but the real dataset is quite large so that I
> > would appreciate an easy solution if there is any.
> >
> > The current data structure (variable names):
> >
> > ID, TIME, BUY-A, BUY-B, SELL-A, SELL-B
> >
> > Achieved structure (with the reshape command or the reshape package)
> >
> > ID, TIME, BUY-A
> > ID, TIME, BUY-B
> > ID, TIME, SELL-A
> > ID, TIME, SELL-B
> >
> > This is regular unstacking with two identifier variables. Nothing special
> > though. What I am looking for and did not manage is the following structure:
> >
> > ID, TIME, BUY-A, SELL-A
> > ID, TIME, BUY-B, SELL-B
> >
> > I am quite sure it's pretty easy, but I could not find how to do this.
>
> This seems to work:
>
> > foo <- data.frame(ID = 1:4, TIME=1:4,
> + "BUY-A" = rnorm(4),
> + "BUY-B" = rnorm(4),
> + "SELL-A" = rnorm(4),
> + "SELL-B" = rnorm(4), check.names = FALSE)
> >
> >
> > foo
> ID TIME BUY-A BUY-B SELL-A SELL-B
> 1 1 1 0.47022807 1.09573107 0.1977035 -0.08333043
> 2 2 2 -0.20672870 0.07397772 1.4959044 -0.98555020
> 3 3 3 0.05533779 0.25821758 1.3531913 0.16808307
> 4 4 4 -0.11471772 1.27798740 -0.1101390 -0.36937994
> >
> > reshape(foo, direction="long",
> + varying = list(c("BUY-A", "BUY-B"), c("SELL-A", "SELL-B")),
> + v.names=c("BUY", "SELL"), idvar="ID",
> + times = c("A", "B"), timevar="which")
> ID TIME which BUY SELL
> 1.A 1 1 A 0.47022807 0.19770349
> 2.A 2 2 A -0.20672870 1.49590443
> 3.A 3 3 A 0.05533779 1.35319133
> 4.A 4 4 A -0.11471772 -0.11013896
> 1.B 1 1 B 1.09573107 -0.08333043
> 2.B 2 2 B 0.07397772 -0.98555020
> 3.B 3 3 B 0.25821758 0.16808307
> 4.B 4 4 B 1.27798740 -0.36937994
It's a little more verbose with the reshape package, but I find it
easier to understand what's going on.
fm <- melt(foo, id=c("ID","TIME"))
fm <- cbind(fm, colsplit(fm$variable, "-", c("direction","type")))
fm$variable <- NULL
cast(fm, ... ~ direction)
There's an example like this in the introduction to reshape manual.
Hadley
More information about the R-help
mailing list