[R] reshape (a better way)
Jean Eid
jeaneid at chass.utoronto.ca
Wed Jan 26 22:17:52 CET 2005
Thank you Thomas and Chuck for the helpful code. My problem was with the
times variable, and Chuck's example made that really clear. I have just one
more question regarding reshape. When the vectors in varying are not all the
same length, reshape complains, so I listed the names of the variables that
are missing, and then it complained about "undefined columns selected". The
workaround I used was to generate an NA vector in the wide format
for each of the missing variables, after which reshape ran
successfully.
Is there a way to do this without generating these vectors? Here is an
example:
XX <- data.frame(one1=rnorm(10), one2=rnorm(10), one3=rnorm(10),
two1=rnorm(10), two2=rnorm(10))
reshape(XX, direction="long", varying=list(c("one1", "one2", "one3"),
c("two1", "two2")), v.names=c("one", "two"), times=c("1", "2", "3"))
## gives Error in reshapeLong(data, idvar = idvar, timevar = timevar,
## varying = varying, :
## 'varying' arguments must be the same length
reshape(XX, direction="long", varying=list(c("one1", "one2", "one3"),
c("two1", "two2", "two3")), v.names=c("one", "two"), times=c("1", "2", "3"))
## Error in "[.data.frame"(data, , varying[[j]][i]) :
## undefined columns selected
# and finally
XX$two3<-NA
reshape(XX, direction="long", varying=list(c("one1", "one2", "one3"),
c("two1", "two2", "two3")), v.names=c("one", "two"), times=c("1", "2",
"3"))
# I get the correct output.
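
The manual padding step above can at least be automated. The following is only a minimal sketch, assuming every wide-format column is named as a stem followed directly by its time label (as in one1, two2); the names stems, times and missing_cols are illustrative and not part of reshape itself. It still creates the NA columns, but without typing each one by hand.

```r
# Toy data from the example above: "two" has no third measurement.
set.seed(1)
XX <- data.frame(one1 = rnorm(10), one2 = rnorm(10), one3 = rnorm(10),
                 two1 = rnorm(10), two2 = rnorm(10))

stems <- c("one", "two")    # assumed long-format variable names
times <- c("1", "2", "3")   # assumed time labels used as name suffixes

# Full set of expected wide-format names, one character vector per stem.
varying <- lapply(stems, function(s) paste0(s, times))

# Add an all-NA column for each expected name that is absent from XX.
missing_cols <- setdiff(unlist(varying), names(XX))
for (nm in missing_cols) XX[[nm]] <- NA_real_

# All vectors in 'varying' now have the same length and exist in XX.
long <- reshape(XX, direction = "long", varying = varying,
                v.names = stems, times = times)
```

Here missing_cols contains only "two3", so the result matches the hand-written XX$two3 <- NA version.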
Is there a way to avoid generating XX$two3 above?
Thank you
Jean
On Wed, 26 Jan 2005, Thomas Lumley wrote:
> On Wed, 26 Jan 2005, Jean Eid wrote:
>
> >
> > Hi,
> >
> > I am using the NLSY79 data (longitudinal data from the Bureau of Labor
> > Statistics in the US). The extractor extracts this data in a "wide" format and
> > I need to reshape it into a long format.
> >
> > What I am doing right now is to do it in chunks for each and every
> > variable that is varying, and then I merge the data together. This is
> > taking a long time. My question is:
> >
> > How do I specify that there are multiple variables that are varying in
> > reshape. Is there a way to do this?
>
> Yes. The help page says
>
> varying: names of sets of variables in the wide format that correspond
> to single variables in long format ('time-varying'). A list
> of vectors (or optionally a matrix for 'direction="wide"').
> See below for more details and options.
>
> That is, you can use something like
>
> varying=list(c("foo1","foo2","foo3"),
> c("bar1","bar2","bar3"),
> c("baz1","baz2","baz3")),
> v.names=c("foo","bar","baz")
>
>
> -thomas
>
>
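
For reference, Thomas's suggestion quoted above can be tried on a toy data frame. This is a minimal sketch with made-up data; the foo/bar/baz names come from his reply, while the id column, the timevar name and the integer time labels are assumptions chosen for illustration.

```r
# Made-up wide data: three variables, each measured at three times.
set.seed(42)
wide <- data.frame(id = 1:4,
                   foo1 = rnorm(4), foo2 = rnorm(4), foo3 = rnorm(4),
                   bar1 = rnorm(4), bar2 = rnorm(4), bar3 = rnorm(4),
                   baz1 = rnorm(4), baz2 = rnorm(4), baz3 = rnorm(4))

# One vector of wide-format names per long-format variable;
# v.names gives the long-format names in the same order.
long <- reshape(wide, direction = "long", idvar = "id",
                varying = list(c("foo1", "foo2", "foo3"),
                               c("bar1", "bar2", "bar3"),
                               c("baz1", "baz2", "baz3")),
                v.names = c("foo", "bar", "baz"),
                timevar = "time", times = 1:3)
```

All three variables are stacked in a single call, so no per-variable reshape followed by a merge is needed.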