[R] reshaping data frame

Chuck Cleland ccleland at optonline.net
Wed Feb 20 20:03:40 CET 2008


On 2/20/2008 1:14 PM, ahimsa campos-arceiz wrote:
> Dear all,
> 
> I'm having a few problems trying to reshape a data frame. I tried with
> reshape{stats} and melt{reshape} but I was missing something. Any help is
> very welcome. Please find details below:
> 
> #################################
> # data in its original shape:
> 
> indiv <- rep(c("A","B"),c(10,10))
> level.1 <- rpois(20, lambda=3)
> covar.1 <- rlnorm(20, 3, 1)
> level.2 <- rpois(20, lambda=3)
> covar.2 <- rlnorm(20, 3, 1)
> my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)
> 
> # the values of level.1 and level.2 represent the number of cases for the
> # particular combination of indiv*level*covar value
> 
> # I would like to do two things:
> # 1. reshape to long, reducing my.dat[,2:5] into two columns: "factor"
> #    (levels = level.1 & level.2) and the covariate
> # 2. create one new row for each case in level.1 and level.2
> 
> # the new reshaped data.frame should look like this:
> 
> # indiv  factor    covar   case.id
> #   A   level.1   4.614105    1
> #   A   level.1   4.614105    2
> #   A   level.2  31.064405    1
> #   A   level.2  31.064405    2
> #   A   level.2  31.064405    3
> #   A   level.2  31.064405    4
> #   A   level.1  19.185784    1
> #   A   level.2  48.455929    1
> #   A   level.2  48.455929    2
> #   A   level.2  48.455929    3
> # etc...
> 
> #############################

   Maybe there is a better way, but this seems to do what you want:

#################################
# data in its original shape:

indiv <- rep(c("A","B"),c(10,10))
level.1 <- rpois(20, lambda=3)
covar.1 <- rlnorm(20, 3, 1)
level.2 <- rpois(20, lambda=3)
covar.2 <- rlnorm(20, 3, 1)
my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)

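# Stack the two level/covar column pairs into long format.  reshape()
# adds 'level' (1 or 2) and, since my.dat has no 'case.id' column,
# creates 'case.id' holding the original row number (1 to 20).
long <- reshape(my.dat,
                varying = list(c("level.1", "level.2"),
                               c("covar.1", "covar.2")),
                v.names = c("ncases", "covar"),
                timevar = "level", idvar = "case.id",
                direction = "long")

# Repeat each row of 'long' ncases times (rows with ncases == 0 drop out).
newdf <- with(long, data.frame(indiv   = rep(indiv,   ncases),
                               level   = rep(level,   ncases),
                               covar   = rep(covar,   ncases),
                               case.id = rep(case.id, ncases)))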

   The idea is to reshape() to long format first and then rep() each
variable by ncases, so that every row is repeated once per case.  The
level column comes out as 1 and 2; you can then convert newdf$level
into a factor if you like.
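
   For what it is worth, here is a small worked version of the same idea
on made-up numbers, so the result can be checked by eye.  One caveat: in
the code above, case.id holds the original row number of my.dat, not a
counter that restarts at 1 for each row as in the requested layout.  If
the running counter is what is wanted, sequence(ncases) is one way to
get it.  The names toy, toy.long, toy.new and row.id, the toy values,
the times= argument and the order() step are all assumptions made for
this sketch and are not part of the code above.

```r
# Tiny fixed data set (made-up values) in the same wide layout as my.dat.
toy <- data.frame(indiv   = c("A", "A", "B"),
                  level.1 = c(2L, 1L, 0L),
                  covar.1 = c(4.6, 19.2, 7.5),
                  level.2 = c(4L, 3L, 1L),
                  covar.2 = c(31.1, 48.5, 12.0))

# Same kind of reshape() call as above.  'times' is supplied so that the
# level column carries the labels "level.1" / "level.2" instead of 1 / 2.
# 'row.id' does not exist in toy, so reshape() fills it with 1:nrow(toy).
toy.long <- reshape(toy,
                    varying = list(c("level.1", "level.2"),
                                   c("covar.1", "covar.2")),
                    v.names = c("ncases", "covar"),
                    timevar = "level", times = c("level.1", "level.2"),
                    idvar = "row.id", direction = "long")

# Optional: sort by original row so that level.1 and level.2 of the
# same row end up next to each other.
toy.long <- toy.long[order(toy.long$row.id, toy.long$level), ]

# rep(x, counts) repeats x[i] counts[i] times; sequence(counts) numbers
# the copies 1..counts[i] within each row of toy.long.
toy.new <- with(toy.long,
                data.frame(indiv   = rep(indiv, ncases),
                           level   = factor(rep(level, ncases)),
                           covar   = rep(covar, ncases),
                           case.id = sequence(ncases)))
```

   With these toy values toy.new has 11 rows (2 + 4 + 1 + 3 + 0 + 1), and
case.id runs 1,2, 1,2,3,4, 1, 1,2,3, 1.  The row with zero cases simply
disappears.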

> Thank you very much!!
> 
> Ahimsa 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


