[R] reshaping some data

Tue Sep 14 19:56:19 CEST 2004

Try this:

is.x <- substr(colnames(x), 1, 1) == "x"   # TRUE if col name starts with x
n.y <- diff(which(c(is.x, TRUE))) - 1      # number of y cols after each x col
x. <- unlist(rep(x[, is.x], n.y))          # repeat each x col once per y col
names(x.) <- NULL
y. <- unlist(x[, !is.x])                   # stack the y cols in column order
DF <- data.frame(x = x., y = y., row.names = NULL)
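To see where the repeat counts come from, here is a sketch that runs the same steps on the example data frame from the quoted post and keeps the intermediate values. The names pos and n.y are only introduced here for illustration. Appending TRUE to is.x acts as a sentinel one position past the last column, so the final x column also gets a count.

```r
# Example data from the quoted post
x <- data.frame(x1 =  1: 5, y11 =  1: 5,
                x2 =  6:10, y21 =  6:10, y22 = 11:15,
                x3 = 11:15, y31 = 16:20,
                x4 = 16:20, y41 = 21:25, y42 = 26:30, y43 = 31:35)

is.x <- substr(colnames(x), 1, 1) == "x"

# positions of the x columns, plus the sentinel past the last column
pos <- which(c(is.x, TRUE))   # 1 3 6 8 12

# distance to the next x column, minus 1 for the x column itself
n.y <- diff(pos) - 1          # 1 2 1 3

x. <- unlist(rep(x[, is.x], n.y))   # x1 once, x2 twice, x3 once, x4 three times
names(x.) <- NULL
y. <- unlist(x[, !is.x])
DF <- data.frame(x = x., y = y., row.names = NULL)

dim(DF)   # 35 2, i.e. 7 y columns times 5 rows
```

Rows 6 to 15 of DF both carry the x2 values (6:10), paired first with y21 and then with y22.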

Sundar Dorai-Raj <sundar.dorai-raj <at> PDF.COM> writes:

: 
: Hi all,
:    I have a data.frame with the following colnames pattern:
: 
: x1 y11 x2 y21 y22 y23 x3 y31 y32 ...
: 
: I.e. I have an x followed by a few y's. What I would like to do is turn 
: this wide format into a tall format with two columns: "x", "y". The 
: structure is that xi needs to be associated with yij (e.g. x1 should 
: be next to y11, x2 should be next to y21, y22, and y23, etc.).
: 
:   x   y
: x1 y11
: x2 y21
: x2 y22
: x2 y23
: x3 y31
: x3 y32
: ...
: 
: I have looked at ?reshape but I didn't see how it could work with this 
: structure. I have a solution using nested for loops (see below), but 
: it's slow and not very efficient. I would like to find a vectorised 
: solution that would achieve the same thing.
: 
: Now, for an example:
: 
: x <- data.frame(x1 =  1: 5, y11 =  1: 5,
:                  x2 =  6:10, y21 =  6:10, y22 = 11:15,
:                  x3 = 11:15, y31 = 16:20,
:                  x4 = 16:20, y41 = 21:25, y42 = 26:30, y43 = 31:35)
: # which are the x columns
: nmx <- grep("^x", names(x))
: # which are the y columns
: nmy <- grep("^y", names(x))
: # grab y values
: y <- unlist(x[nmy])
: # reserve some space for the x's
: z <- vector("numeric", length(y))
: # a loop counter
: k <- 0
: n <- nrow(x)
: seq.n <- seq(n)
: # determine how many times to repeat the x's
: repy <- diff(c(nmx, length(names(x)) + 1)) - 1
: for(i in seq_along(nmx)) {
:    for(j in seq_len(repy[i])) {
:      # store the x values in the appropriate z indices
:      z[seq.n + k * n] <- x[, nmx[i]]
:      # move to next block in z
:      k <- k + 1
:    }
: }
: data.frame(x = z, y = y, row.names = NULL)
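
As a sanity check, the two approaches can be wrapped in functions and compared on the example data. This is only a sketch: the function names tall_loop and tall_vec are made up for this illustration, and the loop version is adapted slightly (seq_along/seq_len instead of seq) so that an x column with no y columns would be skipped.

```r
# Hypothetical helper: nested-loop approach, adapted from the quoted post
tall_loop <- function(x) {
  nmx <- grep("^x", names(x))            # which are the x columns
  nmy <- grep("^y", names(x))            # which are the y columns
  y <- unlist(x[nmy])                    # grab y values
  z <- vector("numeric", length(y))      # reserve space for the x's
  k <- 0
  n <- nrow(x)
  repy <- diff(c(nmx, length(names(x)) + 1)) - 1
  for (i in seq_along(nmx)) {
    for (j in seq_len(repy[i])) {
      z[seq_len(n) + k * n] <- x[, nmx[i]]   # fill the k-th block of z
      k <- k + 1
    }
  }
  data.frame(x = z, y = y, row.names = NULL)
}

# Hypothetical helper: vectorised approach from the reply
tall_vec <- function(x) {
  is.x <- substr(colnames(x), 1, 1) == "x"
  x. <- unlist(rep(x[, is.x], diff(which(c(is.x, TRUE))) - 1))
  names(x.) <- NULL
  y. <- unlist(x[, !is.x])
  data.frame(x = x., y = y., row.names = NULL)
}

x <- data.frame(x1 =  1: 5, y11 =  1: 5,
                x2 =  6:10, y21 =  6:10, y22 = 11:15,
                x3 = 11:15, y31 = 16:20,
                x4 = 16:20, y41 = 21:25, y42 = 26:30, y43 = 31:35)

a <- tall_loop(x)
b <- tall_vec(x)

# The loop stores x as numeric and the vectorised version keeps integer,
# so compare values with == rather than identical()
stopifnot(nrow(a) == nrow(b), all(a$x == b$x), all(a$y == b$y))
```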