[R] help with colsplit (reshape)
Ista Zahn
istazahn at gmail.com
Fri Jun 13 17:46:06 CEST 2008
Dear list,
I'm trying to figure out how to use the reshape package to reshape
data from a "wide" format to a "long" format. I have data like this:
pid <- c(1:10)
predA <- c(-1,-2,-1,-2,-1,-2,-1,-2,-1,-2)
predB.1 <- c(0,0,0,1,1,0,0,0,1,1)
predB.2 <- c(2,2,3,3,3,2,2,3,3,3)
predC.1 <- c(10,10,10,10,10,11,11,11,11,11)
predC.2 <- c(12,12,13,13,13,12,12,13,13,13)
out.1 <- c(100:109)
out.2 <- c(200:209)
Data <- data.frame(pid, predA, predB.1, predB.2, predC.1, predC.2,
                   out.1, out.2)
and I want to make it look like this:
head(L.Data <- reshape(Data, varying = list(3:4, 5:6, 7:8),
idvar="pid", v.names=c("PredA", "PredB", "Out"),
timevar="measure.num", times=c(1,2), direction="long"))
    pid predA measure.num PredA PredB Out
1.1   1    -1           1     0    10 100
2.1   2    -2           1     0    10 101
3.1   3    -1           1     0    10 102
4.1   4    -2           1     1    10 103
5.1   5    -1           1     1    10 104
6.1   6    -2           1     0    11 105
Using Hadley's JSS article "Reshaping Data with the reshape Package"
as a guide, I tried the following:
M.Data <- melt(Data, id="pid")
M.Data2 <- cbind(M.Data, colsplit(M.Data$variable, split = ".", names
= c("treatment", "time")))
but this gave a warning and resulted in
head(M.Data2)
  pid variable value treatment time NA. NA..1 NA..2 NA..3 NA..4
1   1    predA    -1        NA   NA  NA    NA    NA    NA    NA
2   2    predA    -2        NA   NA  NA    NA    NA    NA    NA
3   3    predA    -1        NA   NA  NA    NA    NA    NA    NA
4   4    predA    -2        NA   NA  NA    NA    NA    NA    NA
5   5    predA    -1        NA   NA  NA    NA    NA    NA    NA
6   6    predA    -2        NA   NA  NA    NA    NA    NA    NA
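A likely reason for the NA columns, sketched with base R's strsplit rather than colsplit itself: by default a split pattern is read as a regular expression, where an unescaped "." matches any single character, so every character acts as a delimiter. That colsplit treats its split argument the same way is an assumption here, and the names below are only a sample of those in M.Data$variable.

```r
# Sample of the variable names produced by melt(Data, id = "pid")
vars <- c("predA", "predB.1", "out.2")

# Unescaped ".": a regex wildcard, so every character is a delimiter
# and only empty strings are left over.
wildcard <- strsplit(vars, ".")

# fixed = TRUE: "." is taken literally, so only real dots split the name.
literal <- strsplit(vars, ".", fixed = TRUE)

print(wildcard[[1]])  # pieces of "predA" under the wildcard reading
print(literal[[2]])   # pieces of "predB.1" under the literal reading
```

With the wildcard reading, "predA" breaks into five empty pieces, which would fit the extra empty columns seen above.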
I searched the mailing list and found this post: http://tolstoy.newcastle.edu.au/R/e4/help/08/05/11857.html
which led me to try
M.Data2 <- data.frame(M.Data, colsplit(M.Data$variable, split = "\\.",
names = c("treatment", "time")))
which gave:
head(M.Data2)
  pid variable value treatment  time
1   1    predA    -1     predA predA
2   2    predA    -2     predA predA
3   3    predA    -1     predA predA
4   4    predA    -2     predA predA
5   5    predA    -1     predA predA
6   6    predA    -2     predA predA
Closer but no cigar.
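The repeated "predA" is what base R's rbind produces when a one-piece name is bound together with two-piece names: the single piece is recycled to fill the row. Whether colsplit binds its pieces this way is an assumption. The sketch below uses only base R and a sample of the variable names, and pads the missing piece with NA instead of recycling it.

```r
# Sample of the melted variable names; "predA" has no ".time" suffix
variable <- c("predA", "predB.1", "predB.2", "out.1")

# Split on a literal dot (escaped, as in the second attempt)
parts <- strsplit(variable, "\\.")

# Row-binding pieces of unequal length recycles the short ones,
# so the "predA" row becomes "predA" "predA"
recycled <- do.call(rbind, parts)

# Picking the pieces by position pads with NA instead
treatment <- sapply(parts, `[`, 1)
time <- as.integer(sapply(parts, `[`, 2))  # NA where there is no suffix

split.vars <- data.frame(variable, treatment, time)
print(split.vars)
```

The resulting treatment and time columns could then be bound to the melted data with cbind, in place of the colsplit result.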
I would be grateful if someone could tell me (a) how to reshape the
data as described above using the reshape package, (b) what the
difference between split = "." and split = "\\." is, and (c) whether
more information about the colsplit function is available anywhere.
Thank you very much in advance,
Ista