[R] help with reshaping data into long format (correct question)
Henrique Dallazuanna
wwwhsd at gmail.com
Wed Jan 16 13:44:33 CET 2008
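Try this:
# Assumes x holds the table as posted: variable names in a factor
# column "id", then one column per subject.
# Copy the "y" flags from the treat_a row (5) into the treat_b row (6)
x[6, which(x[5,]=="y")] <- "y"
# Add "treat" as a level of id so the merged row can be relabelled
levels(x$id) <- c(levels(x$id)[drop=TRUE], "treat")
# Drop the treat_a row; the merged row is now row 5
x <- x[-5,]
x[5, "id"] <- "treat"
# Strip the "ques" prefix: ques1_1 becomes 1_1
levels(x$id) <- gsub("^ques", "", levels(x$id))
# Transpose so that each subject is a row and each variable a column
x3 <- as.data.frame(t(x[,-1]))
names(x3) <- x$id
# Expand one subject's row into one row per question
foo <- function(x, ...)
{
tmp <- as.numeric(as.character(unlist(x[,grep("_", names(x), value=TRUE)])))
y <- x[,c("disease", "age", "city", "sex", "treat")][rep(1,length(tmp)),]
newdf <- data.frame(y, quess=grep("_", names(x), value=TRUE), value=tmp)
return(newdf)
}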
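# x4 is not defined in this message; it is presumably x3 split into
# one-row data frames, one per subject (assumed definition):
x4 <- split(x3, seq_len(nrow(x3)))
do.call(rbind, lapply(x4, foo))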
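
For readers who want something runnable end to end, here is a minimal sketch using base R's reshape(). It is not the code from this reply. It rebuilds the posted table by hand; the positions of the blanks in treat_a and treat_b are an assumption taken from the "treat" row of the desired output, and the names raw, wide, ques_cols and long are made up for illustration.

```r
# Rebuild the posted table: one row per variable, one column per subject.
# ASSUMPTION: blanks in treat_a / treat_b are placed to match the "treat"
# row of the desired output (y for subjects 4, 5, 6 and 9).
raw <- rbind(
  disease = letters[1:10],
  age     = c(23, 40, 32, 34, 25, 32, 22, 35, 29, 21),
  city    = c("NY", "LD", "NY", "SG", "NY", "LD", "VG", "SA", "LD", "SG"),
  sex     = c(1, 1, 2, 2, 2, 2, 1, 1, 1, 2),
  treat_a = c("", "", "", "y", "y", "y", "", "", "y", ""),
  treat_b = c("n", "n", "n", "", "", "", "n", "n", "", "n"),
  ques1_1 = c(2, 4, 5, 6, 8, 3, 1, 2, 4, 5),
  ques1_2 = c(6, 4, 5, 12, 10, 9, 8, 4, 5, 7),
  ques1_3 = c(17, 23, 32, 25, 14, 24, 23, 22, 32, 29),
  ques2_1 = c(4, 7, 9, 10, 6, 8, 5, 7, 8, 9),
  ques2_2 = c(8, 9, 10, 12, 17, 19, 14, 21, 22, 19),
  ques2_3 = c(23, 18, 19, 20, 23, 24, 26, 28, 29, 22),
  ques3_1 = c(5, 7, 9, 1, 4, 7, 9, 8, 10, 5),
  ques3_2 = c(34, 35, 32, 23, 31, 29, 27, 25, 32, 33),
  ques3_3 = c(29, 33, 27, 25, 27, 23, 24, 29, 27, 24)
)
colnames(raw) <- 1:10

# Transpose so that each subject is a row; all columns are character here.
wide <- as.data.frame(t(raw), stringsAsFactors = FALSE)
wide$id <- as.integer(rownames(wide))

# Merge treat_a and treat_b: "y" where treat_a is filled in, otherwise "n".
wide$treat <- ifelse(wide$treat_a == "y", "y", "n")
wide$treat_a <- NULL
wide$treat_b <- NULL

# Restore numeric types lost when the mixed table was stored as text.
ques_cols <- grep("^ques", names(wide), value = TRUE)
num_cols <- c("age", "sex", ques_cols)
wide[num_cols] <- lapply(wide[num_cols], as.numeric)

# Wide to long: one row per subject and question.
long <- reshape(wide, direction = "long",
                varying = ques_cols, v.names = "ques_value",
                timevar = "ques", times = sub("^ques", "", ques_cols),
                idvar = "id")
long <- long[order(long$id, long$ques),
             c("id", "disease", "age", "city", "sex",
               "treat", "ques", "ques_value")]
rownames(long) <- NULL

print(head(long, 3))
```

Unlike the code above, this keeps the id column in the result, as in the requested output.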
On 15/01/2008, Tom Cohen <tom.cohen78 at yahoo.se> wrote:
>
> Dear list,
> I have the following data set
>
> id 1 2 3 4 5 6 7 8 9 10
> disease a b c d e f g h i j
> age 23 40 32 34 25 32 22 35 29 21
> city NY LD NY SG NY LD VG SA LD SG
> sex 1 1 2 2 2 2 1 1 1 2
> treat_a y y y y
> treat_b n n n n n n
> ques1_1 2 4 5 6 8 3 1 2 4 5
> ques1_2 6 4 5 12 10 9 8 4 5 7
> ques1_3 17 23 32 25 14 24 23 22 32 29
> ques2_1 4 7 9 10 6 8 5 7 8 9
> ques2_2 8 9 10 12 17 19 14 21 22 19
> ques2_3 23 18 19 20 23 24 26 28 29 22
> ques3_1 5 7 9 1 4 7 9 8 10 5
> ques3_2 34 35 32 23 31 29 27 25 32 33
> ques3_3 29 33 27 25 27 23 24 29 27 24
>
> where the first row is the header row in a dataframe. First I want to merge the two variables
> treat_a and treat_b to a new variable called "treat" which will be given n if it's left blank
> in the variable treat_a and y if it's left blank in treat_b. The new data set will look like
> id 1 2 3 4 5 6 7 8 9 10
> disease a b c d e f g h i j
> age 23 40 32 34 25 32 22 35 29 21
> city NY LD NY SG NY LD VG SA LD SG
> sex 1 1 2 2 2 2 1 1 1 2
> treat n n n y y y n n y n
> ques1_1 2 4 5 6 8 3 1 2 4 5
> ques1_2 6 4 5 12 10 9 8 4 5 7
> ques1_3 17 23 32 25 14 24 23 22 32 29
> ques2_1 4 7 9 10 6 8 5 7 8 9
> ques2_2 8 9 10 12 17 19 14 21 22 19
> ques2_3 23 18 19 20 23 24 26 28 29 22
> ques3_1 5 7 9 1 4 7 9 8 10 5
> ques3_2 34 35 32 23 31 29 27 25 32 33
> ques3_3 29 33 27 25 27 23 24 29 27 24
> Now I want to reshape the data into long format, with this target output:
>
> id disease age city sex treat ques ques_value
> 1 a 23 NY 1 n 1_1 2
> 1 a 23 NY 1 n 1_2 6
> 1 a 23 NY 1 n 1_3 17
> 1 a 23 NY 1 n 2_1 4
> 1 a 23 NY 1 n 2_2 8
> 1 a 23 NY 1 n 2_3 23
> 1 a 23 NY 1 n 3_1 5
> 1 a 23 NY 1 n 3_2 34
> 1 a 23 NY 1 n 3_3 29
> 2 b 40 LD 1 n 1_1 4
> 2 b 40 LD 1 n 1_2 4
> 2 b 40 LD 1 n 1_3 23
> 2 b 40 LD 1 n 2_1 7
> 2 b 40 LD 1 n 2_2 9
> 2 b 40 LD 1 n 2_3 18
> 2 b 40 LD 1 n 3_1 7
> 2 b 40 LD 1 n 3_2 35
> 2 b 40 LD 1 n 3_3 33
> ..
> ..
> ..
> 10 j 21 SG 2 n 3_3 24
> How can I do this in R?
> Thanks a lot for any help,
> Tom
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" W