[R] Wide to long form conversion
Dennis Murphy
djmuser at gmail.com
Fri Oct 7 03:09:47 CEST 2011
Hi:
> I have some data 'myData' in wide form (attached at the end), and
> would like to convert it to long form. I wish to have five variables
> in the result:
>
> 1) Subj: factor
> 2) Group: between-subjects factor (2 levels: s / w)
> 3) Reference: within-subject factor (2 levels: Me / She)
> 4) F: within-subject factor (2 levels: F1 / F2)
> 5) J: within-subject factor (2 levels: J1 / J2)
I don't see how you can get all of 3-5 given the way your data is
structured. The problem is that each column contains two of the three
variables you want, but not all three. I can see a way to get
  Subj Group Ref Time F  J
  S1   s     Me  1    4  5
  S1   s     Me  2    3  6
  S1   s     She 1    6 10
  S1   s     She 2    6  9
or an 8-line version per subject with Ref (4 Me, 4 She), Factor (F1, J1,
F2, J2) repeated twice and the appropriate response vector, but not a
layout with three separate columns for Ref, F and J. For example, what
is the 'J' for MeF1 or the 'F' for SheJ2?
With that, here are a few stabs using the reshape2 package. The first
step is to do a little renaming of your data frame so that one can use
the colsplit() function to generate a new set of variables.
names(myData) <- c("Group", "Me_F_1", "Me_J_1", "She_F_1", "She_J_1",
                   "Me_F_2", "Me_J_2", "She_F_2", "She_J_2", "Subj")
library('plyr')
library('reshape2')
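
If typing the new names by hand feels error-prone, the same renaming can probably be done with a regular expression. This is only a sketch in base R: the vector old_names is copied from the printed data in the original post, and the pattern assumes every measurement column has the form (Me|She)(F|J)(1|2).

```r
# Original column names as printed in the post
old_names <- c("Group", "MeF1", "MeJ1", "SheF1", "SheJ1",
               "MeF2", "MeJ2", "SheF2", "SheJ2", "Subj")

# Insert "_" between the Ref (Me/She), Var (F/J) and Time (1/2) parts;
# names that do not match the pattern (Group, Subj) pass through unchanged
new_names <- sub("^(Me|She)([FJ])([12])$", "\\1_\\2_\\3", old_names)
print(new_names)
```

Assigning the result with names(myData) <- new_names should then be equivalent to the manual renaming, provided the columns are in the order shown in the post.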
# Collapse the eight measurement columns into two columns: a factor
# named 'variable' (the old column name) and 'value' (the response)
mData <- melt(myData, id.vars = c('Subj', 'Group'))
head(mData)
# Split the original variables into three new columns, named
# Ref, Var and Time, respectively:
newvars <- colsplit(mData$variable, '_', c('Ref', 'Var', 'Time'))
# Append these to the melted data frame and remove 'variable'
mData2 <- cbind(mData, newvars)[, -3]
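
For anyone curious what the colsplit() step amounts to, here is a rough base R equivalent on a few of the renamed labels. It is a sketch, not reshape2's actual implementation; the vector 'variable' below is a small stand-in for mData$variable.

```r
# A few of the renamed column labels, standing in for mData$variable
variable <- c("Me_F_1", "Me_J_1", "She_F_2", "She_J_2")

# Split each label on "_" and stack the pieces into a 3-column matrix
parts <- do.call(rbind, strsplit(variable, "_", fixed = TRUE))

# Name the pieces; Time is converted to integer so it sorts numerically
newvars <- data.frame(Ref  = parts[, 1],
                      Var  = parts[, 2],
                      Time = as.integer(parts[, 3]),
                      stringsAsFactors = FALSE)
print(newvars)
```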
# This comes closest to your original intent:
mData3 <- arrange(mData2, Subj, Ref, Var, Time)
head(mData3, 8)
  Subj Group value Ref Var Time
1   S1     s     4  Me   F    1
2   S1     s     3  Me   F    2
3   S1     s     5  Me   J    1
4   S1     s     6  Me   J    2
5   S1     s     6 She   F    1
6   S1     s     6 She   F    2
7   S1     s    10 She   J    1
8   S1     s     9 She   J    2
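
The 8-line version with a single Factor column (F1, J1, F2, J2) mentioned earlier can be derived from this layout by pasting Var and Time together. A small sketch, using only the S1 rows shown above and typed in by hand so that it runs on its own; the data frame name 'long' is just for illustration:

```r
# The eight S1 rows from the output above, entered manually
long <- data.frame(Subj  = "S1",
                   Group = "s",
                   value = c(4, 3, 5, 6, 6, 6, 10, 9),
                   Ref   = rep(c("Me", "She"), each = 4),
                   Var   = rep(c("F", "J"), each = 2, times = 2),
                   Time  = rep(1:2, times = 4),
                   stringsAsFactors = FALSE)

# Combine Var and Time into one label per measurement, e.g. "F1", "J2"
long$Factor <- paste0(long$Var, long$Time)
print(long[, c("Subj", "Group", "Ref", "Factor", "value")])
```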
# Some rearrangements to consider (dcast() is reshape2's casting
# function for data frames):
mData4 <- dcast(mData3, Subj + Group + Ref + Time ~ Var, value.var = 'value')
head(mData4, 4)
  Subj Group Ref Time F  J
1   S1     s  Me    1 4  5
2   S1     s  Me    2 3  6
3   S1     s She    1 6 10
4   S1     s She    2 6  9
mData5 <- dcast(mData3, Subj + Group + Ref + Var ~ Time, value.var = 'value')
head(mData5, 4)
  Subj Group Ref Var  1 2
1   S1     s  Me   F  4 3
2   S1     s  Me   J  5 6
3   S1     s She   F  6 6
4   S1     s She   J 10 9
If you like this one, it's probably a good idea to rename the last two
columns 'Time1' and 'Time2' or something similar.
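
One way to do that renaming, sketched on a hand-typed copy of the four rows above (the name 'wide' is only for illustration; with the real data you would apply the same line to mData5):

```r
# Hand-typed copy of the four S1 rows; check.names = FALSE keeps the
# non-syntactic column names "1" and "2" as they are
wide <- data.frame(Subj  = "S1",
                   Group = "s",
                   Ref   = c("Me", "Me", "She", "She"),
                   Var   = c("F", "J", "F", "J"),
                   `1`   = c(4, 5, 6, 10),
                   `2`   = c(3, 6, 6, 9),
                   check.names = FALSE,
                   stringsAsFactors = FALSE)

# Look the columns up by name rather than by position
names(wide)[match(c("1", "2"), names(wide))] <- c("Time1", "Time2")
print(names(wide))
```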
HTH,
Dennis
On Thu, Oct 6, 2011 at 1:28 PM, Gang Chen <gangchen6 at gmail.com> wrote:
> I have some data 'myData' in wide form (attached at the end), and
> would like to convert it to long form. I wish to have five variables
> in the result:
>
> 1) Subj: factor
> 2) Group: between-subjects factor (2 levels: s / w)
> 3) Reference: within-subject factor (2 levels: Me / She)
> 4) F: within-subject factor (2 levels: F1 / F2)
> 5) J: within-subject factor (2 levels: J1 / J2)
>
> As this is the first time I'm learning such a conversion, could
> someone help me out?
>
> Many thanks,
> Gang
>
>> myData
>
> Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj
> 1 s 4 5 6 10 3 6 6 9 S1
> 2 s 6 5 5 6 4 3 5 6 S2
> 3 s 7 4 6 5 7 4 5 3 S3
> 4 s 8 5 8 7 7 1 8 6 S4
> 5 s 10 6 4 7 9 6 4 6 S5
> 6 s 5 2 4 7 4 1 4 2 S6
> 7 s 13 2 10 4 11 2 4 3 S7
> 8 s 8 1 3 11 6 0 3 10 S8
> 9 s 6 9 5 8 6 8 5 6 S9
> 10 s 14 5 6 10 13 5 5 10 S10
> 11 s 15 2 18 2 14 1 18 2 S11
> 12 s 6 9 4 9 5 11 3 8 S12
> 13 s 5 5 0 12 4 3 0 8 S13
> 14 s 5 6 4 9 4 6 2 6 S14
> 15 s 14 5 12 3 12 3 11 3 S15
> 16 s 7 2 11 3 5 2 10 2 S16
> 17 s 1 7 4 5 1 6 3 5 S17
> 18 s 6 2 7 4 6 2 7 4 S18
> 19 s 9 4 8 5 10 4 6 3 S19
> 20 s 8 2 6 5 9 2 6 4 S20
> 21 s 6 5 5 7 6 6 5 5 S21
> 22 s 8 8 3 7 6 7 5 3 S22
> 23 s 11 4 6 7 1 1 6 4 S23
> 24 s 6 3 2 4 6 4 2 2 S24
> 25 s 4 4 6 6 2 3 4 6 S25
> 26 w 5 9 4 7 3 7 3 5 S26
> 27 w 7 6 3 5 4 1 0 4 S27
> 28 w 10 4 14 2 8 4 10 2 S28
> 29 w 9 7 5 6 8 4 5 3 S29
> 30 w 9 2 7 5 6 2 6 5 S30
> 31 w 6 7 6 7 6 5 5 8 S31
> 32 w 7 6 12 7 6 3 10 7 S32
> 33 w 12 3 8 9 11 3 4 7 S33
> 34 w 12 2 10 5 9 2 6 3 S34
> 35 w 6 3 10 4 5 3 5 3 S35
> 36 w 9 3 9 9 6 3 7 8 S36
> 37 w 5 11 7 7 4 11 3 4 S37
> 38 w 7 4 4 6 7 3 1 5 S38
> 39 w 6 5 1 8 3 3 0 8 S39
> 40 w 10 3 10 2 7 3 7 2 S40
> 41 w 1 11 7 5 1 8 4 3 S41
> 42 w 10 5 6 10 10 4 3 9 S42
> 43 w 6 3 9 2 4 2 6 0 S43
> 44 w 9 5 11 4 5 4 7 3 S44
> 45 w 8 5 6 3 8 4 2 3 S45
> 46 w 8 4 8 7 4 1 2 6 S46
> 47 w 12 2 6 2 10 1 5 2 S47
> 48 w 10 6 9 8 7 5 7 8 S48
> 49 w 13 6 15 1 12 4 14 0 S49
> 50 w 7 8 1 12 4 7 1 11 S50
> 51 w 12 3 9 4 9 1 7 4 S51
>