[R] Wide to long form conversion

Dennis Murphy djmuser at gmail.com
Fri Oct 7 03:09:47 CEST 2011


Hi:

> I have some data 'myData' in wide form (attached at the end), and
> would like to convert it to long form. I wish to have five variables
> in the result:
>
> 1) Subj: factor
> 2) Group: between-subjects factor (2 levels: s / w)
> 3) Reference: within-subject factor (2 levels: Me / She)
> 4) F: within-subject factor (2 levels: F1 / F2)
> 5) J: within-subject factor (2 levels: J1 / J2)

I don't see how you can get all of 3-5 given the way your data is
structured. The problem is that each column contains two of the three
variables you want, but not all three. I can see a way to get

Subj   Group  Ref   Time  F   J
  S1       s      Me     1     4   5
  S1       s      Me     2     3   6
  S1       s     She    1     6  10
  S1       s     She    2     6   9

or an 8 line version with Ref (4 Me, 4 She), Factor (F1, J1, F2, J2)
repeated twice and the appropriate response vector, but not a way
where you have three columns for Ref, F and J. For example, what is
the 'J' for MeF1 or the F for SheJ2?

With that, here are a few stabs using the reshape2 package. The first
step is to do a little renaming of your data frame so that one can use
the colsplit() function to generate a new set of variables.

names(myData) <- c("Group", "Me_F_1",  "Me_J_1",  "She_F_1", "She_J_1",
                   "Me_F_2",  "Me_J_2",  "She_F_2", "She_J_2", "Subj")

library('plyr')
library('reshape2')

# collapses the eight columns to be reshaped into a factor named
# variable with a corresponding variable named value
mData <- melt(myData, id = c('Subj', 'Group'))
head(mData)

# Split the original variables into three new columns, named
# Ref, Var and Time, respectively:
newvars <- colsplit(mData$variable, '_', c('Ref', 'Var', 'Time'))

# Append these to the melted data frame and remove 'variable'
mData2 <- cbind(mData, newvars)[, -3]

# This comes closest to your original intent:
mData3 <- arrange(mData2, Subj, Ref, Var, Time)
head(mData3, 8)

   Subj Group value Ref Var Time
1    S1     s     4  Me   F    1
2    S1     s     3  Me   F    2
3    S1     s     5  Me   J    1
4    S1     s     6  Me   J    2
5    S1     s     6 She   F    1
6    S1     s     6 She   F    2
7    S1     s    10 She   J    1
8    S1     s     9 She   J    2

# Some rearrangements to consider:
mData4 <- cast(mData3, Subj + Group + Ref + Time ~ Var, value_var = 'value')
head(mData4, 4)

  Subj Group Ref Time  F  J
1   S1     s  Me    1  4  5
2   S1     s  Me    2  3  6
3   S1     s She    1  6 10
4   S1     s She    2  6  9

mData5 <- cast(mData3, Subj + Group + Ref + Var ~ Time, value_var = 'value')
head(mData5, 4)

  Subj Group Ref Var  1  2
1   S1     s  Me   F  4  3
2   S1     s  Me   J  5  6
3   S1     s She   F  6  6
4   S1     s She   J 10  9

If you like this one, it's probably a good idea to rename the last two
columns 'Time1' and 'Time2' or something similar.

HTH,
Dennis

On Thu, Oct 6, 2011 at 1:28 PM, Gang Chen <gangchen6 at gmail.com> wrote:
> I have some data 'myData' in wide form (attached at the end), and
> would like to convert it to long form. I wish to have five variables
> in the result:
>
> 1) Subj: factor
> 2) Group: between-subjects factor (2 levels: s / w)
> 3) Reference: within-subject factor (2 levels: Me / She)
> 4) F: within-subject factor (2 levels: F1 / F2)
> 5) J: within-subject factor (2 levels: J1 / J2)
>
> As this is the first time I'm learning such a conversion, could
> someone help me out?
>
> Many thanks,
> Gang
>
>> myData
>
>   Group MeF1 MeJ1 SheF1 SheJ1 MeF2 MeJ2 SheF2 SheJ2 Subj
> 1      s    4    5     6    10    3    6     6     9   S1
> 2      s    6    5     5     6    4    3     5     6   S2
> 3      s    7    4     6     5    7    4     5     3   S3
> 4      s    8    5     8     7    7    1     8     6   S4
> 5      s   10    6     4     7    9    6     4     6   S5
> 6      s    5    2     4     7    4    1     4     2   S6
> 7      s   13    2    10     4   11    2     4     3   S7
> 8      s    8    1     3    11    6    0     3    10   S8
> 9      s    6    9     5     8    6    8     5     6   S9
> 10     s   14    5     6    10   13    5     5    10  S10
> 11     s   15    2    18     2   14    1    18     2  S11
> 12     s    6    9     4     9    5   11     3     8  S12
> 13     s    5    5     0    12    4    3     0     8  S13
> 14     s    5    6     4     9    4    6     2     6  S14
> 15     s   14    5    12     3   12    3    11     3  S15
> 16     s    7    2    11     3    5    2    10     2  S16
> 17     s    1    7     4     5    1    6     3     5  S17
> 18     s    6    2     7     4    6    2     7     4  S18
> 19     s    9    4     8     5   10    4     6     3  S19
> 20     s    8    2     6     5    9    2     6     4  S20
> 21     s    6    5     5     7    6    6     5     5  S21
> 22     s    8    8     3     7    6    7     5     3  S22
> 23     s   11    4     6     7    1    1     6     4  S23
> 24     s    6    3     2     4    6    4     2     2  S24
> 25     s    4    4     6     6    2    3     4     6  S25
> 26     w    5    9     4     7    3    7     3     5  S26
> 27     w    7    6     3     5    4    1     0     4  S27
> 28     w   10    4    14     2    8    4    10     2  S28
> 29     w    9    7     5     6    8    4     5     3  S29
> 30     w    9    2     7     5    6    2     6     5  S30
> 31     w    6    7     6     7    6    5     5     8  S31
> 32     w    7    6    12     7    6    3    10     7  S32
> 33     w   12    3     8     9   11    3     4     7  S33
> 34     w   12    2    10     5    9    2     6     3  S34
> 35     w    6    3    10     4    5    3     5     3  S35
> 36     w    9    3     9     9    6    3     7     8  S36
> 37     w    5   11     7     7    4   11     3     4  S37
> 38     w    7    4     4     6    7    3     1     5  S38
> 39     w    6    5     1     8    3    3     0     8  S39
> 40     w   10    3    10     2    7    3     7     2  S40
> 41     w    1   11     7     5    1    8     4     3  S41
> 42     w   10    5     6    10   10    4     3     9  S42
> 43     w    6    3     9     2    4    2     6     0  S43
> 44     w    9    5    11     4    5    4     7     3  S44
> 45     w    8    5     6     3    8    4     2     3  S45
> 46     w    8    4     8     7    4    1     2     6  S46
> 47     w   12    2     6     2   10    1     5     2  S47
> 48     w   10    6     9     8    7    5     7     8  S48
> 49     w   13    6    15     1   12    4    14     0  S49
> 50     w    7    8     1    12    4    7     1    11  S50
> 51     w   12    3     9     4    9    1     7     4  S51
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list