[R] Wide to long form conversion

Gang Chen gangchen6 at gmail.com
Sat Oct 8 02:31:59 CEST 2011


David, thanks a lot for the code! I've learned quite a bit from all
the generous help...

Gang

On Fri, Oct 7, 2011 at 1:37 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Oct 7, 2011, at 1:30 PM, David Winsemius wrote:
>
>>
>> On Oct 7, 2011, at 7:40 AM, Gang Chen wrote:
>>
>>> Jim, I really appreciate your help!
>>>
>>> I like the power of rep_n_stack, but how can I use rep_n_stack to get
>>> the following result?
>>>
>>> Subj Group value Ref Var Time
>>> 1    S1     s     4  Me   F    1
>>> 2    S1     s     3  Me   F    2
>>> 3    S1     s     5  Me   J    1
>>> 4    S1     s     6  Me   J    2
>>> 5    S1     s     6 She   F    1
>>> 6    S1     s     6 She   F    2
>>> 7    S1     s    10 She   J    1
>>> 8    S1     s     9 She   J    2
>>
>> I was not able to construct a one-step solution with `reshape` that
>> contains all the columns. You can do it in about 4 steps by first making the
>> data "long" and then adding annotation columns. Using just rows 1 and 26 you
>> might get:
>>
>> reshape(myData[c(1,26), ], idvar=c("Group","Subj"),
>>      direction="long",
>>      varying=2:9,
>>      v.names=c("value") )
>>       Group Subj time value
>> s.S1.1      s   S1    1      4
>> w.S26.1     w  S26    1      5
>> s.S1.2      s   S1    2      5
>> w.S26.2     w  S26    2      9
>> s.S1.3      s   S1    3      6
>> w.S26.3     w  S26    3      4
>> s.S1.4      s   S1    4     10
>> w.S26.4     w  S26    4      7
>> s.S1.5      s   S1    5      3
>> w.S26.5     w  S26    5      3
>> s.S1.6      s   S1    6      6
>> w.S26.6     w  S26    6      7
>> s.S1.7      s   S1    7      6
>> w.S26.7     w  S26    7      3
>> s.S1.8      s   S1    8      9
>> w.S26.8     w  S26    8      5
>>
>> The 'time' variable is not really what you wanted; it refers to the
>> sequence along the original wide column names.
>> You can add the desired Ref, Var and Time columns with these
>> constructions:
>>
>> > str(times<-rep(c(1,2), length=nrow(myData)*8 )  )
>> num [1:408] 1 2 1 2 1 2 1 2 1 2 ...
>> > str(times<-rep(c("F","J"), each=2, length=nrow(myData)*8 )  )
>> chr [1:408] "F" "F" "J" "J" "F" "F" "J" "J" "F" "F" ...
>> > str(times<-rep(c("Me","She"), each=4, length=nrow(myData)*8 )  )
>> chr [1:408] "Me" "Me" "Me" "Me" "She" "She" "She" "She" ...
>>
> It occurred to me that the ordering operation probably should have preceded
> the ancillary column creation, so this method is tested:
>
>> longData <- reshape(myData, idvar=c("Group","Subj"),
>>       direction="long",    #fixed the direction argument
>>      varying=2:9,
>>      v.names=c("value") )
>> longData <- longData[order(longData$Subj), ]
>> longData$Time <- rep(c(1,2), length=nrow(myData)*8 )
>> longData$Var <- rep(c("F","J"), each=2, length=nrow(myData)*8 )
>> longData$Ref <- rep(c("Me","She"), each=4, length=nrow(myData)*8 )
>>
>       Group Subj time value Time Var Ref
> s.S1.1     s   S1    1     4    1   F  Me
> s.S1.2     s   S1    2     5    2   F  Me
> s.S1.3     s   S1    3     6    1   J  Me
> s.S1.4     s   S1    4    10    2   J  Me
> s.S1.5     s   S1    5     3    1   F She
> s.S1.6     s   S1    6     6    2   F She
> s.S1.7     s   S1    7     6    1   J She
> s.S1.8     s   S1    8     9    2   J She
>
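
The rep()-based labels above are assigned purely by position, so they are only right if the wide columns happen to sit in Ref/Var/Time order. A minimal sketch of an alternative, assuming the measurement columns are named in a Ref.Var.Time pattern such as Me.F.1: pass the column names to reshape() as `times`, then split them. The toy data frame `wide`, its column order, and the helper names `meas` and `parts` are illustrative assumptions, not the poster's actual data.

```r
# Toy wide data (hypothetical); names follow an assumed "Ref.Var.Time" pattern
wide <- data.frame(
  Subj  = c("S1", "S26"),
  Group = c("s", "w"),
  Me.F.1 = c(4, 5),  Me.J.1 = c(5, 9),  She.F.1 = c(6, 4),  She.J.1 = c(10, 7),
  Me.F.2 = c(3, 3),  Me.J.2 = c(6, 7),  She.F.2 = c(6, 3),  She.J.2 = c(9, 5),
  stringsAsFactors = FALSE
)

# Measurement columns are everything except the id columns
meas <- setdiff(names(wide), c("Subj", "Group"))

# Stack all measurement columns into one 'value' column; using the column
# names as 'times' records which wide column each value came from
long <- reshape(wide, idvar = c("Subj", "Group"), direction = "long",
                varying = list(meas), v.names = "value",
                timevar = "column", times = meas)

# Split "Me.F.1" into its three parts; fixed = TRUE treats "." literally
parts <- do.call(rbind, strsplit(as.character(long$column), ".", fixed = TRUE))
long$Ref  <- parts[, 1]
long$Var  <- parts[, 2]
long$Time <- as.integer(parts[, 3])

# Order within subject by the three within-subject factors
long <- long[order(long$Subj, long$Ref, long$Var, long$Time), ]
```

Because each label is read from the name of the column the value came from, the result does not depend on how the wide columns are ordered.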
>
>>
>> Looking at Jim Lemon's response, I think he just misinterpreted the
>> structure of your data but gave you a perfectly usable response. You could
>> have done much the same thing with a minor modification:
>>
>> > str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE)))
>> 'data.frame':   408 obs. of  4 variables:
>> $ Group : Factor w/ 2 levels "s","w": 1 1 1 1 1 1 1 1 1 1 ...
>> $ Subj  : Factor w/ 51 levels "S1","S10","S11",..: 1 12 23 34 45 48 49 50 51 2 ...
>> $ group1: Factor w/ 8 levels "Me.F.1","Me.F.2",..: 1 1 1 1 1 1 1 1 1 1 ...
>> $ value1: int  4 6 7 8 10 5 13 8 6 14 ...
>>
>> Now you can just split apart the 'group1' column with sub() to make the
>> three specified columns.
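
One way the sub() step might look, as a sketch: a single regular expression with three capture groups pulls each part out of labels shaped like "Me.F.1". The vector `group1` here is a stand-in built from the level names shown in the str() output, and `pat` and `labs` are illustrative names.

```r
# Stand-in for the 'group1' column (hypothetical values in the same format)
group1 <- factor(c("Me.F.1", "Me.F.2", "Me.J.1", "Me.J.2",
                   "She.F.1", "She.F.2", "She.J.1", "She.J.2"))

labs <- as.character(group1)

# Three dot-separated fields; "\\." matches a literal dot
pat <- "^([^.]+)\\.([^.]+)\\.([^.]+)$"

Ref  <- sub(pat, "\\1", labs)              # first field:  Me / She
Var  <- sub(pat, "\\2", labs)              # second field: F / J
Time <- as.integer(sub(pat, "\\3", labs))  # third field:  1 / 2
```

The three vectors can then be attached to the stacked data frame as new columns, for example with `cbind()` or `$<-`.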
>
> Lemon's method has the advantage that it properly carries along the column
> information.
>
>> --
>> David.
>>
>>>
>>> On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon <jim at bitwrit.com.au> wrote:
>>>>
>>>> On 10/07/2011 07:28 AM, Gang Chen wrote:
>>>>>
>>>>> I have some data 'myData' in wide form (attached at the end), and
>>>>> would like to convert it to long form. I wish to have five variables
>>>>> in the result:
>>>>>
>>>>> 1) Subj: factor
>>>>> 2) Group: between-subjects factor (2 levels: s / w)
>>>>> 3) Reference: within-subject factor (2 levels: Me / She)
>>>>> 4) F: within-subject factor (2 levels: F1 / F2)
>>>>> 5) J: within-subject factor (2 levels: J1 / J2)
>>>>
>>>> Hi Gang,
>>>> I don't know whether this is the format you want, but:
>>>>
>>>> library(prettyR)
>>>> rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE))
>>>>
>>>> Jim
>>>>
>>>
>
> David Winsemius, MD
> West Hartford, CT
>
>


