[R] Wide to long form conversion
Gang Chen
gangchen6 at gmail.com
Sat Oct 8 02:31:59 CEST 2011
David, thanks a lot for the code! I've learned quite a bit from all
the generous help...
Gang
On Fri, Oct 7, 2011 at 1:37 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Oct 7, 2011, at 1:30 PM, David Winsemius wrote:
>
>>
>> On Oct 7, 2011, at 7:40 AM, Gang Chen wrote:
>>
>>> Jim, I really appreciate your help!
>>>
>>> I like the power of rep_n_stack, but how can I use rep_n_stack to get
>>> the following result?
>>>
>>> Subj Group value Ref Var Time
>>> 1 S1 s 4 Me F 1
>>> 2 S1 s 3 Me F 2
>>> 3 S1 s 5 Me J 1
>>> 4 S1 s 6 Me J 2
>>> 5 S1 s 6 She F 1
>>> 6 S1 s 6 She F 2
>>> 7 S1 s 10 She J 1
>>> 8 S1 s 9 She J 2
>>
>> I was not able to construct a one-step solution with `reshape` that
>> contains all the columns. You can do it in about 4 steps by first making the
>> data "long" and then adding annotation columns. Using just rows 1 and 26 you
>> might get:
>>
>> reshape(myData[c(1,26), ], idvar=c("Group","Subj"),
>> direction="long",
>> varying=2:9,
>> v.names=c("value") )
>> Group Subj time value
>> s.S1.1 s S1 1 4
>> w.S26.1 w S26 1 5
>> s.S1.2 s S1 2 5
>> w.S26.2 w S26 2 9
>> s.S1.3 s S1 3 6
>> w.S26.3 w S26 3 4
>> s.S1.4 s S1 4 10
>> w.S26.4 w S26 4 7
>> s.S1.5 s S1 5 3
>> w.S26.5 w S26 5 3
>> s.S1.6 s S1 6 6
>> w.S26.6 w S26 6 7
>> s.S1.7 s S1 7 6
>> w.S26.7 w S26 7 3
>> s.S1.8 s S1 8 9
>> w.S26.8 w S26 8 5
>>
>> The 'time' variable is not really what you wanted; it refers to the
>> sequence along the original wide column names.
>> You can add the desired Ref, Var and Time columns with these
>> constructions:
>>
>> > str(times<-rep(c(1,2), length=nrow(myData)*8 ) )
>> num [1:408] 1 2 1 2 1 2 1 2 1 2 ...
>> > str(times<-rep(c("F","J"), each=2, length=nrow(myData)*8 ) )
>> chr [1:408] "F" "F" "J" "J" "F" "F" "J" "J" "F" "F" ...
>> > str(times<-rep(c("Me","She"), each=4, length=nrow(myData)*8 ) )
>> chr [1:408] "Me" "Me" "Me" "Me" "She" "She" "She" "She" ...
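
A minimal sketch of how these rep() calls recycle, using a hypothetical count of two subjects instead of the full data. The names n_subj, n_rows and labels are illustrative and not from the thread, and length.out is spelled out where the calls above write length=.

```r
# Hypothetical: 2 subjects, 8 measurements each
n_subj <- 2
n_rows <- n_subj * 8

# Time alternates fastest, Var changes every 2 rows, Ref every 4 rows
Time <- rep(c(1, 2), length.out = n_rows)
Var  <- rep(c("F", "J"), each = 2, length.out = n_rows)
Ref  <- rep(c("Me", "She"), each = 4, length.out = n_rows)

labels <- data.frame(Ref, Var, Time, stringsAsFactors = FALSE)
print(head(labels, 8))
```

The three vectors together cycle through all 8 Ref/Var/Time combinations once per block of 8 rows, so they only line up with the data if each subject's rows are contiguous and in that same order.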
>>
> It occurred to me that the ordering operation probably should have preceded
> the ancillary column creation, so this method is tested:
>
>> longData <- reshape(myData, idvar=c("Group","Subj"),
>> direction="long", #fixed the direction argument
>> varying=2:9,
>> v.names=c("value") )
>> longData <- longData[order(longData$Subj), ]
>> longData$Time <- rep(c(1,2), length=nrow(myData)*8 )
>> longData$Var <- rep(c("F","J"), each=2, length=nrow(myData)*8 )
>> longData$Ref <- rep(c("Me","She"), each=4, length=nrow(myData)*8 )
>>
> Group Subj time value Time Var Ref
> s.S1.1 s S1 1 4 1 F Me
> s.S1.2 s S1 2 5 2 F Me
> s.S1.3 s S1 3 6 1 J Me
> s.S1.4 s S1 4 10 2 J Me
> s.S1.5 s S1 5 3 1 F She
> s.S1.6 s S1 6 6 2 F She
> s.S1.7 s S1 7 6 1 J She
> s.S1.8 s S1 8 9 2 J She
>
>
>>
>> Looking at Jim Lemon's response, I think he just misinterpreted the
>> structure of your data but gave you a perfectly usable response. You could
>> have done much the same thing with a minor modification:
>>
>> > str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE)))
>> 'data.frame': 408 obs. of 4 variables:
>> $ Group : Factor w/ 2 levels "s","w": 1 1 1 1 1 1 1 1 1 1 ...
>> $ Subj : Factor w/ 51 levels "S1","S10","S11",..: 1 12 23 34 45 48 49 50
>> 51 2 ...
>> $ group1: Factor w/ 8 levels "Me.F.1","Me.F.2",..: 1 1 1 1 1 1 1 1 1 1 ...
>> $ value1: int 4 6 7 8 10 5 13 8 6 14 ...
>>
>> Now you can just split apart the 'group1' column with sub() to make the
>> three specified columns.
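
A minimal sketch of that sub() split, assuming the 'group1' labels all follow the "Ref.Var.Time" pattern seen in the str() output (e.g. "Me.F.1"). The toy factor and the names lab, pattern and parts are illustrative only.

```r
# Hypothetical labels in the same "Ref.Var.Time" form as group1
group1 <- factor(c("Me.F.1", "Me.F.2", "Me.J.1", "She.F.2", "She.J.1"))
lab <- as.character(group1)

# One capture group per dot-separated piece; pick each out by backreference
pattern <- "^([^.]+)\\.([^.]+)\\.([^.]+)$"
parts <- data.frame(
  Ref  = sub(pattern, "\\1", lab),
  Var  = sub(pattern, "\\2", lab),
  Time = as.integer(sub(pattern, "\\3", lab)),
  stringsAsFactors = FALSE
)
print(parts)
```

The resulting columns could then be attached to the stacked data frame with cbind().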
>
> Lemon's method has the advantage that it properly carries along the column
> information.
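
The same idea can be sketched with reshape() itself by passing the wide column names through its times argument, so the labels come from the names and not from row position. This assumes the wide columns are named in a "Ref.Var.Time" pattern and ordered as below. Both are guesses, since the thread does not show the actual names. The toy data frame wide reuses the two rows of values printed earlier.

```r
# Hypothetical wide data; column names and their order are assumptions
wide <- data.frame(
  Group = c("s", "w"),
  Me.F.1 = c(4, 5), Me.J.1 = c(5, 9), She.F.1 = c(6, 4), She.J.1 = c(10, 7),
  Me.F.2 = c(3, 3), Me.J.2 = c(6, 7), She.F.2 = c(6, 3), She.J.2 = c(9, 5),
  Subj = c("S1", "S26"),
  stringsAsFactors = FALSE
)

measure <- names(wide)[2:9]

# Store each wide column's name in 'label' instead of a 1..8 counter
long <- reshape(wide, idvar = c("Group", "Subj"), direction = "long",
                varying = measure, v.names = "value",
                times = measure, timevar = "label")

# Split the carried-along names into the three annotation columns
parts <- do.call(rbind, strsplit(long$label, ".", fixed = TRUE))
long$Ref  <- parts[, 1]
long$Var  <- parts[, 2]
long$Time <- as.integer(parts[, 3])

long <- long[order(long$Subj, long$Ref, long$Var, long$Time),
             c("Subj", "Group", "value", "Ref", "Var", "Time")]
print(long[long$Subj == "S1", ])
```

Because each value keeps the name of the column it came from, the labels stay correct whatever order the wide columns happen to be in.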
>
>> --
>> David.
>>
>>>
>>> On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon <jim at bitwrit.com.au> wrote:
>>>>
>>>> On 10/07/2011 07:28 AM, Gang Chen wrote:
>>>>>
>>>>> I have some data 'myData' in wide form (attached at the end), and
>>>>> would like to convert it to long form. I wish to have five variables
>>>>> in the result:
>>>>>
>>>>> 1) Subj: factor
>>>>> 2) Group: between-subjects factor (2 levels: s / w)
>>>>> 3) Reference: within-subject factor (2 levels: Me / She)
>>>>> 4) F: within-subject factor (2 levels: F1 / F2)
>>>>> 5) J: within-subject factor (2 levels: J1 / J2)
>>>>
>>>> Hi Gang,
>>>> I don't know whether this is the format you want, but:
>>>>
>>>> library(prettyR)
>>>> rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE))
>>>>
>>>> Jim
>>>>
>>>
>
> David Winsemius, MD
> West Hartford, CT
>
>