[R] Wide to long form conversion
David Winsemius
dwinsemius at comcast.net
Fri Oct 7 19:37:53 CEST 2011
On Oct 7, 2011, at 1:30 PM, David Winsemius wrote:
>
> On Oct 7, 2011, at 7:40 AM, Gang Chen wrote:
>
>> Jim, I really appreciate your help!
>>
>> I like the power of rep_n_stack, but how can I use rep_n_stack to get
>> the following result?
>>
>> Subj Group value Ref Var Time
>> 1 S1 s 4 Me F 1
>> 2 S1 s 3 Me F 2
>> 3 S1 s 5 Me J 1
>> 4 S1 s 6 Me J 2
>> 5 S1 s 6 She F 1
>> 6 S1 s 6 She F 2
>> 7 S1 s 10 She J 1
>> 8 S1 s 9 She J 2
>
> I was not able to construct a one-step solution with `reshape` that
> contains all the columns. You can do it in about 4 steps by
> first making the data "long" and then adding annotation columns.
> Using just rows 1 and 26 you might get:
>
> reshape(myData[c(1,26), ], idvar=c("Group","Subj"),
> direction="long",
> varying=2:9,
> v.names=c("value") )
> Group Subj time value
> s.S1.1 s S1 1 4
> w.S26.1 w S26 1 5
> s.S1.2 s S1 2 5
> w.S26.2 w S26 2 9
> s.S1.3 s S1 3 6
> w.S26.3 w S26 3 4
> s.S1.4 s S1 4 10
> w.S26.4 w S26 4 7
> s.S1.5 s S1 5 3
> w.S26.5 w S26 5 3
> s.S1.6 s S1 6 6
> w.S26.6 w S26 6 7
> s.S1.7 s S1 7 6
> w.S26.7 w S26 7 3
> s.S1.8 s S1 8 9
> w.S26.8 w S26 8 5
>
> The 'time' variable is not really what you wanted; it refers to the
> sequence along the original wide column names.
> You can add the desired Ref, Var and Time columns with these
> constructions:
>
> > str(times<-rep(c(1,2), length=nrow(myData)*8 ) )
> num [1:408] 1 2 1 2 1 2 1 2 1 2 ...
> > str(times<-rep(c("F","J"), each=2, length=nrow(myData)*8 ) )
> chr [1:408] "F" "F" "J" "J" "F" "F" "J" "J" "F" "F" ...
> > str(times<-rep(c("Me","She"), each=4, length=nrow(myData)*8 ) )
> chr [1:408] "Me" "Me" "Me" "Me" "She" "She" "She" "She" ...
>
It occurred to me that the ordering operation probably should have
preceded the ancillary column creation, so this is the method that was tested:
> longData <- reshape(myData, idvar=c("Group","Subj"),
> direction="long", #fixed the direction argument
> varying=2:9,
> v.names=c("value") )
> longData <- longData[order(longData$Subj), ]
> longData$Time <- rep(c(1,2), length=nrow(myData)*8 )
> longData$Var <- rep(c("F","J"), each=2, length=nrow(myData)*8 )
> longData$Ref <- rep(c("Me","She"), each=4, length=nrow(myData)*8 )
>
Group Subj time value Time Var Ref
s.S1.1 s S1 1 4 1 F Me
s.S1.2 s S1 2 5 2 F Me
s.S1.3 s S1 3 6 1 J Me
s.S1.4 s S1 4 10 2 J Me
s.S1.5 s S1 5 3 1 F She
s.S1.6 s S1 6 6 2 F She
s.S1.7 s S1 7 6 1 J She
s.S1.8 s S1 8 9 2 J She
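
The rep() labels above are only right if the wide columns happen to sit in
the same order as the repeated patterns. As a minimal sketch of an
alternative, the 'time' index can be used to look up the original column
name for each long row and split that name into its parts. The toy data
frame `wide` and its column names (Me.F.1, ..., She.J.2) are assumptions
for illustration only, not the actual layout of myData:

```r
# Hypothetical two-subject stand-in for myData; names and values are made up
wide <- data.frame(
  Group   = c("s", "w"),
  Me.F.1  = c(4, 5),  Me.F.2  = c(3, 9),
  Me.J.1  = c(5, 4),  Me.J.2  = c(6, 7),
  She.F.1 = c(6, 3),  She.F.2 = c(6, 7),
  She.J.1 = c(10, 3), She.J.2 = c(9, 5),
  Subj    = c("S1", "S26"),
  stringsAsFactors = FALSE
)

long <- reshape(wide, idvar = c("Group", "Subj"), direction = "long",
                varying = 2:9, v.names = "value")

# 'time' runs 1..8 along the varying columns, so it indexes their names
src   <- names(wide)[2:9][long$time]
parts <- do.call(rbind, strsplit(src, ".", fixed = TRUE))

long$Ref  <- parts[, 1]
long$Var  <- parts[, 2]
long$Time <- as.integer(parts[, 3])

# Sort by subject, then by original column position
long <- long[order(long$Subj, long$time), ]
```

Because the labels are derived from the column names themselves, they stay
correct whatever order the wide columns are in.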
>
> Looking at Jim Lemon's response, I think he just misinterpreted the
> structure of your data but gave you a perfectly usable response. You
> could have done much the same thing with a minor modification:
>
> > str(rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=1,byrow=TRUE)))
> 'data.frame': 408 obs. of 4 variables:
> $ Group : Factor w/ 2 levels "s","w": 1 1 1 1 1 1 1 1 1 1 ...
> $ Subj : Factor w/ 51 levels "S1","S10","S11",..: 1 12 23 34 45 48 49 50 51 2 ...
> $ group1: Factor w/ 8 levels "Me.F.1","Me.F.2",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ value1: int 4 6 7 8 10 5 13 8 6 14 ...
>
> Now you can just split apart the 'group1' column with sub() to make
> the three specified columns.
Lemon's method has the advantage that it properly carries along the
column information.
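
A minimal sketch of that sub() step, using a stand-in factor rather than
the real rep_n_stack output. Only the first two level names ("Me.F.1",
"Me.F.2") appear in the str() output above; the remaining six are assumed
to follow the same Ref.Var.Time pattern:

```r
# Stand-in for the 'group1' column; levels after the first two are assumed
group1 <- factor(c("Me.F.1", "Me.F.2", "Me.J.1", "Me.J.2",
                   "She.F.1", "She.F.2", "She.J.1", "She.J.2"))
lab <- as.character(group1)

Ref  <- sub("\\..*$", "", lab)                      # text before the first dot
Var  <- sub("^[^.]*\\.([^.]*)\\..*$", "\\1", lab)   # text between the two dots
Time <- as.integer(sub("^.*\\.", "", lab))          # text after the last dot

data.frame(group1, Ref, Var, Time)
```

On the real data the same three sub() calls would be applied to the
'group1' column of the stacked data frame.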
> --
> David.
>
>>
>> On Fri, Oct 7, 2011 at 7:16 AM, Jim Lemon <jim at bitwrit.com.au> wrote:
>>> On 10/07/2011 07:28 AM, Gang Chen wrote:
>>>>
>>>> I have some data 'myData' in wide form (attached at the end), and
>>>> would like to convert it to long form. I wish to have five
>>>> variables
>>>> in the result:
>>>>
>>>> 1) Subj: factor
>>>> 2) Group: between-subjects factor (2 levels: s / w)
>>>> 3) Reference: within-subject factor (2 levels: Me / She)
>>>> 4) F: within-subject factor (2 levels: F1 / F2)
>>>> 5) J: within-subject factor (2 levels: J1 / J2)
>>>
>>> Hi Gang,
>>> I don't know whether this is the format you want, but:
>>>
>>> library(prettyR)
>>> rep_n_stack(myData,matrix(c(2,3,6,7,4,5,8,9),nrow=2,byrow=TRUE))
>>>
>>> Jim
>>>
>>
David Winsemius, MD
West Hartford, CT