[R] placing multiple rows in a single row

David Winsemius dwinsemius at comcast.net
Tue Jul 5 16:12:51 CEST 2011


On Jul 5, 2011, at 3:00 AM, Annemarie Verkerk wrote:

> Dear David,
>
> thanks so much, I was able to get it to work for my data! I don't  
> really understand yet how the function works, but it seems extremely  
> useful.

The melt operation creates a "long" data.frame, which is what many  
plotting functions expect.

The dcast function creates a "wide" data.frame. Variables on the LHS  
of the formula are ID variables: their values appear in the first  
columns. The values of variables on the RHS of the formula become  
column names. The remaining value column, which is not in the formula  
(in this case the "value" variable of the melted df), supplies the  
interior entries of the new wide df.
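
To make the formula roles concrete, here is a minimal sketch on a small made-up data frame. The `toy` object, its values, and the explicit `value.var` argument are illustrative assumptions, not the original poster's data or call:

```r
library(reshape2)

# Hypothetical toy data: two ID columns (V1, V2) and two measured columns
toy <- data.frame(V1 = c("John", "John", "Mary", "Mary"),
                  V2 = c("A1", "A2", "A1", "A2"),
                  V3 = c(1, 1, 1, 0),
                  V4 = c(0, 1, 0, 0))

# Long form: one row per (V1, V2, variable) combination
toy.mlt <- melt(toy, id.vars = c("V1", "V2"))

# Wide form: V1 (LHS) stays as an ID column, each V2/variable
# combination (RHS) becomes a column, and "value" fills the cells
toy.wide <- dcast(toy.mlt, V1 ~ V2 + variable, value.var = "value")
toy.wide
```

Moving V2 to the LHS (V1 + V2 ~ variable) should give back one row per V1/V2 pair, which is a quick way to check which side of the formula does what.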

-- 
David
>
> Thanks again!
> Annemarie
>
>>
snipped original question
>>
>> There is a reshape function in the stats package that nobody except  
>> Phil Spector seems to understand, and then there are the reshape and  
>> reshape2 packages that everybody seems to get. (I don't understand  
>> why the classification variables are on the left-hand side, though.  
>> Positionally it makes some sense, but logically it does not connect  
>> with how I understand the process.)
>>
>> require(reshape2)
>> # entered your data with default names V1 V2 V3 V4 V5
>> > nam123
>>      V1 V2 V3 V4 V5
>> 1   John A1  1  0  1
>>
> snipped
>>
>> > nams.mlt <- melt(nam123, id.vars=c("V1", "V2"))
>>
>> > str(nams.mlt)
>> 'data.frame':    36 obs. of  4 variables:
>> $ V1      : Factor w/ 4 levels "John","Josh",..: 1 1 1 3 3 3 4 4 4 2 ...
>> $ V2      : Factor w/ 3 levels "A1","A2","A3": 1 2 3 1 2 3 1 2 3 1 ...
>> $ variable: Factor w/ 3 levels "V3","V4","V5": 1 1 1 1 1 1 1 1 1 1 ...
>> $ value   : int  1 1 1 1 0 1 1 0 1 1 ...
>>
>> > dcast(nams.mlt, V1 ~ V2+variable)
>>     V1 A1_V3 A1_V4 A1_V5 A2_V3 A2_V4 A2_V5 A3_V3 A3_V4 A3_V5
>> 1  John     1     0     1     1     1     1     1     0     0
>> 2  Josh     1     0     0    NA    NA    NA     0     0     0
>> 3  Mary     1     0     1     0     0     1     1     1     0
>> 4 Peter     1     0     0     0     0     1     1     1     1
>>
>> You can always change the names of the data.frame if you want, and  
>> in this case it would be a simple sub() operation on the "_"  
>> separator. Personally I would substitute "." rather than "".
>
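
The renaming mentioned in the quoted text could look something like the sketch below. The `nms` vector is a hypothetical stand-in for the column names of the dcast result, not the actual object from the session:

```r
# Hypothetical column names in the style produced by dcast()
nms <- c("V1", "A1_V3", "A1_V4", "A2_V3")

# sub() replaces the first "_" in each name; fixed = TRUE treats
# the pattern as a literal string rather than a regular expression
new.nms <- sub("_", ".", nms, fixed = TRUE)
new.nms
```

On a real result the same call would be applied to names() of the wide data.frame and assigned back with names(x) <- ...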


David Winsemius, MD
West Hartford, CT


