[R] data.frame tall skinny transformation

Michael Jones mdjones71 at gmail.com
Sat Oct 24 19:58:05 CEST 2009


If I make the data frame smaller:

 featureDataHead = head(featureData)
 featureDataHead = featureDataHead[ , 1:4]
 melt(featureDataHead,id.var='feature',variable_name='cell.line')

it works fine.
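
Since the small slice melts cleanly, one way to narrow this down is to look for columns in the full data.frame that differ in kind from those in the slice. The sketch below is only a guess at where to look, not a confirmed cause of the error. It reports duplicated, empty, or NA column names, and value columns that are not numeric. `check_columns` is a hypothetical helper name, and the toy data.frame stands in for featureData.

```r
# Hypothetical helper: flag column names and types that often trip up
# reshaping code. A diagnostic sketch, not an explanation of the error.
check_columns <- function(df, id = "feature") {
  nms <- names(df)
  # Every column except the id column is treated as a value column.
  is_measure  <- is.na(nms) | nms != id
  # Work by position so duplicated names are still inspected one by one.
  numeric_col <- vapply(df, is.numeric, logical(1))
  list(
    duplicated_names  = unique(nms[duplicated(nms)]),
    empty_or_na_names = which(is.na(nms) | nms == ""),
    non_numeric       = which(is_measure & !numeric_col)
  )
}

# Toy stand-in for featureData: one repeated name, one non-numeric column.
toy <- data.frame(
  feature = c("f1", "f2"),
  `5637`  = c(1, 2),
  `5637`  = c(3, 4),
  bad     = c("x", "y"),
  check.names = FALSE
)

res <- check_columns(toy)
print(res)
```

Running `check_columns(featureData)` and comparing against `check_columns(featureDataHead)` would show whether anything outside the first four columns looks different.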



On Sat, Oct 24, 2009 at 1:54 PM, Michael Jones <mdjones71 at gmail.com> wrote:
> Thanks Phil,
>
> That worked great for the test case below but when I tried it on a
> really big data.frame I get the error
>
> $ melt(featureData,id.var='feature',variable_name='cell.line')
> Error in data.frame(ids, x, data[, x]) :
>  arguments imply differing number of rows: 1312, 1, 0
>
> featureData has the same structure as x but just with more 'cell.line'
> columns and features.
>
>
>
> On Fri, Oct 23, 2009 at 7:52 PM, Phil Spector <spector at stat.berkeley.edu> wrote:
>> Michael -
>>   I think the easiest way is to use the melt function
>> from the reshape package:
>>
>>> x
>>
>>   feature  5637 1321N1
>> 1 feature1 -0.56  -0.93
>> 2 feature2 -0.91  -0.94
>> 3 feature3  0.44  -0.25
>>>
>>> library(reshape)
>>> melt(x,id.var='feature',variable_name='cell.line')
>>
>>   feature cell.line value
>> 1 feature1      5637 -0.56
>> 2 feature2      5637 -0.91
>> 3 feature3      5637  0.44
>> 4 feature1    1321N1 -0.93
>> 5 feature2    1321N1 -0.94
>> 6 feature3    1321N1 -0.25
>>
>>
>>                                        - Phil Spector
>>                                         Statistical Computing Facility
>>                                         Department of Statistics
>>                                         UC Berkeley
>>                                         spector at stat.berkeley.edu
>>
>>
>> On Fri, 23 Oct 2009, Michael Jones wrote:
>>
>>> Hi,
>>>
>>> I have a data.frame that looks something like this.
>>>
>>>
>>> feature    5637          1321N1
>>> feature1  -0.568750616  -0.934748758
>>> feature2  -0.913080902  -0.941455172
>>> feature3   0.442477294  -0.257921866
>>>
>>> I want to change it to look like this.
>>>
>>> feature   cell.line   value
>>> feature1  5637       -0.568750616
>>> feature2  5637       -0.913080902
>>> feature3  5637        0.442477294
>>> feature1  1321N1     -0.934748758
>>> feature2  1321N1     -0.941455172
>>> feature3  1321N1     -0.257921866
>>>
>>>
>>> I have tried to do it with for loops but it is very slow.
>>>
>>> # Make Feature data tall skinny
>>> tsFeatures = c()
>>> tsCellLines = c()
>>> tsValues = c()
>>>
>>> for(aFeature in as.character(featureData$feature)){
>>>     print(aFeature)
>>>     for(cellLine in cellLines){
>>>           tsCellLines = c(tsCellLines, as.character(cellLine))
>>>           tsValues = c(tsValues, as.numeric(subset(featureData,
>>>               feature == aFeature,
>>>               select = c(which(colnames(featureData) %in% cellLine)))))
>>>           tsFeatures = c(tsFeatures, aFeature)
>>>     }
>>> }
>>> tsFeatureData = data.frame(features = tsFeatures, cell.line =
>>> tsCellLines, value=tsValues)
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
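
For reference, the reshaping asked for in the original question can also be done in base R without a loop or an add-on package. This is a minimal sketch, assuming a single id column named `feature` and numeric values in every other column. `to_tall` is a hypothetical helper name, and the toy data mirrors the example quoted above.

```r
# Hypothetical helper: wide -> tall using only base R.
# Assumes every non-id column is numeric; the id column of the result
# is always called "feature".
to_tall <- function(df, id = "feature") {
  measure <- which(names(df) != id)   # positions of the value columns
  data.frame(
    feature   = rep(df[[id]], times = length(measure)),
    cell.line = rep(names(df)[measure], each = nrow(df)),
    value     = unlist(df[measure], use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

# Toy data matching the example in the thread.
x <- data.frame(
  feature  = c("feature1", "feature2", "feature3"),
  `5637`   = c(-0.56, -0.91, 0.44),
  `1321N1` = c(-0.93, -0.94, -0.25),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

tall <- to_tall(x)
print(tall)
```

Each vector is built in one call, so nothing grows inside a loop the way `tsValues = c(tsValues, ...)` does in the original attempt.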



