[R] data.frame tall skinny transformation

Sat Oct 24 20:39:50 CEST 2009

You probably have other id variables in featureData. Try specifying
the measured variables (measure.vars) instead of the id variable(s). See

?melt.data.frame

for details.
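
A minimal sketch of what that could look like, on a toy stand-in for
featureData. The extra "chrom" column is hypothetical (the real extra
columns are not shown in this thread), and treating every numeric
column as a measured variable is an assumption that may not hold for
the real data. The melt call is guarded so the snippet still runs if
the reshape package is not installed.

```r
# Toy stand-in for featureData: the 'feature' id column, one extra
# non-measured column (hypothetical), and two cell-line columns.
featureData <- data.frame(
  feature  = c("feature1", "feature2", "feature3"),
  chrom    = c("chr1", "chr2", "chr3"),   # hypothetical extra id-like column
  `5637`   = c(-0.56, -0.91, 0.44),
  `1321N1` = c(-0.93, -0.94, -0.25),
  check.names = FALSE,                    # keep names such as "5637" unchanged
  stringsAsFactors = FALSE
)

# Name the measured columns explicitly. Assumption: the cell-line
# columns are exactly the numeric columns of the data frame.
cellLines <- names(featureData)[sapply(featureData, is.numeric)]

# With only measure.vars given, melt treats all remaining columns
# ('feature' and 'chrom' here) as id variables.
if (requireNamespace("reshape", quietly = TRUE)) {
  tall <- reshape::melt(featureData,
                        measure.vars  = cellLines,
                        variable_name = "cell.line")
  print(head(tall))
}
```

If the cell-line columns are not the only numeric ones, cellLines can
instead be built from the column names directly, for example with
setdiff(names(featureData), c("feature", "chrom")).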

-Ista

On Sat, Oct 24, 2009 at 1:58 PM, Michael Jones <mdjones71 at gmail.com> wrote:
> If I make the data frame smaller:
>
>  featureDataHead = head(featureData)
>  featureDataHead = featureDataHead[ , 1:4]
>  melt(featureDataHead,id.var='feature',variable_name='cell.line')
>
> it works fine.
>
>
>
> On Sat, Oct 24, 2009 at 1:54 PM, Michael Jones <mdjones71 at gmail.com> wrote:
>> Thanks Phil,
>>
>> That worked great for the test case below, but when I tried it on a
>> really big data.frame I got the error:
>>
>> $ melt(featureData,id.var='feature',variable_name='cell.line')
>> Error in data.frame(ids, x, data[, x]) :
>>  arguments imply differing number of rows: 1312, 1, 0
>>
>> featureData has the same structure as x but just with more 'cell.line'
>> columns and features.
>>
>>
>>
>> On Fri, Oct 23, 2009 at 7:52 PM, Phil Spector <spector at stat.berkeley.edu> wrote:
>>> Michael -
>>>   I think the easiest way is to use the melt function
>>> from the reshape package:
>>>
>>>> x
>>>
>>>   feature  5637 1321N1
>>> 1 feature1 -0.56  -0.93
>>> 2 feature2 -0.91  -0.94
>>> 3 feature3  0.44  -0.25
>>>>
>>>> library(reshape)
>>>> melt(x,id.var='feature',variable_name='cell.line')
>>>
>>>   feature cell.line value
>>> 1 feature1      5637 -0.56
>>> 2 feature2      5637 -0.91
>>> 3 feature3      5637  0.44
>>> 4 feature1    1321N1 -0.93
>>> 5 feature2    1321N1 -0.94
>>> 6 feature3    1321N1 -0.25
>>>
>>>
>>>                                        - Phil Spector
>>>                                         Statistical Computing Facility
>>>                                         Department of Statistics
>>>                                         UC Berkeley
>>>                                         spector at stat.berkeley.edu
>>>
>>>
>>> On Fri, 23 Oct 2009, Michael Jones wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a data.frame that looks something like this.
>>>>
>>>>
>>>> feature    5637          1321N1
>>>> feature1  -0.568750616  -0.934748758
>>>> feature2  -0.913080902  -0.941455172
>>>> feature3   0.442477294  -0.257921866
>>>>
>>>> I want to change it to look like this.
>>>>
>>>> feature   cell.line   value
>>>> feature1  5637       -0.568750616
>>>> feature2  5637       -0.913080902
>>>> feature3  5637        0.442477294
>>>> feature1  1321N1     -0.934748758
>>>> feature2  1321N1     -0.941455172
>>>> feature3  1321N1     -0.257921866
>>>>
>>>>
>>>> I have tried to do it with for loops but it is very slow.
>>>>
>>>> # Make Feature data tall skinny
>>>> tsFeatures = c()
>>>> tsCellLines = c()
>>>> tsValues = c()
>>>>
>>>> for(aFeature in as.character(featureData$feature)){
>>>>     print(aFeature)
>>>>     for(cellLine in cellLines){
>>>>           tsCellLines = c(tsCellLines, as.character(cellLine))
>>>>           tsValues = c(tsValues, as.numeric(subset(featureData,
>>>> feature == aFeature, select = c(which(colnames(featureData) %in%
>>>> cellLine)))))
>>>>           tsFeatures = c(tsFeatures, aFeature)
>>>>     }
>>>> }
>>>> tsFeatureData = data.frame(features = tsFeatures, cell.line =
>>>> tsCellLines, value=tsValues)
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>

-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org