[R] data.frame tall skinny transformation
Michael Jones
mdjones71 at gmail.com
Sat Oct 24 19:58:05 CEST 2009
If I make the data frame smaller:
featureDataHead = head(featureData)
featureDataHead = featureDataHead[ , 1:4]
melt(featureDataHead,id.var='feature',variable_name='cell.line')
It works fine.
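
For comparison, the same wide-to-long reshaping can be sketched in base R without the reshape package, which may help narrow down whether the error comes from melt or from the data itself. This is only a sketch under assumptions: the helper name melt_base is made up for illustration (it is not part of reshape), and it assumes that every column other than 'feature' is a numeric cell-line column.

```r
# Hypothetical helper (not part of reshape): stack every column except
# 'feature' into feature / cell.line / value rows.
melt_base <- function(df) {
  measure <- setdiff(colnames(df), "feature")  # cell-line column names
  data.frame(
    feature   = rep(as.character(df[["feature"]]), times = length(measure)),
    cell.line = rep(measure, each = nrow(df)),
    value     = unlist(df[measure], use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

# Toy data mirroring the rounded example from this thread
x <- data.frame(
  feature  = c("feature1", "feature2", "feature3"),
  `5637`   = c(-0.56, -0.91, 0.44),
  `1321N1` = c(-0.93, -0.94, -0.25),
  check.names = FALSE  # keep column names that start with a digit
)

long <- melt_base(x)
print(long)
```

Because it builds each output column with one vectorised call, this avoids growing vectors inside nested loops. If melt_base(featureData) also fails or gives odd results, that would point at the data (for example, its column names) rather than at melt. I have not verified what actually triggers the error quoted below.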
On Sat, Oct 24, 2009 at 1:54 PM, Michael Jones <mdjones71 at gmail.com> wrote:
> Thanks Phil,
>
> That worked great for the test case below, but when I tried it on a
> really big data.frame I got the error:
>
> $ melt(featureData,id.var='feature',variable_name='cell.line')
> Error in data.frame(ids, x, data[, x]) :
> arguments imply differing number of rows: 1312, 1, 0
>
> featureData has the same structure as x but just with more 'cell.line'
> columns and features.
>
>
>
> On Fri, Oct 23, 2009 at 7:52 PM, Phil Spector <spector at stat.berkeley.edu> wrote:
>> Michael -
>> I think the easiest way is to use the melt function
>> from the reshape package:
>>
>>> x
>>
>>    feature  5637 1321N1
>> 1 feature1 -0.56  -0.93
>> 2 feature2 -0.91  -0.94
>> 3 feature3  0.44  -0.25
>>>
>>> library(reshape)
>>> melt(x,id.var='feature',variable_name='cell.line')
>>
>>    feature cell.line value
>> 1 feature1      5637 -0.56
>> 2 feature2      5637 -0.91
>> 3 feature3      5637  0.44
>> 4 feature1    1321N1 -0.93
>> 5 feature2    1321N1 -0.94
>> 6 feature3    1321N1 -0.25
>>
>>
>> - Phil Spector
>> Statistical Computing Facility
>> Department of Statistics
>> UC Berkeley
>> spector at stat.berkeley.edu
>>
>>
>> On Fri, 23 Oct 2009, Michael Jones wrote:
>>
>>> Hi,
>>>
>>> I have a data.frame that looks something like this.
>>>
>>>
>>> feature    5637           1321N1
>>> feature1   -0.568750616   -0.934748758
>>> feature2   -0.913080902   -0.941455172
>>> feature3    0.442477294   -0.257921866
>>>
>>> I want to change it to look like this.
>>>
>>> feature    cell.line   value
>>> feature1   5637        -0.568750616
>>> feature2   5637        -0.913080902
>>> feature3   5637         0.442477294
>>> feature1   1321N1      -0.934748758
>>> feature2   1321N1      -0.941455172
>>> feature3   1321N1      -0.257921866
>>>
>>>
>>> I have tried to do it with for loops but it is very slow.
>>>
>>> # Make Feature data tall skinny
>>> tsFeatures = c()
>>> tsCellLines = c()
>>> tsValues = c()
>>>
>>> for(aFeature in as.character(featureData$feature)){
>>>   print(aFeature)
>>>   for(cellLine in cellLines){
>>>     tsCellLines = c(tsCellLines, as.character(cellLine))
>>>     tsValues = c(tsValues, as.numeric(subset(featureData,
>>>       feature == aFeature,
>>>       select = c(which(colnames(featureData) %in% cellLine)))))
>>>     tsFeatures = c(tsFeatures, aFeature)
>>>   }
>>> }
>>> tsFeatureData = data.frame(features = tsFeatures,
>>>   cell.line = tsCellLines, value = tsValues)
>>>
>>
>