[R] Reshaping matrix of lists as dataframe

Henrique Dallazuanna wwwhsd at gmail.com
Mon Feb 1 12:13:26 CET 2010


You can try this also:

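# Bind each case's three vectors column-wise, then stack the cases.
# lapply (unlike sapply) returns a list even when all cases have the same length.
m <- do.call(rbind, lapply(split(x, rep(seq_len(length(x)/3), each = 3)),
                           do.call, what = cbind))
# Take the lengths from the first vector of each case; unique() would
# drop a length that occurs in more than one case.
n <- sapply(x[seq(1, length(x), by = 3)], length)
dimnames(m) <- list(paste("Case", rep(seq_along(n), n), sep = ""),
                    c("First", "Length", "Value"))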
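
The matrix above carries the case only as a (repeated) row name and has no Sequence column. A minimal sketch of going one step further to the flat data frame Oliver asked for, assuming as in the thread that x always holds three vectors per case; the names groups, blocks, n and flat are only illustrative:

```r
# Data from the thread: nine vectors, three per case.
x <- list(c(1, 2, 4), c(1, 3, 5), c(0, 1, 0),
          c(1, 3, 6, 5), c(3, 4, 4, 4), c(0, 1, 0, 1),
          c(3, 7), c(1, 2), c(0, 1))

# One group of three vectors per case.
groups <- split(x, rep(seq_len(length(x) / 3), each = 3))

# One numeric matrix per case, three columns each.
blocks <- lapply(groups, function(g) do.call(cbind, g))

# Number of rows each case contributes.
n <- unname(sapply(blocks, nrow))

# Stack the cases and label the value columns.
m <- do.call(rbind, blocks)
colnames(m) <- c("First", "Length", "Value")

# Add the case label and the within-case counter.
flat <- data.frame(Case = rep(paste0("Case", seq_along(n)), n),
                   Sequence = sequence(n),
                   m,
                   row.names = NULL)
str(flat)
```

The value columns stay numeric because they are never mixed with the character case labels inside one matrix.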

On Mon, Feb 1, 2010 at 5:58 AM, Oliver Gondring <olihui at gmx.de> wrote:
> Hello William, hello David,
>
>  thanks a lot for helping and keeping me going on what sometimes seems to be
> a long way to R mastery! :)
>
> I found that the two solutions William proposed are, for the moment, easier
> for me to understand than David's (and they have the additional advantage
> of producing the desired data types ('numeric'/'integer') in columns
> 2-5). However, I think all of the code you provided will be extremely helpful
> for learning some new tricks by analyzing it in detail.
>
> For everyone concerned with similar data manipulation tasks, here's a short
> summary of the thread:
>
>>>> The original data (a matrix of _lists_, of course - mea culpa - hence the
>>>> modified name of the thread):
>
> x <- list(c(1,2,4),c(1,3,5),c(0,1,0),
>         c(1,3,6,5),c(3,4,4,4),c(0,1,0,1),
>         c(3,7),c(1,2),c(0,1))
> data <- matrix(x,byrow=TRUE,nrow=3)
> colnames(data) <- c("First", "Length", "Value")
> rownames(data) <- c("Case1", "Case2", "Case3")
>
>> data
>     First     Length    Value
> Case1 Numeric,3 Numeric,3 Numeric,3
> Case2 Numeric,4 Numeric,4 Numeric,4
> Case3 Numeric,2 Numeric,2 Numeric,2
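
The "Numeric,3" entries in that printout are R's way of showing that every cell holds a whole vector. A small sketch of how such a matrix of lists can be inspected, using only the data from this thread:

```r
x <- list(c(1, 2, 4), c(1, 3, 5), c(0, 1, 0),
          c(1, 3, 6, 5), c(3, 4, 4, 4), c(0, 1, 0, 1),
          c(3, 7), c(1, 2), c(0, 1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# The matrix is an ordinary list that also has a dim attribute.
is.list(data)
dim(data)

# [[ ]] with row and column extracts the vector stored in one cell.
cell <- data[["Case2", "First"]]

# [ ] on a whole row returns a named list with one vector per column.
row2 <- data["Case2", ]
str(row2)
```

This is why the solutions below all end in unlist() or cbind(): the values have to be pulled out of the cells before they can become data frame columns.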
>
>
>>>> The desired output (a dataframe of a database-like 'flat' structure):
>
>>      Case Sequence First Length Value
>>   1 Case1        1     1      1     0
>>   2 Case1        2     2      3     1
>>   3 Case1        3     4      5     0
>>   4 Case2        1     1      3     0
>>   5 Case2        2     3      4     1
>>   6 Case2        3     6      4     0
>>   7 Case2        4     5      4     1
>>   8 Case3        1     3      1     0
>>   9 Case3        2     7      2     1
>
>
>>>> Ways to do it:
>
> (1)
>> lengths <- sapply(data[,1], length)
>> data.frame(Case=rep(rownames(data),lengths),
>>            Sequence=sequence(lengths),
>>            apply(data,2,unlist),
>>            row.names=NULL)
>
>> It assumes that sapply(data[,k],length) is the
>> same for all k in 1:ncol(data).
>
> Which, as you correctly inferred from the example dataset (I forgot to
> mention it explicitly), is always the case.
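
If the data ever come from a less trusted source, the assumption can be checked before reshaping. A minimal sketch, where lens and ok are illustrative names:

```r
x <- list(c(1, 2, 4), c(1, 3, 5), c(0, 1, 0),
          c(1, 3, 6, 5), c(3, 4, 4, 4), c(0, 1, 0, 1),
          c(3, 7), c(1, 2), c(0, 1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# lens[i, j] is the length of the vector stored in cell (i, j).
lens <- matrix(sapply(data, length), nrow = nrow(data),
               dimnames = dimnames(data))

# Within each row, every column must have the same length as the first.
ok <- all(lens == lens[, 1])
stopifnot(ok)
```

The comparison recycles lens[, 1] down each column, so every cell is checked against the first cell of its own row.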
>
> (2)
>> data.frame(Case=rep(rownames(data),lengths),
>>            Sequence=sequence(lengths),
>>            lapply(split(data,colnames(data)[col(data)]), unlist),
>>            row.names=NULL)
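
One detail of (2) worth knowing: split() turns the character grouping vector into a factor, so the resulting columns come out in sorted order of their names. "First", "Length", "Value" happen to be sorted already. A sketch of keeping the original column order in general, using a deliberately reordered copy of the data (data2, grp, plain and cols are illustrative names):

```r
x <- list(c(1, 2, 4), c(1, 3, 5), c(0, 1, 0),
          c(1, 3, 6, 5), c(3, 4, 4, 4), c(0, 1, 0, 1),
          c(3, 7), c(1, 2), c(0, 1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# Same data with the columns in a non-alphabetical order.
data2 <- data[, c("Value", "First", "Length")]

# Plain character grouping: the groups come back sorted by name.
plain <- split(data2, colnames(data2)[col(data2)])

# Explicit factor levels keep the groups in the matrix's column order.
grp <- factor(colnames(data2)[col(data2)], levels = colnames(data2))
cols <- lapply(split(data2, grp), unlist)

names(plain)
names(cols)
```

Passing cols to data.frame() in place of the lapply(...) term then yields the columns in the order they have in the matrix.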
>
> (3)
> (David's code with some additions to produce nearly the same output as (1)
> and (2))
> (however there's still one difference: columns 2-5 are 'factors')
>> result <- data.frame(do.call(rbind,
>        sapply(rownames(data),  function(.x) cbind(.x,
>        # those were the rownames
>        cbind(1:length(data[.x, "First"][[1]]),
>        # and that was the incremental counter
>        sapply(data[.x, ],
>        # and finally the values, which unfortunately get turned into characters
>        function(.y) return(.y ) ) ) )  )))
>> colnames(result)[1:2] <- c("Case","Sequence")
>> result
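
The remaining difference in (3) can be removed after the fact by converting the affected columns back to numbers. A minimal sketch, assuming a result shaped like the one above; it rebuilds a comparable character matrix in a compact way (chr and num_cols are illustrative names) rather than reusing David's code verbatim:

```r
x <- list(c(1, 2, 4), c(1, 3, 5), c(0, 1, 0),
          c(1, 3, 6, 5), c(3, 4, 4, 4), c(0, 1, 0, 1),
          c(3, 7), c(1, 2), c(0, 1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# cbind-ing the case label with the numbers coerces everything to character.
chr <- do.call(rbind, lapply(rownames(data), function(r) {
  n <- length(data[[r, "First"]])
  cbind(Case = r, Sequence = seq_len(n), sapply(data[r, ], identity))
}))

# stringsAsFactors = TRUE reproduces the factor columns seen in the thread.
result <- data.frame(chr, stringsAsFactors = TRUE)

# Go through as.character() first: as.numeric() on a factor alone
# would return the internal level codes, not the values.
num_cols <- c("Sequence", "First", "Length", "Value")
result[num_cols] <- lapply(result[num_cols],
                           function(col) as.numeric(as.character(col)))
str(result)
```

The same conversion also works when the columns are plain character vectors, since as.character() then leaves them unchanged.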
>
> Cheers,
> Oliver
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


