[R] Reshaping matrix of vectors as dataframe
William Dunlap
wdunlap at tibco.com
Sun Jan 31 19:47:43 CET 2010
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Oliver Gondring
> Sent: Sunday, January 31, 2010 6:53 AM
> To: r-help at r-project.org
> Subject: [R] Reshaping matrix of vectors as dataframe
>
> Dear R people,
>
> I have to deal with the output of a function which comes as a matrix of
> vectors.
> You can reproduce the structure as given below:
>
> x <- list(c(1,2,4),c(1,3,5),c(0,1,0),
> c(1,3,6,5),c(3,4,4,4),c(0,1,0,1),
> c(3,7),c(1,2),c(0,1))
> data <- matrix(x,byrow=TRUE,nrow=3)
> colnames(data) <- c("First", "Length", "Value")
> rownames(data) <- c("Case1", "Case2", "Case3")
>
> > data
> First Length Value
> Case1 Numeric,3 Numeric,3 Numeric,3
> Case2 Numeric,4 Numeric,4 Numeric,4
> Case3 Numeric,2 Numeric,2 Numeric,2
>
> > data["Case1",]
> $First
> [1] 1 2 4
>
> $Length
> [1] 1 3 5
>
> $Value
> [1] 0 1 0
> --------------------
>
> My goal now is to break the three vectors of each row of the matrix into
> their elements, assigning each element to a certain "Sequence" (which I
> want to be numbered according to the position of the corresponding
> values within the vectors), reshaping the whole as a data frame like this:
>
> Case Sequence First Length Value
>
> Case1 1 1 1 0
> Case1 2 2 3 1
> Case1 3 4 5 0
>
> Case2 1 1 3 0
> Case2 2 3 4 1
> Case2 3 6 4 0
> Case2 4 5 4 1
>
> Case3 1 3 1 0
> Case3 2 7 2 1
The following is not terribly elegant, but is
pretty easy to understand.
> lengths<-sapply(data[,1],length)
> data.frame(Case=rep(rownames(data),lengths),
+ Sequence=sequence(lengths),
+ apply(data,2,unlist),
+ row.names=NULL)
Case Sequence First Length Value
1 Case1 1 1 1 0
2 Case1 2 2 3 1
3 Case1 3 4 5 0
4 Case2 1 1 3 0
5 Case2 2 3 4 1
6 Case2 3 6 4 0
7 Case2 4 5 4 1
8 Case3 1 3 1 0
9 Case3 2 7 2 1
It assumes that sapply(data[,k],length) is the
same for all k in 1:ncol(data). If you do this
often, put it into a function that makes that
check.
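
A minimal sketch of such a function follows. The name
flatten_list_matrix and the use of stopifnot() for the check are
assumptions made for illustration, not part of the original code; the
body is otherwise the apply() version shown above.

```r
# Hypothetical helper; the name flatten_list_matrix is illustrative only.
flatten_list_matrix <- function(data) {
  # Length of the vector in every cell, arranged in the same shape as 'data'.
  n <- matrix(vapply(data, length, integer(1)), nrow = nrow(data))
  # Within each row, every column must hold a vector of the same length.
  stopifnot(all(n == n[, 1]))
  len <- n[, 1]
  data.frame(Case = rep(rownames(data), len),
             Sequence = sequence(len),
             apply(data, 2, unlist),
             row.names = NULL)
}
```

Calling flatten_list_matrix(data) on the example matrix should give
the same data frame as above, and a matrix whose rows mix vector
lengths should stop with an error instead of being silently
misaligned.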
The first version uses the much-maligned (by me) apply() function, so
it wastes effort "simplifying" the results of unlist()
into the columns of a matrix that data.frame() will
immediately pull apart into columns again. The following
avoids apply() but is wordier:
> data.frame(Case=rep(rownames(data),lengths),
+ Sequence=sequence(lengths),
+ lapply(split(data,colnames(data)[col(data)]), unlist),
+ row.names=NULL)
Case Sequence First Length Value
1 Case1 1 1 1 0
2 Case1 2 2 3 1
3 Case1 3 4 5 0
4 Case2 1 1 3 0
5 Case2 2 3 4 1
6 Case2 3 6 4 0
7 Case2 4 5 4 1
8 Case3 1 3 1 0
9 Case3 2 7 2 1
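
One caveat about the split() version: split() converts its grouping
vector to a factor, whose levels are sorted, so the columns come back
in sorted name order. That happens to match First, Length, Value
here. The sketch below is one possible variant that keeps the
matrix's own column order by looping over the column indices; the
variable names len, cols and result are assumptions made for
illustration.

```r
# Rebuild the example matrix from the original question.
x <- list(c(1, 2, 4), c(1, 3, 5), c(0, 1, 0),
          c(1, 3, 6, 5), c(3, 4, 4, 4), c(0, 1, 0, 1),
          c(3, 7), c(1, 2), c(0, 1))
data <- matrix(x, byrow = TRUE, nrow = 3,
               dimnames = list(c("Case1", "Case2", "Case3"),
                               c("First", "Length", "Value")))

# Number of elements contributed by each row (taken from the first column).
len <- sapply(data[, 1], length)

# One unlisted vector per column, visited in the matrix's own column order.
cols <- lapply(seq_len(ncol(data)),
               function(j) unlist(data[, j], use.names = FALSE))
names(cols) <- colnames(data)

result <- data.frame(Case = rep(rownames(data), len),
                     Sequence = sequence(len),
                     cols,
                     row.names = NULL)
```

Like the versions above, this assumes the vectors within each row all
have the same length.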
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
>
> I suspect that there might be an elegant and not too complicated way to
> do this with one or several of the functions provided by the 'reshape'
> package, but due to my lack of experience with R in general, this
> package in particular and the complexity of the task I wasn't able to
> figure out how to do it so far.
>
> Every hint or helpful comment is much appreciated!
>
> Oliver