[Bioc-devel] DataFrameList to Wide Format DataFrame

Michael Lawrence
Fri Dec 17 11:43:42 CET 2021


This is more of a support site question.

The stack() function is relevant here, but it won't fill in the missing columns.

Note though that there are some conveniences that might help a tiny
bit, like how colnames(DFL) returns a CharacterList, so you can do
unique(unlist(colnames(DFL))).
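
As a minimal sketch of that convenience (assuming the IRanges and S4Vectors
packages are installed, and reusing the toy DFL from the quoted message
below), the union of column names and the columns each element lacks can be
computed like this. The names allFeatures and missingPerElement are only
illustrative.

```r
library(IRanges)  # provides DataFrameList; attaches S4Vectors for DataFrame

# Toy data from the quoted message: X and Y share only column "b".
DFL <- DataFrameList(X = DataFrame(a = 1:3, b = 3:1, row.names = LETTERS[1:3]),
                     Y = DataFrame(b = 4:6, c = 6:4, row.names = LETTERS[20:22]))

# colnames() on a DataFrameList gives one character vector per element,
# so unlist() + unique() yields the union of all column names.
allFeatures <- unique(unlist(colnames(DFL)))

# Columns absent from each element, computed with plain setdiff() so the
# sketch does not depend on how psetdiff() treats these inputs.
missingPerElement <- lapply(DFL, function(DF) setdiff(allFeatures, colnames(DF)))
```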

In theory we could make [<-() on a DataFrameList behave more like its
SplitDataFrameList derivative and insert columns into each of its
elements, so you could do something like:

DFL[,psetdiff(unique(unlist(colnames(DFL))), colnames(DFL))] <- NA

I don't know if psetdiff() would work in that way, but it could.
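
Until something like that exists, the padding approach from the quoted
message can be wrapped in a small helper so the call site is a single line.
This is only a sketch: padAndBind() is a hypothetical name, not a function
in IRanges or S4Vectors, and it assumes that filling absent columns with NA
is acceptable.

```r
library(IRanges)  # provides DataFrameList; attaches S4Vectors for DataFrame

# Hypothetical helper (not part of any package): pad every element with NA
# columns up to the union of column names, then row-bind the elements.
padAndBind <- function(DFL) {
  allFeatures <- unique(unlist(colnames(DFL)))
  padded <- lapply(DFL, function(DF) {
    # Add each missing column one at a time; the single NA is recycled.
    for (feature in setdiff(allFeatures, colnames(DF)))
      DF[[feature]] <- NA
    # Put the columns in a common order before binding.
    DF[, allFeatures, drop = FALSE]
  })
  do.call(rbind, unname(padded))
}

DFL <- DataFrameList(X = DataFrame(a = 1:3, b = 3:1, row.names = LETTERS[1:3]),
                     Y = DataFrame(b = 4:6, c = 6:4, row.names = LETTERS[20:22]))
wide <- padAndBind(DFL)
```

Reordering the columns before rbind() is a defensive choice, so the sketch
does not rely on rbind() aligning columns by name.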

Michael

On Thu, Dec 16, 2021 at 11:01 PM Dario Strbenac via Bioc-devel
<bioc-devel using r-project.org> wrote:
>
> Hello,
>
> Ah, yes, the sample names should of course be in the rows - Friday afternoon error. In the question, I specified "largely the same set of features", implying that the overlap is not complete. So, the example below will error.
>
> DFL <- DataFrameList(X = DataFrame(a = 1:3, b = 3:1, row.names = LETTERS[1:3]),
>                      Y = DataFrame(b = 4:6, c = 6:4, row.names = LETTERS[20:22]))
> unlist(DFL)
> Error in .aggregate_and_align_all_colnames(all_colnames, strict.colnames = strict.colnames) :
>   the DFrame objects to combine must have the same column names
>
> This is long but works:
>
> allFeatures <- unique(unlist(lapply(DFL, colnames)))
> DFL <- lapply(DFL, function(DF)
> {
>   missingFeatures <- setdiff(allFeatures, colnames(DF))
>   DF[missingFeatures] <- NA
>   DF
> })
> DFLflattened <- do.call(rbind, DFL)
>
> Is there a one-line function for it?
>
> --------------------------------------
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia



-- 
Michael Lawrence
Principal Scientist, Director of Data Science and Statistical Computing
Genentech, A Member of the Roche Group