[R] Convert list of data frames to one data frame
David Winsemius
dw|n@em|u@ @end|ng |rom comc@@t@net
Fri Jun 29 21:49:15 CEST 2018
> On Jun 29, 2018, at 7:28 AM, Sarah Goslee <sarah.goslee using gmail.com> wrote:
>
> Hi,
>
> It isn't super clear to me what you're after.
Agree.
Had a different read of the request. Thought the request was for a first step that "harmonized" the names of the columns and then used `dplyr::bind_rows`:
library(dplyr)
newList <- lapply( employees4List, 'names<-', names(employees4List[[1]]) )
bind_rows(newList)
#---------
   first1 second1
1      Al   Jones
2     Al2   Jones
3    Barb   Smith
4     Al3   Jones
5 Barbara   Smith
6   Carol   Adams
7      Al  Jones2
Might want to wrap suppressWarnings() around the bind_rows() call, since it produced many warnings regarding incongruent factor levels.
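
A self-contained sketch of the same idea, for anyone who wants to run it without dplyr. It assumes every data frame has the same number of columns as the first one, and it builds the example data with stringsAsFactors = FALSE (my addition, not in the original post) so there are no factor levels to reconcile. setNames() stands in for 'names<-', and do.call(rbind, ...) stands in for bind_rows():

```r
# Same values as employees4List in this thread, but with character columns
# (stringsAsFactors = FALSE is an assumption added here) to avoid factor warnings.
employees4List <- list(
  data.frame(first1 = "Al", second1 = "Jones", stringsAsFactors = FALSE),
  data.frame(first2 = c("Al2", "Barb"), second2 = c("Jones", "Smith"),
             stringsAsFactors = FALSE),
  data.frame(first3 = c("Al3", "Barbara", "Carol"),
             second3 = c("Jones", "Smith", "Adams"),
             stringsAsFactors = FALSE),
  data.frame(first4 = "Al", second4 = "Jones2", stringsAsFactors = FALSE)
)

# Step 1: give every data frame the column names of the first one.
target <- names(employees4List[[1]])
newList <- lapply(employees4List, setNames, target)

# Step 2: stack the rows; with identical names, base rbind is enough.
stacked <- do.call(rbind, newList)
rownames(stacked) <- NULL
stacked
```

If the data frames can have differing numbers of columns (2, 4, 6, ...), this will not work as written; the names would have to be generated by position for each data frame instead.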
--
David.
> Is this what you intend?
>
> > dfbycol(employees4BList)
>   first1 last1 first2 last2 first3 last3
> 1     Al Jones   <NA>  <NA>   <NA>  <NA>
> 2     Al Jones   Barb Smith   <NA>  <NA>
> 3     Al Jones   Barb Smith  Carol Adams
> 4     Al Jones   <NA>  <NA>   <NA>  <NA>
> >
> > dfbycol(employees4List)
>   first1  last1  first2 last2 first3 last3
> 1     Al  Jones    <NA>  <NA>   <NA>  <NA>
> 2    Al2  Jones    Barb Smith   <NA>  <NA>
> 3    Al3  Jones Barbara Smith  Carol Adams
> 4     Al Jones2    <NA>  <NA>   <NA>  <NA>
>
>
> If so:
>
> employees4BList = list(
> data.frame(first1 = "Al", second1 = "Jones"),
> data.frame(first1 = c("Al", "Barb"), second1 = c("Jones", "Smith")),
> data.frame(first1 = c("Al", "Barb", "Carol"), second1 = c("Jones",
> "Smith", "Adams")),
> data.frame(first1 = ("Al"), second1 = "Jones"))
>
> employees4List = list(
> data.frame(first1 = ("Al"), second1 = "Jones"),
> data.frame(first2 = c("Al2", "Barb"), second2 = c("Jones", "Smith")),
> data.frame(first3 = c("Al3", "Barbara", "Carol"), second3 = c("Jones",
> "Smith", "Adams")),
> data.frame(first4 = ("Al"), second4 = "Jones2"))
>
> ###
>
> dfbycol <- function(x) {
> x <- lapply(x, function(y)as.vector(t(as.matrix(y))))
> x <- lapply(x, function(y){length(y) <- max(sapply(x, length)); y})
> x <- do.call(rbind, x)
> x <- data.frame(x, stringsAsFactors=FALSE)
> colnames(x) <- paste0(c("first", "last"), rep(seq(1, ncol(x)/2), each=2))
> x
> }
>
> ###
>
> dfbycol(employees4BList)
>
> dfbycol(employees4List)
>
> On Fri, Jun 29, 2018 at 2:36 AM, Ira Sharenow via R-help
> <r-help using r-project.org> wrote:
>> I have a list of data frames which I would like to combine into one data
>> frame doing something like rbind. I wish to combine in column order and
>> not by names. However, there are issues.
>>
>> The number of columns is not the same for each data frame. This is an
>> intermediate step to a problem and the number of columns could be
>> 2, 4, 6, 8, or 10. There might be a few thousand data frames. Another problem
>> is that the names of the columns produced by the first step are garbage.
>>
>> Below is a method that I obtained by asking a question on stack
>> overflow. Unfortunately, my example was not general enough. The code
>> below works for the simple case where the names of the people are
>> consistent. It does not work when the names are realistically not the same.
>>
>> https://stackoverflow.com/questions/50807970/converting-a-list-of-data-frames-not-a-simple-rbind-second-row-to-new-columns/50809432#50809432
>>
>>
>> Please note that the lapply step sets things up except for the column
>> name issue. If I could figure out a way to change the column names, then
>> the bind_rows step will, I believe, work.
>>
>> So I really have two questions. How to change all column names of all
>> the data frames and then how to solve the original problem.
>>
>> # The non general case works fine. It produces one data frame and I can
>> then change the column names to
>>
>> # c("first1", "last1", "first2", "last2", "first3", "last3")
>>
>> #Non general easy case
>>
>> employees4BList = list(data.frame(first1 = "Al", second1 = "Jones"),
>>
>> data.frame(first1 = c("Al", "Barb"), second1 = c("Jones", "Smith")),
>>
>> data.frame(first1 = c("Al", "Barb", "Carol"), second1 = c("Jones",
>> "Smith", "Adams")),
>>
>> data.frame(first1 = ("Al"), second1 = "Jones"))
>>
>> employees4BList
>>
>> bind_rows(lapply(employees4BList, function(x) rbind.data.frame(c(t(x)))))
>>
>> # This produces a nice list of data frames, except for the names
>>
>> lapply(employees4BList, function(x) rbind.data.frame(c(t(x))))
>>
>> # This list is a disaster. I am looking for a solution that works in
>> this case.
>>
>> employees4List = list(data.frame(first1 = ("Al"), second1 = "Jones"),
>>
>> data.frame(first2 = c("Al2", "Barb"), second2 = c("Jones", "Smith")),
>>
>> data.frame(first3 = c("Al3", "Barbara", "Carol"), second3 = c("Jones",
>> "Smith", "Adams")),
>>
>> data.frame(first4 = ("Al"), second4 = "Jones2"))
>>
>> bind_rows(lapply(employees4List, function(x) rbind.data.frame(c(t(x)))))
>>
>> Thanks.
>>
>> Ira
>>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
David Winsemius
Alameda, CA, USA
'Any technology distinguishable from magic is insufficiently advanced.' -Gehm's Corollary to Clarke's Third Law