[R] Creating one df from 85 df present in a list
Rasmus Liland
jr@| @end|ng |rom po@teo@no
Sat Jun 13 01:54:17 CEST 2020
On 2020-06-10 13:14 -0700, Bert Gunter wrote:
> On Wed, Jun 10, 2020 at 11:48 AM Alejandro Ureta wrote:
> >
> > hi, I am trying to fuse (cbind, merge...
> > NOT rbind) several dataframes with
> > different numbers of rows, all df
> > included in a list, and using the code
> > extract shown below. The function merge()
> > works well with two df but not more than
> > two...I have 85 dataframes to join in
> > this way (85 df in the list)....could you
> > please let me know how to get all 85 df
> > merged ?,,,,, thanks
> >
> > fusion_de_tablas = merge(red_tablas_por_punto[["1 - Bv.Artigas y la Rambla
> > (Terminal CUTCSA)"]],
> > red_tablas_por_punto[["10 - Avenida Millán 2515 (Hospital Vilardebó)"]],
> > red_tablas_por_punto[["100 - Fauquet 6358 (Hospital Saint Bois)"]],
> > by= 'toma_de_muestras', all = T )
>
> ?do.call -- takes a list of arguments to a function
> ... as in
> do.call(merge, yourlist) ## or similar perhaps
Dear Alejandro,
it would be easier to help you if you
provided an example of what
fusion_de_tablas should look like.
Here is a small example of combining some
unequally sized dataframes that have some
common and some differently named columns.
red_tablas_por_punto <-
  list(
    "1 - Bv.Artigas y la Rambla (Terminal CUTCSA)" =
      data.frame("a"=1:3,
                 "b"=4:6,
                 "c"=4:6,
                 'toma_de_muestras'=1),
    "10 - Avenida Millán 2515 (Hospital Vilardebó)" =
      data.frame("d"=4:8,
                 "b"=8:12,
                 'toma_de_muestras'=7),
    "100 - Fauquet 6358 (Hospital Saint Bois)" =
      data.frame("e"=100:101,
                 "a"=85:86,
                 'toma_de_muestras'=4)
  )
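# cn holds every column name that occurs in any of the dataframes
unified.df <- lapply(names(red_tablas_por_punto),
  function(tabla, cn) {
    x <- red_tablas_por_punto[[tabla]]
    # add the columns this dataframe lacks, filled with NA
    x[,cn[!(cn %in% colnames(x))]] <- NA
    # put the columns in the same order everywhere
    x <- x[,cn]
    # remember which table each row came from
    x$tabla <- tabla
    return(x)
  }, cn=unique(unlist(lapply(red_tablas_por_punto, colnames))))
# stack the now identically shaped dataframes
unified.df <- do.call(rbind, unified.df)
unified.df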
which yields
    a  b  c toma_de_muestras  d   e                                         tabla
1   1  4  4                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
2   2  5  5                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
3   3  6  6                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
4  NA  8 NA                7  4  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
5  NA  9 NA                7  5  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
6  NA 10 NA                7  6  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
7  NA 11 NA                7  7  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
8  NA 12 NA                7  8  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
9  85 NA NA                4 NA 100      100 - Fauquet 6358 (Hospital Saint Bois)
10 86 NA NA                4 NA 101      100 - Fauquet 6358 (Hospital Saint Bois)
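The pad-with-NA-then-rbind steps above can also be wrapped in a small function, so they do not depend on the global name of the list. This is only a sketch: rbind_fill and the toy list are hypothetical names made up for illustration, and it assumes the list is named.

```r
# Hypothetical helper, not from the thread: give every dataframe the same
# columns (NA where a column is absent), tag rows with the list name, stack.
rbind_fill <- function(dfs) {
  all_cols <- unique(unlist(lapply(dfs, colnames)))
  padded <- Map(function(x, nm) {
    # the loop simply does nothing when no column is missing
    for (col in setdiff(all_cols, colnames(x))) x[[col]] <- NA
    x <- x[all_cols]
    x$tabla <- nm
    x
  }, dfs, names(dfs))
  out <- do.call(rbind, padded)
  rownames(out) <- NULL  # drop the list-name based row names
  out
}

# Toy input with made-up names and values
toy <- list(
  p1 = data.frame(a = 1:3, b = 4:6),
  p2 = data.frame(b = 7:8, d = 9:10)
)
stacked <- rbind_fill(toy)
```

Calling rbind_fill(red_tablas_por_punto) should then do the same job for the full list.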
I also found [1] that you could use merge,
as you tried, together with Reduce:
Reduce(function(x, y)
  merge(x, y, by='toma_de_muestras', all=TRUE),
  red_tablas_por_punto)
which yields
   toma_de_muestras a.x b.x  c  d b.y   e a.y
1             10001   1   4  4 NA  NA  NA  NA
2             10002   2   5  5 NA  NA  NA  NA
3             10003   3   6  6 NA  NA  NA  NA
4             10004  NA  NA NA  4   8  NA  NA
5             10005  NA  NA NA  5   9  NA  NA
6             10006  NA  NA NA  6  10  NA  NA
7             10007  NA  NA NA  7  11  NA  NA
8             10008  NA  NA NA  8  12  NA  NA
9             10009  NA  NA NA NA  NA 100  85
10            10010  NA  NA NA NA  NA 101  86
where the partly shared “a” column is not
unified (merge keeps it as a.x and a.y) ...
thus, I like my initial step-by-step
apply-based solution better ...
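If a wide, cbind-like result is what you want, one way to avoid the .x/.y suffixes is to prefix every non-key column with its table name before merging. merge() only takes two dataframes at a time, which is why Reduce is used to fold it over the list. This is only a sketch: the list tablas, its names and its values are made up, and it assumes each toma_de_muestras value occurs at most once per table (otherwise merge returns every combination of the matching rows).

```r
# Toy stand-in for red_tablas_por_punto (names and values are made up)
tablas <- list(
  p1 = data.frame(toma_de_muestras = 1:3, a = 1:3, b = 4:6),
  p2 = data.frame(toma_de_muestras = 2:4, b = 8:10),
  p3 = data.frame(toma_de_muestras = c(1L, 4L), a = 85:86)
)
key <- "toma_de_muestras"

# Prefix every column except the key with the name of its table,
# so no two tables share a non-key column name
renamed <- Map(function(x, nm) {
  other <- colnames(x) != key
  colnames(x)[other] <- paste(nm, colnames(x)[other], sep = ".")
  x
}, tablas, names(tablas))

# Fold the two-table merge over the whole list, keeping unmatched rows
wide <- Reduce(function(x, y) merge(x, y, by = key, all = TRUE), renamed)
wide
```

Each table then contributes its own clearly named columns, with one row per toma_de_muestras value.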
Best,
Rasmus
[1] https://stackoverflow.com/questions/22644780/merging-multiple-csv-files-in-r-using-do-call