[R] To build a new Df from 2 Df

Wed Oct 15 08:51:16 CEST 2014

Thank you, David.
Now the problem is to list all the combinations that satisfy
condition III (i.e. every Rapporteur must have roughly the same
number of Demandeurs).
Do you have any ideas?
Michel
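
A possible starting point for condition III, offered as a sketch and not as a definitive answer: the number of valid combinations is probably far too large to list, so the code below only looks for one balanced assignment. It treats each Rapporteur as `cap` interchangeable slots and fills them with a standard augmenting-path bipartite matching. The helper `balanced_assign()` and the names `ok_dep`, `ok_uni`, `r1` and `r2` are made up for this example. It assumes the number of Demandeurs is a multiple of the number of Rapporteurs (20 and 5 here); otherwise the cap only bounds the maximum load and the balance should be checked afterwards.

```r
# Data from the thread, written compactly (character columns, no factors).
Dem <- data.frame(
  Nom = c("John", "Jim", "Julie", "Charles", "Michel", "Emma", "Sandra",
          "Elodie", "Thierry", "Albert", "Jean", "Francois", "Pierre", "Cyril",
          "Damien", "Jean-Michel", "Vincent", "Daniel", "Yvan", "Catherine"),
  Departement = c("D", "A", "A", "C", "D", "B", "D", "B", "C", "D",
                  "B", "B", "B", "A", "C", "D", "B", "A", "D", "D"),
  Unite = paste0("Unite", c(8, 4, 4, 7, 9, 1, 6, 5, 7, 3,
                            2, 6, 8, 8, 3, 8, 9, 7, 9, 5)),
  stringsAsFactors = FALSE
)
Rap <- data.frame(
  Nom = sprintf("Rapporteur%02d", 1:5),
  Departement = c("C", "D", "C", "C", "D"),
  Unite = paste0("Unite", c(10, 6, 5, 5, 4)),
  stringsAsFactors = FALSE
)

# Hypothetical helper. `eligible` is a logical matrix (Demandeur x Rapporteur);
# each Rapporteur may be used at most `cap` times.
# Returns one Rapporteur index per Demandeur, or stops if none exists.
balanced_assign <- function(eligible, cap) {
  slot_rap   <- rep(seq_len(ncol(eligible)), each = cap)  # slot -> Rapporteur
  slot_owner <- rep(NA_integer_, length(slot_rap))         # slot -> Demandeur
  visited    <- logical(length(slot_rap))

  # Try to give Demandeur i a slot, moving earlier Demandeurs if needed.
  augment <- function(i) {
    for (s in which(eligible[i, slot_rap])) {
      if (visited[s]) next
      visited[s] <<- TRUE
      if (is.na(slot_owner[s]) || augment(slot_owner[s])) {
        slot_owner[s] <<- i
        return(TRUE)
      }
    }
    FALSE
  }

  for (i in seq_len(nrow(eligible))) {
    visited[] <- FALSE
    if (!augment(i)) stop("no assignment satisfies the constraints")
  }
  filled <- !is.na(slot_owner)
  out <- integer(nrow(eligible))
  out[slot_owner[filled]] <- slot_rap[filled]
  out
}

cap    <- ceiling(nrow(Dem) / nrow(Rap))
ok_dep <- outer(Dem$Departement, Rap$Departement, "!=")  # condition I
ok_uni <- outer(Dem$Unite, Rap$Unite, "!=")              # condition II

r1 <- balanced_assign(ok_dep, cap)
ok_uni[cbind(seq_along(r1), r1)] <- FALSE  # Rapporteur2 must differ from Rapporteur1
r2 <- balanced_assign(ok_uni, cap)

dfnew <- data.frame(Demandeur = Dem$Nom,
                    Rapporteur1 = Rap$Nom[r1],
                    Rapporteur2 = Rap$Nom[r2])
print(table(dfnew$Rapporteur1))
print(table(dfnew$Rapporteur2))
```

Shuffling the row order of Dem before calling the helper should give other valid assignments, if several are wanted.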

On 14/10/2014 13:18, David.Kaethner at dlr.de wrote:
> Hello,
>
> here's a draft of a solution. I hope it's not overly complicated.
>
> # find all possible combinations
> combi <- expand.grid(Dem$Nom, Rap$Nom); names(combi) <- c("Dem", "Rap")
>
> # we need the corresponding departments and units
> combi$DemDep <- apply(combi, 1, function(x) Dem$Departement[x[1] == Dem$Nom])
> combi$DemUni <- apply(combi, 1, function(x) Dem$Unite[x[1] == Dem$Nom])
> combi$RapDep <- apply(combi, 1, function(x) Rap$Departement[x[2] == Rap$Nom])
> combi$RapUni <- apply(combi, 1, function(x) Rap$Unite[x[2] == Rap$Nom])
>
> # we exclude the combinations that we don't want
> dep <- combi[combi$DemDep != combi$RapDep, c("Dem", "Rap")]
> dep$id <- as.numeric(dep$Rap)
> uni <- combi[combi$DemUni != combi$RapUni, c("Dem", "Rap")]
> uni$id <- as.numeric(uni$Rap)
>
> # preliminary result
> resDep <- reshape(dep,
>          timevar = "id",
>          idvar = "Dem",
>          direction = "wide"
> )
>
> resUni <- reshape(uni,
>                    timevar = "id",
>                    idvar = "Dem",
>                    direction = "wide"
> )
>
> In resDep and resUni you find the results for Rapporteur1 and Rapporteur2. NAs indicate where conditions did not match. For Rap1/Rap2 you can now choose any column from resDep and resUni that is not NA for that specific Demandeur. I wasn't exactly sure about your third condition, so I'll leave that to you. But with the complete possible matches, you have a more general solution.
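
The same candidate lists can also be read off as one vector per Demandeur instead of a wide table with NAs. The sketch below is only an illustration on a toy subset of the data (four Demandeurs, three Rapporteurs); the names `d`, `r`, `candRap1` and `candRap2` are invented for the example. It uses `merge(..., by = NULL)` to build the cross join with the department and unit columns already attached, then `split()` to group the admissible Rapporteurs.

```r
# Toy subset of the thread's data (assumption: small enough to inspect by eye).
Dem <- data.frame(Nom = c("John", "Jim", "Julie", "Charles"),
                  Departement = c("D", "A", "A", "C"),
                  Unite = c("Unite8", "Unite4", "Unite4", "Unite7"),
                  stringsAsFactors = FALSE)
Rap <- data.frame(Nom = c("Rapporteur01", "Rapporteur02", "Rapporteur05"),
                  Departement = c("C", "D", "D"),
                  Unite = c("Unite10", "Unite6", "Unite4"),
                  stringsAsFactors = FALSE)

# Rename the columns so that both sides stay distinguishable after the join.
d <- setNames(Dem, c("Dem", "DemDep", "DemUni"))
r <- setNames(Rap, c("Rap", "RapDep", "RapUni"))

# by = NULL gives the Cartesian product: one row per (Demandeur, Rapporteur).
combi <- merge(d, r, by = NULL)

# Keep the admissible pairs and group the Rapporteurs by Demandeur.
okDep <- combi$DemDep != combi$RapDep
okUni <- combi$DemUni != combi$RapUni
candRap1 <- split(as.character(combi$Rap[okDep]), combi$Dem[okDep])
candRap2 <- split(as.character(combi$Rap[okUni]), combi$Dem[okUni])

print(candRap1)
print(candRap2)
```

`candRap1[["John"]]` then holds the Rapporteurs whose department differs from John's, and `candRap2[["John"]]` those whose unit differs.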
>
> Btw, you can construct data.frames just like this:
>
> Dem <- data.frame(
>    Nom = c("John", "Jim", "Julie", "Charles", "Michel", "Emma", "Sandra", "Elodie", "Thierry", "Albert", "Jean", "Francois", "Pierre", "Cyril", "Damien", "Jean-Michel", "Vincent", "Daniel", "Yvan", "Catherine"),
>    Departement = c("D", "A", "A", "C", "D", "B", "D", "B", "C", "D", "B", "B", "B", "A", "C", "D", "B", "A", "D", "D"),
>    Unite = c("Unite8", "Unite4", "Unite4", "Unite7", "Unite9", "Unite1", "Unite6", "Unite5", "Unite7", "Unite3", "Unite2", "Unite6", "Unite8", "Unite8", "Unite3", "Unite8", "Unite9", "Unite7", "Unite9", "Unite5")
> )
>
> -dk
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Arnaud Michel
> Sent: Tuesday, 14 October 2014 10:46
> To: r-help at r-project.org
> Subject: [R] To build a new Df from 2 Df
>
> Hello
>
> I have two data frames, Dem and Rap.
> I would like to build all the data frames (dfnew) that can be obtained by associating these two data frames (Dem and Rap) in the following way:
>
> For each value of Dem$Nom (dfnew$Demandeur), I associate 2 different values of Rap$Nom (dfnew$Rapporteur1 and dfnew$Rapporteur2) such that
>
>    * for each dfnew$Demandeur, dfnew$Rapporteur1 does not have the same
>      value for Departement as Dem$Departement
>    * for each dfnew$Demandeur, dfnew$Rapporteur2 does not have the same
>      value for Unite as Dem$Unite
>    * the counts in table(dfnew$Rapporteur1) and in
>      table(dfnew$Rapporteur2) must be balanced, i.e. differ by at
>      most 1 between Rapporteurs
>
> table(dfnew$Rapporteur1)
> Rapporteur01 Rapporteur02 Rapporteur03 Rapporteur04 Rapporteur05
>            4            4            4            4            4
>
> Thanks for your help
> Michel
>
>    Dem <- structure(list(Nom = c("John", "Jim", "Julie", "Charles", "Michel", "Emma", "Sandra", "Elodie", "Thierry", "Albert", "Jean", "Francois", "Pierre", "Cyril", "Damien", "Jean-Michel", "Vincent", "Daniel", "Yvan", "Catherine"), Departement = c("D", "A", "A", "C", "D", "B", "D", "B", "C", "D", "B", "B", "B", "A", "C", "D", "B", "A", "D", "D"), Unite = c("Unite8", "Unite4", "Unite4", "Unite7", "Unite9", "Unite1", "Unite6", "Unite5", "Unite7", "Unite3", "Unite2", "Unite6", "Unite8", "Unite8", "Unite3", "Unite8", "Unite9", "Unite7", "Unite9", "Unite5")), .Names = c("Nom", "Departement", "Unite"
> ), row.names = c(NA, -20L), class = "data.frame")
>
> Rap <- structure(list(Nom = c("Rapporteur01", "Rapporteur02", "Rapporteur03", "Rapporteur04", "Rapporteur05"), Departement = c("C", "D", "C", "C", "D"), Unite = c("Unite10", "Unite6", "Unite5", "Unite5", "Unite4")), .Names = c("Nom", "Departement", "Unite"), row.names = c(NA, -5L), class = "data.frame")
>
> dfnew <- structure(list(Demandeur = structure(c(13L, 12L, 14L, 3L, 15L, 8L, 17L, 7L, 18L, 1L, 10L, 9L, 16L, 4L, 5L, 11L, 19L, 6L, 20L, 2L), .Label = c("Albert", "Catherine", "Charles", "Cyril", "Damien", "Daniel", "Elodie", "Emma", "Francois", "Jean", "Jean-Michel", "Jim", "John", "Julie", "Michel", "Pierre", "Sandra", "Thierry", "Vincent", "Yvan"), class = "factor"), Rapporteur1 = structure(c(3L, 1L, 3L, 5L, 1L, 5L, 1L, 2L, 5L, 4L, 2L, 4L, 2L, 3L, 5L, 4L, 4L, 2L, 3L, 1L), .Label = c("Rapporteur01", "Rapporteur02", "Rapporteur03", "Rapporteur04", "Rapporteur05"), class = "factor"), Rapporteur2 = structure(c(1L, 3L, 4L, 4L, 2L, 4L, 5L, 1L, 2L, 3L, 3L, 3L, 5L, 5L, 1L, 1L, 2L, 5L, 4L, 2L), .Label = c("Rapporteur01", "Rapporteur02", "Rapporteur03", "Rapporteur04", "Rapporteur05"), class = "factor")), .Names = c("Demandeur", "Rapporteur1", "Rapporteur2"), row.names = c(NA, -20L), class =
> "data.frame")
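
The conditions stated above can be checked mechanically on any candidate dfnew. The following is a minimal sketch: `check_dfnew()` and its `tol` argument are hypothetical names, and the data are the ones from this thread, rebuilt compactly as character columns so that the block runs on its own. Rapporteur1 and Rapporteur2 are given by their position in Rap, in the same order as the example dfnew.

```r
Dem <- data.frame(
  Nom = c("John", "Jim", "Julie", "Charles", "Michel", "Emma", "Sandra",
          "Elodie", "Thierry", "Albert", "Jean", "Francois", "Pierre", "Cyril",
          "Damien", "Jean-Michel", "Vincent", "Daniel", "Yvan", "Catherine"),
  Departement = c("D", "A", "A", "C", "D", "B", "D", "B", "C", "D",
                  "B", "B", "B", "A", "C", "D", "B", "A", "D", "D"),
  Unite = paste0("Unite", c(8, 4, 4, 7, 9, 1, 6, 5, 7, 3,
                            2, 6, 8, 8, 3, 8, 9, 7, 9, 5)),
  stringsAsFactors = FALSE
)
Rap <- data.frame(
  Nom = sprintf("Rapporteur%02d", 1:5),
  Departement = c("C", "D", "C", "C", "D"),
  Unite = paste0("Unite", c(10, 6, 5, 5, 4)),
  stringsAsFactors = FALSE
)
dfnew <- data.frame(
  Demandeur = Dem$Nom,
  Rapporteur1 = Rap$Nom[c(3, 1, 3, 5, 1, 5, 1, 2, 5, 4,
                          2, 4, 2, 3, 5, 4, 4, 2, 3, 1)],
  Rapporteur2 = Rap$Nom[c(1, 3, 4, 4, 2, 4, 5, 1, 2, 3,
                          3, 3, 5, 5, 1, 1, 2, 5, 4, 2)],
  stringsAsFactors = FALSE
)

# Hypothetical checker: returns one logical per condition.
# tol is the accepted difference between the counts per Rapporteur.
check_dfnew <- function(dfnew, Dem, Rap, tol = 1) {
  d  <- Dem[match(dfnew$Demandeur, Dem$Nom), ]
  r1 <- Rap[match(dfnew$Rapporteur1, Rap$Nom), ]
  r2 <- Rap[match(dfnew$Rapporteur2, Rap$Nom), ]
  # Count over all Rapporteurs so that unused ones show up as 0.
  n1 <- table(factor(dfnew$Rapporteur1, levels = Rap$Nom))
  n2 <- table(factor(dfnew$Rapporteur2, levels = Rap$Nom))
  c(dep_differs = all(d$Departement != r1$Departement),
    uni_differs = all(d$Unite != r2$Unite),
    distinct    = all(r1$Nom != r2$Nom),
    balanced1   = diff(range(n1)) <= tol,
    balanced2   = diff(range(n2)) <= tol)
}

res <- check_dfnew(dfnew, Dem, Rap)
print(res)
```

A FALSE entry in the result names the condition that a candidate dfnew violates.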
>
>
> --
> Michel ARNAUD
> Cirad Montpellier

-- 
Michel ARNAUD
Cirad Montpellier