[R] To Add a variable from Df1 to Df2 which have a same common variable

Marc Schwartz marc_schwartz at me.com
Fri Sep 19 19:46:23 CEST 2014


On Sep 19, 2014, at 12:15 PM, Arnaud Michel <michel.arnaud at cirad.fr> wrote:

> Hello
> I have the two dataframes Df1 and Df2 which have the common variable AgeSexeCadNCad
> I would like to add the new variable Df2$Pourcent which correspond at the value of Df1$AgeSexeCadNCad.
> Thank you for your help.
> Michel
> 
> Df1 <- structure(list(AgeSexeCadNCad = structure(1:36, .Label = c("60-Femme-Cadre",
> "60-Femme-Non Cadre", "60-Homme-Cadre", "60-Homme-Non Cadre",
> "61-Femme-Cadre", "61-Femme-Non Cadre", "61-Homme-Cadre", "61-Homme-Non Cadre",
> "62-Femme-Cadre", "62-Femme-Non Cadre", "62-Homme-Cadre", "62-Homme-Non Cadre",
> "63-Femme-Cadre", "63-Femme-Non Cadre", "63-Homme-Cadre", "63-Homme-Non Cadre",
> "64-Femme-Cadre", "64-Femme-Non Cadre", "64-Homme-Cadre", "64-Homme-Non Cadre",
> "65-Femme-Cadre", "65-Femme-Non Cadre", "65-Homme-Cadre", "65-Homme-Non Cadre",
> "66-Femme-Cadre", "66-Femme-Non Cadre", "66-Homme-Cadre", "66-Homme-Non Cadre",
> "67-Femme-Cadre", "67-Femme-Non Cadre", "67-Homme-Cadre", "67-Homme-Non Cadre",
> "68-Femme-Cadre", "68-Femme-Non Cadre", "68-Homme-Cadre", "68-Homme-Non Cadre"
> ), class = "factor"), Pourcent = c(0.157849638357511, 0.157849638357511,
> 0.0562149664637629, 0.419279916358023, 0.180720729132166, 0.180720729132166,
> 0.092720981524322, 0.272158156192425, 0.145668562090518, 0.145668562090518,
> 0.101319648271574, 0.159207521192769, 0.0997898095090109, 0.0997898095090109,
> 0.110753346057845, 0.0193586234067497, 0.0795236495990374, 0.0795236495990374,
> 0.18014205547984, 0.00968491550180694, 0.0750838561972432, 0.0750838561972432,
> 0.237072554382218, 0.0650665901855087, 0.0587392216209752, 0.0587392216209752,
> 0.126427289344211, 0.00961707878904615, 0.0409034699088397, 0.0409034699088397,
> 0.0537806700836756, 3.11172383820597e-05, 0.0285360029533433,
> 0.0285360029533433, 0.0220930854712636, 2.20203747900568e-09)), .Names = c("AgeSexeCadNCad",
> "Pourcent"), row.names = c(28L, 19L, 10L, 1L, 29L, 20L, 11L,
> 2L, 30L, 21L, 12L, 3L, 31L, 22L, 13L, 4L, 32L, 23L, 14L, 5L,
> 33L, 24L, 15L, 6L, 34L, 25L, 16L, 7L, 35L, 26L, 17L, 8L, 36L,
> 27L, 18L, 9L), class = "data.frame")
> 
> Df2 <- structure(list(Matricule = c(410, 453, 501, 544, 653, 765, 833,
> 851, 927, 1050, 1074, 1278, 1379, 1428, 359, 379, 408, 417, 424,
> 426, 483, 490, 528, 538, 567, 596, 603, 604, 647, 675, 677, 681,
> 735, 743, 787, 817, 823, 896, 917, 1071, 1144, 1157, 1823, 2497,
> 2868, 3556, 3614, 3632, 3646, 3656, 3660, 4162, 4503, 4711, 5531,
> 330, 447, 467, 546, 627, 637, 780, 892, 1487, 1492, 3324, 4873,
> 409, 415, 441, 579, 619, 697, 716, 719, 728, 737, 807, 832, 989,
> 1299, 1320, 1352, 1427, 1484, 1548, 2447, 2914, 2929, 2941, 3524,
> 3527, 3631, 4324, 400, 572, 1095, 1097, 1105, 2966, 392, 418,
> 440, 457, 466, 472, 488, 491, 506, 533, 543, 547, 552, 553, 920,
> 1034, 1179, 1454, 1485, 1540, 3620, 4672, 13899, 342, 1089, 1208,
> 1234, 2153, 3545, 253, 504, 529, 558, 578, 745, 933, 935, 2099,
> 16785, 356, 460, 634, 959, 1429, 1591, 1720, 3602, 3644, 322,
> 361, 404, 430, 525, 706, 804, 1010, 1012, 1108, 1185, 1294, 2264,
> 3567, 3633, 4990, 264, 298, 352, 388, 503, 508, 691, 1509, 2192,
> 3060, 3683, 877, 1130, 1963, 188, 327, 331, 363, 437, 445, 462,
> 723, 1259, 1381, 3617, 427, 1402, 3624, 141, 256, 308, 377, 414,
> 640, 157, 560), AgeSexeCadNCad = c("60-Femme-Non Cadre", "60-Femme-Non Cadre",
> "60-Femme-Non Cadre", "60-Femme-Non Cadre", "60-Femme-Non Cadre",
> "60-Femme-Non Cadre", "60-Femme-Non Cadre", "60-Femme-Non Cadre",
> "60-Femme-Non Cadre", "60-Femme-Non Cadre", "60-Femme-Non Cadre",
> "60-Femme-Non Cadre", "60-Femme-Non Cadre", "60-Femme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "60-Homme-Non Cadre",
> "60-Homme-Non Cadre", "60-Homme-Non Cadre", "61-Femme-Non Cadre",
> "61-Femme-Non Cadre", "61-Femme-Non Cadre", "61-Femme-Non Cadre",
> "61-Femme-Non Cadre", "61-Femme-Non Cadre", "61-Femme-Non Cadre",
> "61-Femme-Non Cadre", "61-Femme-Non Cadre", "61-Femme-Non Cadre",
> "61-Femme-Non Cadre", "61-Femme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "61-Homme-Non Cadre",
> "61-Homme-Non Cadre", "61-Homme-Non Cadre", "62-Femme-Non Cadre",
> "62-Femme-Non Cadre", "62-Femme-Non Cadre", "62-Femme-Non Cadre",
> "62-Femme-Non Cadre", "62-Femme-Non Cadre", "62-Homme-Non Cadre",
> "62-Homme-Non Cadre", "62-Homme-Non Cadre", "62-Homme-Non Cadre",
> "62-Homme-Non Cadre", "62-Homme-Non Cadre", "62-Homme-Non Cadre",
> "62-Homme-Non Cadre", "62-Homme-Non Cadre", "62-Homme-Non Cadre",
> "62-Homme-Non Cadre", "62-Homme-Non Cadre", "62-Homme-Non Cadre",
> "62-Homme-Non Cadre", "62-Homme-Non Cadre", "62-Homme-Non Cadre",
> "62-Homme-Non Cadre", "62-Homme-Non Cadre", "62-Homme-Non Cadre",
> "62-Homme-Non Cadre", "62-Homme-Non Cadre", "62-Homme-Non Cadre",
> "62-Homme-Non Cadre", "63-Femme-Non Cadre", "63-Femme-Non Cadre",
> "63-Femme-Non Cadre", "63-Femme-Non Cadre", "63-Femme-Non Cadre",
> "63-Femme-Non Cadre", "63-Homme-Non Cadre", "63-Homme-Non Cadre",
> "63-Homme-Non Cadre", "63-Homme-Non Cadre", "63-Homme-Non Cadre",
> "63-Homme-Non Cadre", "63-Homme-Non Cadre", "63-Homme-Non Cadre",
> "63-Homme-Non Cadre", "63-Homme-Non Cadre", "64-Femme-Non Cadre",
> "64-Femme-Non Cadre", "64-Femme-Non Cadre", "64-Femme-Non Cadre",
> "64-Femme-Non Cadre", "64-Femme-Non Cadre", "64-Femme-Non Cadre",
> "64-Femme-Non Cadre", "64-Femme-Non Cadre", "64-Homme-Non Cadre",
> "64-Homme-Non Cadre", "64-Homme-Non Cadre", "64-Homme-Non Cadre",
> "64-Homme-Non Cadre", "64-Homme-Non Cadre", "64-Homme-Non Cadre",
> "64-Homme-Non Cadre", "64-Homme-Non Cadre", "64-Homme-Non Cadre",
> "64-Homme-Non Cadre", "64-Homme-Non Cadre", "64-Homme-Non Cadre",
> "64-Homme-Non Cadre", "64-Homme-Non Cadre", "64-Homme-Non Cadre",
> "65-Homme-Non Cadre", "65-Homme-Non Cadre", "65-Homme-Non Cadre",
> "65-Homme-Non Cadre", "65-Homme-Non Cadre", "65-Homme-Non Cadre",
> "65-Homme-Non Cadre", "65-Homme-Non Cadre", "65-Homme-Non Cadre",
> "65-Homme-Non Cadre", "65-Homme-Non Cadre", "66-Femme-Non Cadre",
> "66-Femme-Non Cadre", "66-Femme-Non Cadre", "66-Homme-Non Cadre",
> "66-Homme-Non Cadre", "66-Homme-Non Cadre", "66-Homme-Non Cadre",
> "66-Homme-Non Cadre", "66-Homme-Non Cadre", "66-Homme-Non Cadre",
> "66-Homme-Non Cadre", "66-Homme-Non Cadre", "66-Homme-Non Cadre",
> "66-Homme-Non Cadre", "67-Homme-Non Cadre", "67-Homme-Non Cadre",
> "67-Homme-Non Cadre", "68-Homme-Non Cadre", "68-Homme-Non Cadre",
> "68-Homme-Non Cadre", "68-Homme-Non Cadre", "68-Homme-Non Cadre",
> "68-Homme-Non Cadre", "69-Homme-Non Cadre", "69-Homme-Non Cadre"
> )), .Names = c("Matricule", "AgeSexeCadNCad"), class = "data.frame", row.names = c("37",
> "58", "79", "104", "163", "220", "263", "276", "333", "422",
> "442", "587", "653", "684", "21", "25", "35", "42", "45", "47",
> "73", "76", "93", "100", "118", "133", "137", "138", "158", "174",
> "176", "179", "204", "208", "231", "249", "254", "312", "325",
> "439", "491", "500", "825", "928", "954", "1093", "1116", "1128",
> "1136", "1141", "1143", "1212", "1232", "1270", "1396", "14",
> "56", "66", "106", "148", "153", "226", "308", "717", "720",
> "1046", "1287", "36", "41", "54", "124", "144", "188", "197",
> "198", "201", "206", "242", "262", "377", "598", "611", "633",
> "683", "714", "742", "919", "980", "993", "1000", "1071", "1073",
> "1127", "1223", "32", "121", "456", "458", "462", "1013", "27",
> "43", "53", "59", "65", "67", "75", "77", "83", "97", "103",
> "107", "109", "110", "328", "412", "516", "698", "715", "740",
> "1122", "1267", "1824", "16", "452", "540", "557", "870", "1086",
> "5", "82", "94", "115", "123", "209", "339", "341", "862", "2211",
> "20", "61", "152", "358", "685", "760", "803", "1111", "1134",
> "11", "22", "33", "49", "92", "193", "241", "394", "396", "463",
> "522", "595", "896", "1097", "1129", "1302", "7", "9", "18",
> "26", "81", "85", "185", "728", "884", "1029", "1155", "297",
> "479", "842", "3", "13", "15", "23", "51", "55", "63", "199",
> "574", "655", "1119", "48", "668", "1125", "1", "6", "10", "24",
> "40", "154", "2", "117"))


Hi,

Thanks for including data.

See ?merge, which performs an SQL-like join.

Since you have non-matching values between Df1 and Df2, you will need to decide if you want non-matching rows included in the resultant data frame or not (eg. a right/left outer or inner join). See the all, all.x and all.y arguments to merge(). 

The default (all = FALSE) is an inner join on matching rows only:

  Df3 <- merge(Df1, Df2, by = "AgeSexeCadNCad")

If you include non-matching values in the resultant data frame (eg. all = TRUE), Pourcent will contains NA's in those rows.

Regards,

Marc Schwartz



More information about the R-help mailing list