[R] applying a set of rules to each row
Bert Gunter
gunter.berton at gene.com
Wed Jan 26 21:31:14 CET 2011
If I understand you correctly, you want ?ifelse, which works on the
full logical vectors of rules applied to the variables, not
if....else, which works on only a single logical.
-- Bert Gunter
On Wed, Jan 26, 2011 at 12:18 PM, KATSCHKE, ADRIAN CIV DFAS
<ADRIAN.KATSCHKE at dfas.mil> wrote:
> All,
>
> I would like to apply a set of rules to each row of the sample data set
> below. The rule sets are the guidelines for determining an individual's
> date for retirement eligibility. The rules are found in this document,
> http://www.opm.gov/feddata/RetirementPaperFinal_v4.pdf. I am only
> interested in the top two categories for retirement eligibility, the
> CSRS and FERS plans.
>
> The data set has four variables Date of Birth (DOB), service computation
> date (srvCompDT), retirement plan (retirePlan), and the age at which the
> employee entered federal service (ageFedStart). The service computation
> date is used to compute the date eligible for retirement. The retirement
> plan indicates what system the employee is enrolled under.
>
> The data does contain a few other retirement plans, for now I want to
> just ignore those plans. I have labeled plans as 1-CSRS and 2-FERS, and
> 3-Other. My first attempt at applying the rules was through a complex
> nesting of ifelse statements, this was not very successful and quite
> difficult to follow. I then wrote a function and tried using "apply"
> unsuccessfully. The function is shown below.
>
> I would like to put a short script or function together that would allow
> for an efficient application of the rules to each of the employees. I am
> trying to avoid a loop, because my data set is quite large, and I may
> need to update my data set regularly and re-run the analysis and reports
> that will come from this work.
>
> Any advice or guidance on building the function or code to apply the
> rules would be quite helpful.
>
> retireHelp <-
> structure(list(DOB = structure(c(-6642, -5134, -3444, -5598,
> -4356, 5737, -4894, -1951, -2950, 2467, 6945, 4908, -7930, -7236,
> -7727, -77, 4158, -7892, -6028, -7132, -5959, 2309, -2494, -3513,
> -383, -216, -3369, -5861, 3674, -10265, -8986, -5023, -4862,
> 1526, -1022, 2175, -11790, -278, -7275, -5084, -1842, 430, -2220,
> -7444, 440, 4285, -7812, 3335, -7271, -6825, -1098, -1670, -10219,
> -7131, 5963, 704, -7662, 4219, -2813, 5147, -7334, -8223, -5922,
> -7497, -9276, -1291, -11640, -5631, 518, -7268, -2105, -5901,
> -690, -8146, -7059, 133, 1176, -6091, -2895, -6020, -4724, -3616,
> -5059, -8253, -2604, -12400, -4776, -3671, -9326, -7000, -5574,
> -3248, 4255, -1358, -6255, 8, -7115, -1701, -5227, 9, -517, -8674,
> -2554, -4069, -2077, -9872, -6534, 2970, -8307, -3020, -1343,
> -8897, -2304, -7424, 2078, -8274, -5559, -8888, -9262, -8473,
> -4088, -2429, -8006, -1091, 5015, 2765, 4036, 3101, -3743, 5103,
> -10018, -12095, -7646, -5966, -6208, -5784, -1325, -4288, -1665,
> -1409, 4685, -7881, -3413, 2738, -2201, 1217, -5113, 206, -1292,
> -1725, 10, -2978, -1895, -830, -105, -2395, -3496, -8244, -9956,
> -6494, -4678, -4077, 575, 2013, -3411, 3824, -4356, 4523, -5836,
> -6350, -5337, -41, -2001, -6632, -970, -6790, -2828, -4061, 476,
> 5854, -9648, -4227, 850, 2619, -7747, -2672, 4069, -12618, -6898,
> -4178, -1772, -1643, -2064, -157, 4551, -8688, -6087, -2040,
> -7239, -783), format = "m/d/y", origin = structure(c(1, 1, 1970
> ), .Names = c("month", "day", "year")), class = c("dates", "times"
> )), srvCompDT = structure(c(743, 12429, 3585, 4364, 13227, 13578,
> 13591, 8585, 9587, 13913, 14753, 13247, 2246, 1439, 8845, 7018,
> 12625, -552, 5688, 7080, 13255, 13549, 12709, 13969, 13997, 9532,
> 13689, 1226, 13549, 4093, 13423, 13801, 3181, 14809, 13353, 9457,
> 7745, 8986, 4759, 4486, 6449, 11172, 8669, 3344, 13745, 12275,
> 5081, 13605, 8006, 3048, 6330, 13521, 5254, 1733, 14095, 8516,
> 4848, 13521, 5970, 14697, 8291, 139, 11435, 3567, 8961, 5775,
> 3602, 1409, 11577, 12163, 12258, 13156, 9472, 7963, 1362, 10332,
> 9557, 3997, 7509, 4691, 3133, 5877, 6782, 11449, 13283, 8040,
> 11565, 3425, 7860, 1790, 10778, 13199, 12625, 5889, 3317, 9831,
> 1068, 8040, 7123, 9104, 12836, 7928, 12764, 8922, 5324, -1004,
> 1806, 10263, 5635, 10310, 5625, 8861, 14613, 3896, 10316, 5725,
> 12751, 6113, 2997, 112, 5707, 4987, -1018, 8055, 13885, 13073,
> 14585, 14865, 14935, 14390, 9735, 7654, 4557, 661, 1638, 1112,
> 14011, 3086, 7032, 13942, 13325, 6735, 13900, 12673, 10148, 14193,
> 14767, 8447, 6114, 10688, 13544, 7106, 8587, 14753, 7886, 12280,
> 11946, 13662, 3332, 2108, 13977, 6203, 8369, 13857, 8369, 11486,
> 8306, 12466, 12639, 7270, 4325, 13843, 14026, 14039, 6147, 7676,
> 5781, 7038, 9187, 14640, 6174, 11491, 13913, 13787, 13465, 8854,
> 13152, 1826, 1412, 4317, 5794, 5548, 8951, 12947, 12639, 5345,
> 5961, 4637, 6465, 13717), format = "m/d/y", origin = structure(c(1,
> 1, 1970), .Names = c("month", "day", "year")), class = c("dates",
> "times")), retirePlan = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 1, 3, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 1,
> 2, 2, 2, 2, 3, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1,
> 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 3, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1,
> 3, 2, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2,
> 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2,
> 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 1, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2),
> ageFedStart = c(20.22, 48.08, 19.24, 27.27, 48.14, 21.47,
> 50.61, 28.85, 34.32, 31.34, 21.38, 22.83, 27.86, 23.75, 45.37,
> 19.43, 23.18, 20.1, 32.08, 38.91, 52.61, 30.77, 41.62, 47.86,
> 39.37, 26.69, 46.7, 19.4, 27.04, 39.31, 61.35, 51.54, 22.02,
> 36.37, 39.36, 19.94, 53.48, 25.36, 32.95, 26.2, 22.7, 29.41,
> 29.81, 29.54, 36.43, 21.88, 35.3, 28.12, 41.83, 27.03, 20.34,
> 41.59, 42.36, 24.27, 22.26, 21.39, 34.25, 25.47, 24.05, 26.15,
> 42.78, 22.89, 47.52, 30.29, 49.93, 19.35, 41.73, 19.27, 30.28,
> 53.2, 39.32, 52.18, 27.82, 44.1, 23.06, 27.92, 22.95, 27.62,
> 28.48, 29.33, 21.51, 25.99, 32.42, 53.94, 43.5, 55.96, 44.74,
> 19.43, 47.05, 24.07, 44.77, 45.03, 22.92, 19.84, 26.21, 26.89,
> 22.4, 26.67, 33.81, 24.9, 36.56, 45.45, 41.94, 35.57, 20.26,
> 24.28, 22.83, 19.97, 38.17, 36.5, 19.08, 48.62, 46.32, 30.99,
> 22.55, 38.33, 50.13, 41.07, 33.56, 23.5, 26.82, 20.3, 19.13,
> 25.04, 24.28, 28.22, 28.88, 32.21, 51.14, 25.43, 54.08, 54.07,
> 33.41, 18.14, 21.48, 18.88, 41.99, 20.19, 23.81, 42.03, 23.66,
> 40.02, 47.4, 27.2, 33.81, 35.53, 54.43, 22.56, 20.28, 33.98,
> 37.05, 27.61, 28.7, 42.66, 21.88, 40.18, 42.28, 59.98, 36.38,
> 23.55, 51.07, 28.15, 21.34, 32.43, 32.25, 20.98, 34.67, 21.75,
> 50.58, 37.29, 26.45, 38.01, 43.88, 56.59, 19.49, 39.61, 23.57,
> 30.39, 23.85, 24.05, 43.32, 43.03, 35.76, 30.58, 58.08, 31.56,
> 24.87, 39.55, 22.75, 23.26, 20.71, 19.69, 30.16, 35.88, 22.14,
> 38.42, 32.99, 18.28, 37.52, 39.7)), .Names = c("DOB", "srvCompDT",
> "retirePlan", "ageFedStart"), row.names = c(NA, 200L), class =
> "data.frame")
>
> rrDT <- function(retSys, ageFedStart, birthDT, serviceCompDT){
> if(retSys == "CSRS") {
> if(ageFedStart < 25) rtDT <- dates(birthDT+(365.25*55))
> else if (ageFedStart >= 25 & ageFedStart < 30) rtDT <-
> dates(serviceCompDT+(365.25*30))
> else if (ageFedStart >= 30 & ageFedStart < 40) rtDT <-
> dates(birthDT+(365.25*60))
> else if (ageFedStart >= 40 & ageFedStart < 45) rtDT <-
> dates(serviceCompDT+(365.25*20))
> else if (ageFedStart >= 45 & ageFedStart < 60) rtDT <-
> dates(birthDT+(365.25*65))
> else if (ageFedStart >= 60) rtDT <-
> dates(serviceCompDT+(365.25*5))
> else rtDT <- NA
> }
> else if (retSys == "FERS") {
> if (birthDT < "01/01/53") {
> if(ageFedStart < 25) rtDT <- dates(birthDT+(365.25*55))
> else if (ageFedStart >= 25 & ageFedStart < 30) rtDT <-
> dates(serviceCompDT+(365.25*30))
> else if (ageFedStart >= 30 & ageFedStart < 40) rtDT <-
> dates(birthDT+(365.25*60))
> else if (ageFedStart >= 40 & ageFedStart < 42) rtDT <-
> dates(serviceCompDT+(365.25*20))
> else if (ageFedStart >= 42 & ageFedStart < 57) rtDT <-
> dates(birthDT+(365.25*62))
> else if (ageFedStart >= 57) rtDT <-
> dates(serviceCompDT+(365.25*5))
> else rtDT <- NA
> }
> else if (birthDT >= "01/01/53" & birthDT < "01/01/70") {
> if(ageFedStart < 26) rtDT <- dates(birthDT+(365.25*56))
> else if (ageFedStart >= 27 & ageFedStart < 30) rtDT <-
> dates(serviceCompDT+(365.25*30))
> else if (ageFedStart >= 30 & ageFedStart < 40) rtDT <-
> dates(birthDT+(365.25*60))
> else if (ageFedStart >= 40 & ageFedStart < 42) rtDT <-
> dates(serviceCompDT+(365.25*20))
> else if (ageFedStart >= 42 & ageFedStart < 57) rtDT <-
> dates(birthDT+(365.25*62))
> else if (ageFedStart >= 57) rtDT <-
> dates(serviceCompDT+(365.25*5))
> else rtDT <- NA
> }
> else if (birthDT >= "01/01/70"){
> if(ageFedStart < 27) rtDT <- dates(birthDT+(365.25*56))
> else if (ageFedStart >= 27 & ageFedStart < 30) rtDT <-
> dates(serviceCompDT+(365.25*30))
> else if (ageFedStart >= 30 & ageFedStart < 40) rtDT <-
> dates(birthDT+(365.25*60))
> else if (ageFedStart >= 40 & ageFedStart < 42) rtDT <-
> dates(serviceCompDT+(365.25*20))
> else if (ageFedStart >= 42 & ageFedStart < 57) rtDT <-
> dates(birthDT+(365.25*62))
> else if (ageFedStart >= 57) rtDT <-
> dates(serviceCompDT+(365.25*5))
> else rtDT <- NA
> }
> }
> else rtDT <- NA
> return(rtDT)
> }
>
> Adrian R. Katschke
> Data Analytics Specialist
> Human Capital Program Office
> Human Resources
> PH: 317-212-7813
> DSN: 699-7813
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Bert Gunter
Genentech Nonclinical Biostatistics
More information about the R-help
mailing list