[R] applying a set of rules to each row

KATSCHKE, ADRIAN CIV DFAS ADRIAN.KATSCHKE at DFAS.MIL
Wed Jan 26 21:18:40 CET 2011


All,

I would like to apply a set of rules to each row of the sample data set
below. The rule sets are the guidelines for determining an individual's
date for retirement eligibility. The rules are found in this document,
http://www.opm.gov/feddata/RetirementPaperFinal_v4.pdf. I am only
interested in the top two categories for retirement eligibility, the
CSRS and FERS plans. 

The data set has four variables Date of Birth (DOB), service computation
date (srvCompDT), retirement plan (retirePlan), and the age at which the
employee entered federal service (ageFedStart). The service computation
date is used to compute the date eligible for retirement. The retirement
plan indicates what system the employee is enrolled under.

The data does contain a few other retirement plans, for now I want to
just ignore those plans. I have labeled plans as 1-CSRS and 2-FERS, and
3-Other. My first attempt at applying the rules was through a complex
nesting of ifelse statements, this was not very successful and quite
difficult to follow. I then wrote a function and tried using "apply"
unsuccessfully. The function is shown below.

I would like to put a short script or function together that would allow
for an efficient application of the rules to each of the employees. I am
trying to avoid a loop, because my data set is quite large, and I may
need to update my data set regularly and re-run the analysis and reports
that will come from this work.

Any advice or guidance on building the function or code to apply the
rules would be quite helpful.

retireHelp <-
structure(list(DOB = structure(c(-6642, -5134, -3444, -5598, 
-4356, 5737, -4894, -1951, -2950, 2467, 6945, 4908, -7930, -7236, 
-7727, -77, 4158, -7892, -6028, -7132, -5959, 2309, -2494, -3513, 
-383, -216, -3369, -5861, 3674, -10265, -8986, -5023, -4862, 
1526, -1022, 2175, -11790, -278, -7275, -5084, -1842, 430, -2220, 
-7444, 440, 4285, -7812, 3335, -7271, -6825, -1098, -1670, -10219, 
-7131, 5963, 704, -7662, 4219, -2813, 5147, -7334, -8223, -5922, 
-7497, -9276, -1291, -11640, -5631, 518, -7268, -2105, -5901, 
-690, -8146, -7059, 133, 1176, -6091, -2895, -6020, -4724, -3616, 
-5059, -8253, -2604, -12400, -4776, -3671, -9326, -7000, -5574, 
-3248, 4255, -1358, -6255, 8, -7115, -1701, -5227, 9, -517, -8674, 
-2554, -4069, -2077, -9872, -6534, 2970, -8307, -3020, -1343, 
-8897, -2304, -7424, 2078, -8274, -5559, -8888, -9262, -8473, 
-4088, -2429, -8006, -1091, 5015, 2765, 4036, 3101, -3743, 5103, 
-10018, -12095, -7646, -5966, -6208, -5784, -1325, -4288, -1665, 
-1409, 4685, -7881, -3413, 2738, -2201, 1217, -5113, 206, -1292, 
-1725, 10, -2978, -1895, -830, -105, -2395, -3496, -8244, -9956, 
-6494, -4678, -4077, 575, 2013, -3411, 3824, -4356, 4523, -5836, 
-6350, -5337, -41, -2001, -6632, -970, -6790, -2828, -4061, 476, 
5854, -9648, -4227, 850, 2619, -7747, -2672, 4069, -12618, -6898, 
-4178, -1772, -1643, -2064, -157, 4551, -8688, -6087, -2040, 
-7239, -783), format = "m/d/y", origin = structure(c(1, 1, 1970
), .Names = c("month", "day", "year")), class = c("dates", "times"
)), srvCompDT = structure(c(743, 12429, 3585, 4364, 13227, 13578, 
13591, 8585, 9587, 13913, 14753, 13247, 2246, 1439, 8845, 7018, 
12625, -552, 5688, 7080, 13255, 13549, 12709, 13969, 13997, 9532, 
13689, 1226, 13549, 4093, 13423, 13801, 3181, 14809, 13353, 9457, 
7745, 8986, 4759, 4486, 6449, 11172, 8669, 3344, 13745, 12275, 
5081, 13605, 8006, 3048, 6330, 13521, 5254, 1733, 14095, 8516, 
4848, 13521, 5970, 14697, 8291, 139, 11435, 3567, 8961, 5775, 
3602, 1409, 11577, 12163, 12258, 13156, 9472, 7963, 1362, 10332, 
9557, 3997, 7509, 4691, 3133, 5877, 6782, 11449, 13283, 8040, 
11565, 3425, 7860, 1790, 10778, 13199, 12625, 5889, 3317, 9831, 
1068, 8040, 7123, 9104, 12836, 7928, 12764, 8922, 5324, -1004, 
1806, 10263, 5635, 10310, 5625, 8861, 14613, 3896, 10316, 5725, 
12751, 6113, 2997, 112, 5707, 4987, -1018, 8055, 13885, 13073, 
14585, 14865, 14935, 14390, 9735, 7654, 4557, 661, 1638, 1112, 
14011, 3086, 7032, 13942, 13325, 6735, 13900, 12673, 10148, 14193, 
14767, 8447, 6114, 10688, 13544, 7106, 8587, 14753, 7886, 12280, 
11946, 13662, 3332, 2108, 13977, 6203, 8369, 13857, 8369, 11486, 
8306, 12466, 12639, 7270, 4325, 13843, 14026, 14039, 6147, 7676, 
5781, 7038, 9187, 14640, 6174, 11491, 13913, 13787, 13465, 8854, 
13152, 1826, 1412, 4317, 5794, 5548, 8951, 12947, 12639, 5345, 
5961, 4637, 6465, 13717), format = "m/d/y", origin = structure(c(1, 
1, 1970), .Names = c("month", "day", "year")), class = c("dates", 
"times")), retirePlan = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
1, 3, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 1, 
2, 2, 2, 2, 3, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 
2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 3, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 
3, 2, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 
1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 
2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
1, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), 
    ageFedStart = c(20.22, 48.08, 19.24, 27.27, 48.14, 21.47, 
    50.61, 28.85, 34.32, 31.34, 21.38, 22.83, 27.86, 23.75, 45.37, 
    19.43, 23.18, 20.1, 32.08, 38.91, 52.61, 30.77, 41.62, 47.86, 
    39.37, 26.69, 46.7, 19.4, 27.04, 39.31, 61.35, 51.54, 22.02, 
    36.37, 39.36, 19.94, 53.48, 25.36, 32.95, 26.2, 22.7, 29.41, 
    29.81, 29.54, 36.43, 21.88, 35.3, 28.12, 41.83, 27.03, 20.34, 
    41.59, 42.36, 24.27, 22.26, 21.39, 34.25, 25.47, 24.05, 26.15, 
    42.78, 22.89, 47.52, 30.29, 49.93, 19.35, 41.73, 19.27, 30.28, 
    53.2, 39.32, 52.18, 27.82, 44.1, 23.06, 27.92, 22.95, 27.62, 
    28.48, 29.33, 21.51, 25.99, 32.42, 53.94, 43.5, 55.96, 44.74, 
    19.43, 47.05, 24.07, 44.77, 45.03, 22.92, 19.84, 26.21, 26.89, 
    22.4, 26.67, 33.81, 24.9, 36.56, 45.45, 41.94, 35.57, 20.26, 
    24.28, 22.83, 19.97, 38.17, 36.5, 19.08, 48.62, 46.32, 30.99, 
    22.55, 38.33, 50.13, 41.07, 33.56, 23.5, 26.82, 20.3, 19.13, 
    25.04, 24.28, 28.22, 28.88, 32.21, 51.14, 25.43, 54.08, 54.07, 
    33.41, 18.14, 21.48, 18.88, 41.99, 20.19, 23.81, 42.03, 23.66, 
    40.02, 47.4, 27.2, 33.81, 35.53, 54.43, 22.56, 20.28, 33.98, 
    37.05, 27.61, 28.7, 42.66, 21.88, 40.18, 42.28, 59.98, 36.38, 
    23.55, 51.07, 28.15, 21.34, 32.43, 32.25, 20.98, 34.67, 21.75, 
    50.58, 37.29, 26.45, 38.01, 43.88, 56.59, 19.49, 39.61, 23.57, 
    30.39, 23.85, 24.05, 43.32, 43.03, 35.76, 30.58, 58.08, 31.56, 
    24.87, 39.55, 22.75, 23.26, 20.71, 19.69, 30.16, 35.88, 22.14, 
    38.42, 32.99, 18.28, 37.52, 39.7)), .Names = c("DOB", "srvCompDT", 
"retirePlan", "ageFedStart"), row.names = c(NA, 200L), class =
"data.frame")

rrDT <- function(retSys, ageFedStart, birthDT, serviceCompDT){
    if(retSys == "CSRS") {
        if(ageFedStart < 25) rtDT <- dates(birthDT+(365.25*55))
        else if (ageFedStart >= 25 & ageFedStart < 30) rtDT <-
dates(serviceCompDT+(365.25*30))
        else if (ageFedStart >= 30 & ageFedStart < 40) rtDT <-
dates(birthDT+(365.25*60))
        else if (ageFedStart >= 40 & ageFedStart < 45) rtDT <-
dates(serviceCompDT+(365.25*20))
        else if (ageFedStart >= 45 & ageFedStart < 60) rtDT <-
dates(birthDT+(365.25*65))
        else if (ageFedStart >= 60) rtDT <-
dates(serviceCompDT+(365.25*5))
        else rtDT <- NA
    }
    else if (retSys == "FERS") {
        if (birthDT < "01/01/53") {
            if(ageFedStart < 25) rtDT <- dates(birthDT+(365.25*55))
            else if (ageFedStart >= 25 & ageFedStart < 30) rtDT <-
dates(serviceCompDT+(365.25*30))
            else if (ageFedStart >= 30 & ageFedStart < 40) rtDT <-
dates(birthDT+(365.25*60))
            else if (ageFedStart >= 40 & ageFedStart < 42) rtDT <-
dates(serviceCompDT+(365.25*20))
            else if (ageFedStart >= 42 & ageFedStart < 57) rtDT <-
dates(birthDT+(365.25*62))
            else if (ageFedStart >= 57) rtDT <-
dates(serviceCompDT+(365.25*5))
            else rtDT <- NA
        }
        else if (birthDT >= "01/01/53" & birthDT < "01/01/70") {
            if(ageFedStart < 26) rtDT <- dates(birthDT+(365.25*56))
            else if (ageFedStart >= 27 & ageFedStart < 30) rtDT <-
dates(serviceCompDT+(365.25*30))
            else if (ageFedStart >= 30 & ageFedStart < 40) rtDT <-
dates(birthDT+(365.25*60))
            else if (ageFedStart >= 40 & ageFedStart < 42) rtDT <-
dates(serviceCompDT+(365.25*20))
            else if (ageFedStart >= 42 & ageFedStart < 57) rtDT <-
dates(birthDT+(365.25*62))
            else if (ageFedStart >= 57) rtDT <-
dates(serviceCompDT+(365.25*5))
            else rtDT <- NA
        }
        else if (birthDT >= "01/01/70"){
            if(ageFedStart < 27) rtDT <- dates(birthDT+(365.25*56))
            else if (ageFedStart >= 27 & ageFedStart < 30) rtDT <-
dates(serviceCompDT+(365.25*30))
            else if (ageFedStart >= 30 & ageFedStart < 40) rtDT <-
dates(birthDT+(365.25*60))
            else if (ageFedStart >= 40 & ageFedStart < 42) rtDT <-
dates(serviceCompDT+(365.25*20))
            else if (ageFedStart >= 42 & ageFedStart < 57) rtDT <-
dates(birthDT+(365.25*62))
            else if (ageFedStart >= 57) rtDT <-
dates(serviceCompDT+(365.25*5))
            else rtDT <- NA
        }
    }
    else rtDT <- NA
    return(rtDT)
}

Adrian R. Katschke
Data Analytics Specialist
Human Capital Program Office
Human Resources
PH: 317-212-7813 
DSN: 699-7813



More information about the R-help mailing list