[R] recode: how to avoid nested ifelse

Joshua Wiley jwiley.psych at gmail.com
Sat Jun 8 03:24:50 CEST 2013


Hi Paul,

Unless you have truly offended the data generating oracle*, the
pattern NA, 1, NA should be a data entry error: graduating HS
implies graduating ES, no?  I would argue fringe cases like that
should be corrected in the data, not through coding workarounds.
Then you can just do:

x <- paste0(es, hs, cg)

> table(factor(x, levels = c("000", "100", "110", "111"), labels = c("none", "es","hs", "cg")))
none   es   hs   cg
   4    1    1    2
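
For anyone who wants to run this end to end, here is a self-contained sketch of the same idea on your second data set (the one with missings). The names `valid` and `bad` are only for illustration, and `ordered = TRUE` is my assumption about what you want from an ordinal predictor. It also lists the rows whose pattern is not one of the four consistent ones, so they can be checked in the data rather than coded around.

```r
## Indicator columns from the quoted example, including the rows with NAs
es <- c(0, 0, 1, 0, 1, 0, 1, 1, NA, NA)
hs <- c(0, 0, 1, 0, 1, 0, 1, 0, 1, NA)
cg <- c(0, 0, 0, 0, 1, 0, 1, 0, NA, NA)

## One pattern string per row, e.g. "110"; NA is pasted as the text "NA"
x <- paste0(es, hs, cg)

## The only logically consistent patterns, lowest to highest attainment
## (hypothetical helper name)
valid <- c("000", "100", "110", "111")

## Rows whose pattern is not in `valid`: candidates for data cleaning
bad <- which(!(x %in% valid))
print(data.frame(row = bad, pattern = x[bad]))

## Any pattern not listed in `levels` becomes NA in the factor
edf <- factor(x, levels = valid, labels = c("none", "es", "hs", "cg"),
              ordered = TRUE)
print(table(edf, useNA = "ifany"))
```

Because anything outside `levels` turns into NA, the inconsistent rows drop out of the factor quietly, which is why it seems worth printing `bad` first.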

Cheers,

Josh

*Drawn from comments by Judea Pearl one lively session.


On Fri, Jun 7, 2013 at 6:13 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> In our Summer Stats Institute, I was asked a question that amounts to
> reversing the effect of the contrasts function (reconstruct an ordinal
> predictor from a set of binary columns). The best I could think of was to
> link together several ifelse functions, and I don't think I want to do this
> if the example became any more complicated.
>
> I'm unable to remember a less error-prone method :). But I expect you might.
>
> Here's my working example code
>
> ## Paul Johnson <pauljohn at ku.edu>
> ## 2013-06-07
>
> ## We need to create an ordinal factor from these indicators
> ## completed elementary school
> es <- c(0, 0, 1, 0, 1, 0, 1, 1)
> ## completed high school
> hs <- c(0, 0, 1, 0, 1, 0, 1, 0)
> ## completed college
> cg <- c(0, 0, 0, 0, 1, 0, 1, 0)
>
> ed <- ifelse(cg == 1, 3,
>              ifelse(hs == 1, 2,
>                     ifelse(es == 1, 1, 0)))
>
> edf <- factor(ed, levels = 0:3,  labels = c("none", "es", "hs", "cg"))
> data.frame(es, hs, cg, ed, edf)
>
> ## Looks OK, but what if there are missings?
> es <- c(0, 0, 1, 0, 1, 0, 1, 1, NA, NA)
> hs <- c(0, 0, 1, 0, 1, 0, 1, 0, 1, NA)
> cg <- c(0, 0, 0, 0, 1, 0, 1, 0, NA, NA)
> ed <- ifelse(cg == 1, 3,
>              ifelse(hs == 1, 2,
>                     ifelse(es == 1, 1, 0)))
> cbind(es, hs, cg, ed)
>
> ## That's bad, ifelse returns NA too frequently.
> ## Revise (becoming tedious!)
>
> ed <- ifelse(!is.na(cg) & cg == 1, 3,
>              ifelse(!is.na(hs) & hs == 1, 2,
>                     ifelse(!is.na(es) & es == 1, 1,
>                            ifelse(is.na(es), NA, 0))))
> cbind(es, hs, cg, ed)
>
>
> ## Does the project director want us to worry about
> ## logical inconsistencies, such as es = 0 but cg = 1?
> ## I hope not.
>
> Thanks in advance, I hope you are having a nice summer.
>
> pj
>
> --
> Paul E. Johnson
> Professor, Political Science      Assoc. Director
> 1541 Lilac Lane, Room 504      Center for Research Methods
> University of Kansas                 University of Kansas
> http://pj.freefaculty.org               http://quant.ku.edu



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com


