[R] Need a more efficient way to implement this type of logic in R

Joshua Wiley jwiley.psych at gmail.com
Wed Apr 6 22:49:40 CEST 2011


Hi Walter,

Take a look at ?cut.  It is designed to take a continuous variable and
categorize it, and will be much simpler and faster than looping over
the rows.  The only qualification is that your data need to be numeric,
not character.  However, if your only values are the ones you put in
quotes in your code ('02' etc.), a simple call to
as.numeric(variablename) ought to do the trick.  Beyond being faster,
you can probably get down to one line of code, which should be much
easier on the eyes.  To see some examples with cut(), type (at the
console):

example(cut)
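
As a rough sketch of how this might look for the income codes in your
post: the sample vector and the names inc and cs_faminc below are made
up for illustration, and I am assuming the codes are stored as
character.  If the column is a factor, as.numeric() returns the
internal level codes, so it would need as.numeric(as.character(...))
instead.

```r
# Hypothetical sample of the character codes used in the original loop
hhfaminc <- c("01", "05", "06", "10", "11", "15", "16", "17", "18", "-7", "-9")

# Convert the character codes to numbers ("01" becomes 1, "-7" becomes -7)
inc <- as.numeric(hhfaminc)

# With the default right = TRUE the intervals are
# (-Inf, 0], (0, 5], (5, 10], (10, 15], (15, 17], (17, 18].
# labels = FALSE returns the bin number 1..6; subtracting 1 gives 0..5,
# matching the CS_FAMINC values assigned in the loop.
cs_faminc <- cut(inc, breaks = c(-Inf, 0, 5, 10, 15, 17, 18), labels = FALSE) - 1

# Show the original codes next to the recoded category
print(data.frame(HHFAMINC = hhfaminc, CS_FAMINC = cs_faminc))
```

Any code above 18 falls outside the breaks and comes back as NA, which
may or may not be what you want.  On the real data the same idea would
presumably be applied to hh.sub$HHFAMINC in a single assignment.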

Hope this helps,

Josh

P.S. If you are planning on doing any modelling with this data, why
not leave it continuous?

On Wed, Apr 6, 2011 at 1:02 PM, Walter Anderson <wandrson01 at gmail.com> wrote:
> I have cobbled together the following logic.  It works but is very slow.
> I'm sure that there must be a better R-specific way to implement this kind
> of thing, but have been unable to find/understand one.  Any help would be
> appreciated.
>
> hh.sub <- households[c("HOUSEID","HHFAMINC")]
> for (indx in 1:length(hh.sub$HOUSEID)) {
>  if ((hh.sub$HHFAMINC[indx] == '01') | (hh.sub$HHFAMINC[indx] == '02') |
> (hh.sub$HHFAMINC[indx] == '03') | (hh.sub$HHFAMINC[indx] == '04') |
> (hh.sub$HHFAMINC[indx] == '05'))
>    hh.sub$CS_FAMINC[indx] <- 1 # Less than $25,000
>  if ((hh.sub$HHFAMINC[indx] == '06') | (hh.sub$HHFAMINC[indx] == '07') |
> (hh.sub$HHFAMINC[indx] == '08') | (hh.sub$HHFAMINC[indx] == '09') |
> (hh.sub$HHFAMINC[indx] == '10'))
>    hh.sub$CS_FAMINC[indx] <- 2 # $25,000 to $50,000
>  if ((hh.sub$HHFAMINC[indx] == '11') | (hh.sub$HHFAMINC[indx] == '12') |
> (hh.sub$HHFAMINC[indx] == '13') | (hh.sub$HHFAMINC[indx] == '14') |
> (hh.sub$HHFAMINC[indx] == '15'))
>    hh.sub$CS_FAMINC[indx] <- 3 # $50,000 to $75,000
>  if ((hh.sub$HHFAMINC[indx] == '16') | (hh.sub$HHFAMINC[indx] == '17'))
>    hh.sub$CS_FAMINC[indx] <- 4 # $75,000 to $100,000
>  if ((hh.sub$HHFAMINC[indx] == '18'))
>    hh.sub$CS_FAMINC[indx] <- 5 # More than $100,000
>  if ((hh.sub$HHFAMINC[indx] == '-7') | (hh.sub$HHFAMINC[indx] == '-8') |
> (hh.sub$HHFAMINC[indx] == '-9'))
>    hh.sub$CS_FAMINC[indx] = 0
> }
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


