[R] recoding large number of categories (select in SAS)
james.holtman at convergys.com
Wed Jan 19 15:30:57 CET 2005
Here is one way to do it: set up a matrix of ranges and return values to test against.
This is easier than writing out all the 'select' statements.
> x.trans <- matrix(c( # translation matrix; first column is min, second is max,
+ 149, 150, 150, # and third is the value to be returned
+ 186, 187, 187,
+ 438, 438, 438,
+ 430, 430, 430,
+ 808, 826, 808,
+ 830, 832, 808,
+ 997, 998, 792,
+ 792, 796, 792), ncol=3, byrow=TRUE)
> colnames(x.trans) <- c('min', 'max', 'value')
>
> x.default <- 9999 # default/nomatch value
>
> x.test <- c(150, 149, 148, 438, 997, 791, 795, 810, 820, 834) # test data
> #
> # this function tests each value and, if it lies between min and max, returns the third column ('value')
> #
> newValues <- sapply(x.test, function(x){
+ .value <- x.trans[(x >= x.trans[,'min']) & (x <= x.trans[,'max']),'value']
+ if (length(.value) == 0) .value <- x.default # on no match, take default
+ .value[1] # return first value if multiple matches
+ })
> newValues
[1] 150 150 9999 438 792 9999 792 808 808 9999
>
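
For long data vectors, the same table lookup can be done one table row at a time instead of one data value at a time. The following is only a sketch under that assumption: the function name recode_ranges and its arguments are made up for illustration, and the table is the same x.trans as above. Rows are visited last to first so that, as in the sapply version, the first matching row wins.

```r
# Same translation table as above: min, max, value to return
x.trans <- matrix(c(
  149, 150, 150,
  186, 187, 187,
  438, 438, 438,
  430, 430, 430,
  808, 826, 808,
  830, 832, 808,
  997, 998, 792,
  792, 796, 792), ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("min", "max", "value")))

# Hypothetical helper: recode x using the rows of 'trans'
recode_ranges <- function(x, trans, default = 9999) {
  out <- rep(default, length(x))          # start with the no-match value
  # Go through the rows in reverse so that earlier rows overwrite later ones
  for (i in rev(seq_len(nrow(trans)))) {
    hit <- which(x >= trans[i, "min"] & x <= trans[i, "max"])  # which() drops NA
    out[hit] <- trans[i, "value"]
  }
  out
}

x.test <- c(150, 149, 148, 438, 997, 791, 795, 810, 820, 834)
newValues2 <- recode_ranges(x.test, x.trans)
print(newValues2)
```

The loop runs once per table row rather than once per observation, which should matter mostly when there are far more observations than recoding rules.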
__________________________________________________________
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
james.holtman at convergys.com
+1 (513) 723-2929
From:    Denis Chabot <chabotd at globetrotter.net>
Sent by: r-help-bounces at stat.math.ethz.ch
Date:    01/19/2005 08:56 AM
To:      r-help at stat.math.ethz.ch
cc:
Subject: [R] recoding large number of categories (select in SAS)
Hi,
I have data on stomach contents. Possible prey species number in the
hundreds, so a list of prey codes has been in use in many labs doing
this kind of work.
When the time comes to analyse these data, one often wants to regroup
prey into broader categories, especially for rare prey.
In SAS you can nest a large number of "if-else", or do this more
cleanly with "select" like this:
select;
when (149 <= prey <=150) preyGr= 150;
when (186 <= prey <= 187) preyGr= 187;
when (prey= 438) preyGr= 438;
when (prey= 430) preyGr= 430;
when (prey= 436) preyGr= 436;
when (prey= 431) preyGr= 431;
when (prey= 451) preyGr= 451;
when (prey= 461) preyGr= 461;
when (prey= 478) preyGr= 478;
when (prey= 572) preyGr= 572;
when (692 <= prey <= 695) preyGr= 692;
when (808 <= prey <= 826, 830 <= prey <= 832 ) preyGr= 808;
when (997 <= prey <= 998, 792 <= prey <= 796) preyGr= 792;
when (882 <= prey <= 909) preyGr= 882;
when (prey in (999, 125, 994)) preyGr= 9994;
otherwise preyGr= 1;
end; *select;
The number of transformations is usually much larger than this short
example.
What is the best way of doing this in R?
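
For illustration only, the complete select block above might be written in R as a lookup table with one row per range. This is a sketch, not a tested recipe: the names prey.map, inside and first, and the sample prey codes, are invented here. A "when" with two ranges becomes two rows, the "in" list becomes three single-code rows, and the "otherwise" value of 1 is applied when no row matches.

```r
# One row per range from the select block: min, max, preyGr
prey.map <- matrix(c(
  149, 150,  150,
  186, 187,  187,
  438, 438,  438,
  430, 430,  430,
  436, 436,  436,
  431, 431,  431,
  451, 451,  451,
  461, 461,  461,
  478, 478,  478,
  572, 572,  572,
  692, 695,  692,
  808, 826,  808,
  830, 832,  808,
  997, 998,  792,
  792, 796,  792,
  882, 909,  882,
  999, 999, 9994,
  125, 125, 9994,
  994, 994, 9994), ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("min", "max", "value")))

prey <- c(149, 187, 436, 693, 831, 998, 900, 125, 500)  # made-up sample codes

# inside[i, j] is TRUE when prey[i] falls within row j of prey.map
inside <- outer(prey, prey.map[, "min"], ">=") &
          outer(prey, prey.map[, "max"], "<=")

# Index of the first matching row for each prey code, NA when none matches
first <- apply(inside, 1, function(r) which(r)[1])

# Unmatched codes get 1, as in the "otherwise" clause
preyGr <- ifelse(is.na(first), 1, prey.map[first, "value"])
print(unname(preyGr))
```

Keeping the rules in a table like this means a longer list of transformations only adds rows, not code.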
Sincerely,
Denis Chabot
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html