[R] recoding large number of categories (select in SAS)
John Fox
jfox at mcmaster.ca
Wed Jan 19 17:09:51 CET 2005
Dear Peter et al.,
The recode() function in the car package will also do this kind of
recoding. It works even when the ranges include non-integer values, and
it supports an else= construction for the "otherwise" case.
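
As a rough illustration (a minimal sketch, not run against Denis's data):
the prey vector below is made up, only a few of the ranges from the SAS
example are included, and the call assumes that the car package is
installed.

```r
# Hypothetical prey codes, including a non-integer value (186.5)
# to show that ranges are matched by value, not by integer code.
prey <- c(149, 150, 186.5, 438, 999, 125, 300)

if (requireNamespace("car", quietly = TRUE)) {
  # Ranges are written lo:hi, sets as c(...), and else= catches
  # every value not matched by an earlier specification.
  preyGr <- car::recode(prey,
    "149:150=150; 186:187=187; 438=438; c(999,125,994)=9994; else=1")
  print(preyGr)
}
```

The whole recoding lives in one string, which keeps it close in shape to
the SAS select block.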
Regards,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Peter Dalgaard
> Sent: Wednesday, January 19, 2005 10:52 AM
> To: Philippe Grosjean
> Cc: r-help at stat.math.ethz.ch; Denis Chabot
> Subject: Re: [R] recoding large number of categories (select in SAS)
>
> Philippe Grosjean <phgrosjean at sciviews.org> writes:
>
> > Does
> >
> > > ?cut
> >
> > answer your question?
>
> That's one way, but it tends to get messy to get the names right.
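
To make the "messy" part concrete, here is a minimal sketch of the cut()
route, assuming integer prey codes. The prey vector and the filler labels
"gap1" and "gap2" are made up for illustration; they are needed because
cut() wants contiguous intervals, so the codes between the groups of
interest must get their own bins and be merged afterwards.

```r
# Hypothetical integer prey codes
prey <- c(149, 150, 186, 187, 438, 300)

# Intervals are (148,150], (150,185], (185,187], (187,437], (437,438]
breaks <- c(148, 150, 185, 187, 437, 438)
labels <- c("150", "gap1", "187", "gap2", "438")
preyGr <- cut(prey, breaks = breaks, labels = labels)

# Merge the filler bins into the "otherwise" group
levels(preyGr)[levels(preyGr) %in% c("gap1", "gap2")] <- "1"

print(as.character(preyGr))
```

Codes outside the outermost breaks come back as NA and would still need
separate handling.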
>
> You might consider using the rather little-known variant of levels
> assignment:
>
> preyGR <- prey # or factor(prey) if it wasn't one already
> levels(preyGR) <- list("150"=149:150,
>                        "187"=186:187,
>                        "438"=438,
>
>                        [...]
>
>                        "9994"=c(999,125,994), "1"=NA)
>
> preyGR[is.na(preyGR) & !is.na(prey)] <- "1"
>
> This would be roughly as clean as the SAS way, only the "otherwise"
> case got a bit tricky.
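
A self-contained version of the levels-assignment idea, as a minimal
sketch: the prey vector is made up, and only a few of the groups from the
SAS example are listed. Note that the list entries are matched against
the level labels, so this assumes the codes are stored as a factor whose
labels are the integer codes.

```r
# Hypothetical prey codes, stored as a factor; one value is missing
prey <- factor(c(149, 150, 186, 187, 438, 999, 125, 300, NA))

preyGR <- prey
# Each list name is a new level; its value lists the old levels to merge.
# "1" = NA matches no old level, so it only declares the new level "1".
levels(preyGR) <- list("150"  = 149:150,
                       "187"  = 186:187,
                       "438"  = 438,
                       "9994" = c(999, 125, 994),
                       "1"    = NA)

# Old levels not listed above became NA; send them to the "otherwise"
# group while leaving genuinely missing prey codes as NA.
preyGR[is.na(preyGR) & !is.na(prey)] <- "1"

print(as.character(preyGR))
# "150" "150" "187" "187" "438" "9994" "9994" "1" NA
```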
>
> > Best,
> >
> > Philippe Grosjean
> >
> > ..............................................<°}))><........
> > ) ) ) ) )
> > ( ( ( ( ( Prof. Philippe Grosjean
> > ) ) ) ) )
> > ( ( ( ( ( Numerical Ecology of Aquatic Systems
> > ) ) ) ) ) Mons-Hainaut University, Pentagone (3D08)
> > ( ( ( ( ( Academie Universitaire Wallonie-Bruxelles
> > ) ) ) ) ) 8, av du Champ de Mars, 7000 Mons, Belgium
> > ( ( ( ( (
> > ) ) ) ) ) phone: + 32.65.37.34.97, fax: + 32.65.37.30.54
> > ( ( ( ( ( email: Philippe.Grosjean at umh.ac.be
> > ) ) ) ) )
> > ( ( ( ( ( web: http://www.umh.ac.be/~econum
> > ) ) ) ) ) http://www.sciviews.org
> > ( ( ( ( (
> > ..............................................................
> >
> > Denis Chabot wrote:
> > > Hi,
> > > I have data on stomach contents. Possible prey species are in the
> > > hundreds, so a list of prey codes has been in use in many labs
> > > doing this kind of work.
> > > When it comes time to do analyses on these data one often wants to
> > > regroup prey into broader categories, especially for rare prey.
> > > In SAS you can nest a large number of "if-else", or do this more
> > > cleanly with "select" like this:
> > > select;
> > > when (149 <= prey <=150) preyGr= 150;
> > > when (186 <= prey <= 187) preyGr= 187;
> > > when (prey= 438) preyGr= 438;
> > > when (prey= 430) preyGr= 430;
> > > when (prey= 436) preyGr= 436;
> > > when (prey= 431) preyGr= 431;
> > > when (prey= 451) preyGr= 451;
> > > when (prey= 461) preyGr= 461;
> > > when (prey= 478) preyGr= 478;
> > > when (prey= 572) preyGr= 572;
> > > when (692 <= prey <= 695) preyGr= 692;
> > > when (808 <= prey <= 826, 830 <= prey <= 832) preyGr= 808;
> > > when (997 <= prey <= 998, 792 <= prey <= 796) preyGr= 792;
> > > when (882 <= prey <= 909) preyGr= 882;
> > > when (prey in (999, 125, 994)) preyGr= 9994;
> > > otherwise preyGr= 1;
> > > end; *select;
> > > The number of transformations is usually much larger than this
> > > short example.
> > > What is the best way of doing this in R?
> > > Sincerely,
> > > Denis Chabot
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> > >
> >
>
> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907