[R] Simple programming question

Gabor Grothendieck ggrothendieck at gmail.com
Fri May 18 17:12:52 CEST 2007


The solution already computes the score as a numeric vector and only
afterwards converts it to a factor, so just omit the conversion:

f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
score <- ave(dfr$var3, dfr$categ, FUN = f)

As mentioned, this assigns 1 to low (everything other than the highest
two numbers in a category), 2 to the second highest and 3 to the highest.
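
As a quick check, here is a minimal sketch that rebuilds the example data
frame from the original question and applies f by category. The values in
the final comment come from working through the data by hand, so treat them
as an illustration rather than captured output.

```r
# Example data from the original question, sorted by category.
dfr <- data.frame(id = 1:16, categ = rep(LETTERS[1:4], 4),
                  var3 = c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ), ]

# Rank the distinct values within a group from the top, cap the rank at 3,
# then flip so that 3 = highest, 2 = second highest, 1 = everything else.
f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
score <- ave(dfr$var3, dfr$categ, FUN = f)

# Scores grouped by category:
split(score, dfr$categ)
# A: 3 2 1 1   B: 3 2 2 1   C: 3 2 1 1   D: 3 2 1 1
```

Category B has two rows tied for second highest (both 4), so both get 2.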

If you want some other assignment, e.g. 3 is low, 1 is mid and 0 is high
then try:

c(3, 1, 0)[score]
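
The indexing works because score only takes the values 1, 2 and 3, so it can
serve as a position in a lookup vector. Below is a small sketch with a
made-up score vector. The names codes, labs and recoded, and the label-based
variant, are illustrative assumptions and not part of the solution above.
Note that the 3, 1, 0 for "high", "mid", "low" asked about in the quoted
message corresponds to the lookup vector c(0, 1, 3), since position 1 is low
and position 3 is high.

```r
# Hypothetical scores as produced by f: 1 = low, 2 = mid, 3 = high.
score <- c(3, 2, 1, 1, 3, 2, 2, 1)

# Position i of the lookup vector holds the new code for score i.
c(3, 1, 0)[score]   # low = 3, mid = 1, high = 0
c(0, 1, 3)[score]   # low = 0, mid = 1, high = 3

# A more general variant: recode through labels with a named vector,
# so the mapping reads as label = value and is easy to change.
codes <- c(low = 0, mid = 1, high = 3)
labs <- c("low", "mid", "high")[score]
recoded <- unname(codes[labs])
```

The named-vector form may be easier to reuse in other recoding situations,
because only the codes vector has to change.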

On 5/18/07, Lauri Nikkinen <lauri.nikkinen at iki.fi> wrote:
> Thank you all for your answers. Actually, Gabor's first post was right in
> the sense that I wanted "low" for all cases that are lower than the
> second highest. But what if I want to recode "high", "mid" and "low" as
> numbers to do some calculations, e.g. 3, 1, 0 respectively? How would I
> have to modify your solutions? I would also like to apply this solution
> to many kinds of recoding situations.
>
> -Lauri
>
>
> 2007/5/18, Gabor Grothendieck <ggrothendieck at gmail.com>:
> > There was a problem in the first line in the case that the highest number
> > is not unique within a category.  In this example it's not apparent since
> > that never occurs.  At any rate, it should be:
> >
> > f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
> > factor(ave(dfr$var3, dfr$categ, FUN = f), labels = c("low", "mid", "high"))
> >
> > Also note that the factor labels were arranged so that
> > "low", "mid" and "high" correspond to levels 1, 2 and 3
> > respectively.
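
To see why unique() matters, here is a small sketch with a made-up group in
which the highest value is tied. The vector x and the names f_old and f_new
are for illustration only. Without unique(), the tied top values occupy sort
positions 1 and 2, so the next value lands at position 3 and is scored as
low instead of mid.

```r
x <- c(8, 8, 5, 3)  # hypothetical group with a tie at the top

# Version without unique(): ties inflate the positions of lower values.
f_old <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
# Corrected version: positions are taken among distinct values only.
f_new <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))

f_old(x)  # 3 3 1 1: the 5 is pushed down to "low"
f_new(x)  # 3 3 2 1: the 5 is scored as "mid"
```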
> >
> > On 5/18/07, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> > > Try this.  f assigns 3, 2 and 1 to the highest, second highest and all
> > > remaining values within a category.  ave applies f to each category.
> > > Finally we convert the result to a factor.
> > >
> > > f <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
> > > factor(ave(dfr$var3, dfr$categ, FUN = f), labels = c("low", "mid", "high"))
> > >
> > >
> > >
> > > On 5/18/07, Lauri Nikkinen <lauri.nikkinen at iki.fi> wrote:
> > > > Hi R-users,
> > > >
> > > > I have a simple question for heavy R users. If I have a data frame
> > > > like this
> > > >
> > > >
> > > > dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> > > > var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> > > > dfr <- dfr[order(dfr$categ),]
> > > >
> > > > and I want to score values or points in variable named "var3"
> > > > following this kind of logic:
> > > >
> > > > 1. the highest value of var3 within category (variable named "categ")
> > > >    -> "high"
> > > > 2. the second highest value -> "mid"
> > > > 3. lowest value -> "low"
> > > >
> > > > This would be the output of this reasoning:
> > > >
> > > > dfr$score <- factor(c("high","mid","low","low","high","mid","mid","low",
> > > >                       "high","mid","low","low","high","mid","low","low"))
> > > > dfr
> > > >
> > > > The question is how to do this programmatically in R (e.g. if I have
> > > > 2000 rows in my dfr)?
> > > >
> > > > I appreciate your help!
> > > >
> > > > Cheers,
> > > > Lauri