[R] Simple programming question
Bert Gunter
gunter.berton at gene.com
Fri May 18 17:53:26 CEST 2007
?cut
This recodes to a factor; supply numeric labels for its levels via the labels
argument. as.numeric(as.character(...)) would then convert the labels to
numeric values that you can manipulate. This presumes that the variable you
are coding is numeric and you want to recode by binning the values into
ordered bins.
Bert Gunter
Genentech Nonclinical Statistics
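
A minimal sketch of the cut() suggestion above. The data vector and the
break points are made up for illustration; only cut(), as.character() and
as.numeric() come from the message itself.

```r
# Hypothetical data: values and break points are illustrative only.
x <- c(1, 4, 7, 10, 2, 8)

# Bin into three ordered intervals: (0,3], (3,6], (6,10].
# The labels are the numeric codes we eventually want, stored as strings.
binned <- cut(x, breaks = c(0, 3, 6, 10), labels = c("0", "1", "3"))

# as.numeric(binned) alone would return the internal level codes (1, 2, 3);
# going through as.character() recovers the labels themselves.
score <- as.numeric(as.character(binned))
```

Calling as.numeric() directly on a factor is a common slip here, since it
silently returns the level positions rather than the labels.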
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Lauri Nikkinen
Sent: Friday, May 18, 2007 8:02 AM
To: Gabor Grothendieck
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Simple programming question
Thank you all for your answers. Actually, Gabor's first post was right in
the sense that I wanted "low" for all cases that are lower than the
second highest. But what if I want to convert/recode "high",
"mid" and "low" to numeric values for calculations, e.g. 3, 1, 0
respectively? How should I modify your solutions? I would also like to
apply this solution to many kinds of recoding situations.
-Lauri
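
One possible way to do the recoding asked about above is a named lookup
vector indexed by the factor's labels. This is only a sketch; the names
recode_map and score_num are illustrative and do not appear in the thread.

```r
# Example factor using the labels from this thread.
score <- factor(c("high", "mid", "low", "low"))

# Named lookup vector: names are the old labels, values the new codes.
recode_map <- c(high = 3, mid = 1, low = 0)

# Index by label (as.character), not by the factor's internal integer codes.
score_num <- unname(recode_map[as.character(score)])
```

The same pattern should carry over to other recoding situations by changing
only the lookup vector.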
2007/5/18, Gabor Grothendieck <ggrothendieck at gmail.com>:
>
> There was a problem in the first line in the case that the highest number
> is not unique within a category. In this example it's not apparent since
> that never occurs. At any rate, it should be:
>
> f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))
> factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
>
> Also note that the factor labels were arranged so that
> "low", "mid" and "high" correspond to levels 1, 2 and 3
> respectively.
>
> On 5/18/07, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> > Try this. f assigns 3, 2 and 1 to the highest, second highest and all
> > lower values within a category. ave applies f to each category. Finally
> > we convert it to a factor.
> >
> > f <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
> > factor(ave(dfr$var3, dfr$categ, FUN = f), lab = c("low", "mid", "high"))
> >
> >
> >
> > On 5/18/07, Lauri Nikkinen <lauri.nikkinen at iki.fi> wrote:
> > > Hi R-users,
> > >
> > > I have a simple question for heavy R users. If I have a data frame
> > > like this
> > >
> > >
> > > dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> > > var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> > > dfr <- dfr[order(dfr$categ),]
> > >
> > > and I want to score values or points in the variable named "var3"
> > > following this kind of logic:
> > >
> > > 1. the highest value of var3 within category (variable named "categ")
> > >    -> "high"
> > > 2. the second highest value -> "mid"
> > > 3. lowest value -> "low"
> > >
> > > This would be the output of this reasoning:
> > >
> > > dfr$score <- factor(c("high","mid","low","low","high","mid","mid","low",
> > >                       "high","mid","low","low","high","mid","low","low"))
> > > dfr
> > >
> > > The question is how I do this programmatically in R (i.e. if I have
> > > 2000 rows in my dfr)?
> > >
> > > I appreciate your help!
> > >
> > > Cheers,
> > > Lauri
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
>
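
Putting the pieces of the thread together, the following sketch runs the
corrected scoring function on the data from the original question and then
applies the 3/1/0 recoding asked for in the follow-up. The names f_old,
rank_code and the points column are illustrative; the lookup-vector step is
one possible approach, not something proposed in the thread.

```r
# Data from the original question.
dfr <- data.frame(id = 1:16, categ = rep(LETTERS[1:4], 4),
                  var3 = c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ), ]

# Corrected scoring function from the thread: 3 = highest distinct value,
# 2 = second highest distinct value, 1 = everything lower.
f <- function(x) 4 - pmin(3, match(x, sort(unique(x), decreasing = TRUE)))

# First version (without unique), kept only to show the tie problem.
f_old <- function(x) 4 - pmin(3, match(x, sort(x, decreasing = TRUE)))
f_old(c(8, 8, 5, 3))  # 3 3 1 1: the 5 is pushed down to "low"
f(c(8, 8, 5, 3))      # 3 3 2 1

# Apply f within each category, then label the codes 1, 2, 3.
rank_code <- ave(dfr$var3, dfr$categ, FUN = f)
dfr$score <- factor(rank_code, levels = 1:3, labels = c("low", "mid", "high"))

# Numeric recoding from the follow-up: high = 3, mid = 1, low = 0.
dfr$points <- unname(c(low = 0, mid = 1, high = 3)[as.character(dfr$score)])
```

On this data the score column reproduces the hand-written factor given in
the original question.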