[R] Simple programming question
Adaikalavan Ramasamy
ramasamy at cancer.org.uk
Fri May 18 16:23:24 CEST 2007
According to your post, you are assuming that there are only 3 unique
values of var3 within each category. But categories C and D each have 4
unique values of var3, as splitting the data frame by category shows:
split(dfr, dfr$categ)
...
$C
   id categ var3 score
3   3     C    6  high
7   7     C    5   mid
11 11     C    3   low
15 15     C    1   low
...
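A quick way to check the point above without reading through the full split() output is to count the distinct var3 values per category. This is only a sketch: it rebuilds the example data from the original question, and the name n_unique is made up for illustration.

```r
# Rebuild the example data from the original question.
dfr <- data.frame(id = 1:16, categ = rep(LETTERS[1:4], 4),
                  var3 = c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))

# Number of distinct var3 values within each category
# (n_unique is an illustrative name, not from the original post).
n_unique <- tapply(dfr$var3, dfr$categ, function(x) length(unique(x)))
n_unique
# A and B have 3 distinct values each; C and D have 4 each.
```

Any category with a count above 3 has more distinct values than the three labels, so the rule has to say which label those extra values get.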
If you meant something different, just change myfun() below:
gmax <- function(x, rnk=1){
  ## generalized maximum, with rnk=1 being the biggest value (i.e. max)
  return( sort( unique(x), decreasing=TRUE )[rnk] )
}
myfun <- function(x){
  ifelse( x == gmax(x, 1), "high",
          ifelse( x == gmax(x, 2), "mid", "low" ) )
}

out <- lapply( split(dfr$var3, dfr$categ), myfun )
data.frame( dfr, my.score = unsplit(out, dfr$categ) )
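To see the approach end to end, here is a self-contained sketch that runs it on the example data from the question and compares the result with the scores the question asked for. It uses the label "mid" from the question, and the names res and expected are introduced only for this check. It also assumes that every value below the second highest should be "low", which is one reading of the question.

```r
# Example data from the original question, sorted by category.
dfr <- data.frame(id = 1:16, categ = rep(LETTERS[1:4], 4),
                  var3 = c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
dfr <- dfr[order(dfr$categ), ]

# rnk-th largest distinct value of x (rnk = 1 is the maximum).
gmax <- function(x, rnk = 1) sort(unique(x), decreasing = TRUE)[rnk]

# Label the largest value "high", the second largest "mid", the rest "low".
myfun <- function(x) {
  ifelse(x == gmax(x, 1), "high",
         ifelse(x == gmax(x, 2), "mid", "low"))
}

# Apply the labelling within each category, then restore the row order.
out <- lapply(split(dfr$var3, dfr$categ), myfun)
res <- data.frame(dfr, my.score = unsplit(out, dfr$categ))

# Scores requested in the question (for rows sorted by category).
expected <- c("high","mid","low","low", "high","mid","mid","low",
              "high","mid","low","low", "high","mid","low","low")
all(as.character(res$my.score) == expected)
```

One caveat: if a category contains only a single distinct value, gmax(x, 2) is NA, so ifelse() returns NA rather than "low" for any value that is not the maximum. With such data, myfun() would need an explicit check for that case.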
Regards, Adai
Lauri Nikkinen wrote:
> Hi R-users,
>
> I have a simple question for R heavy users. If I have a data frame like this
>
>
> dfr <- data.frame(id=1:16, categ=rep(LETTERS[1:4], 4),
> var3=c(8,7,6,6,5,4,5,4,3,4,3,2,3,2,1,1))
> dfr <- dfr[order(dfr$categ),]
>
> and I want to score values or points in variable named "var3" following this
> kind of logic:
>
> 1. the highest value of var3 within category (variable named "categ") ->
> "high"
> 2. the second highest value -> "mid"
> 3. lowest value -> "low"
>
> This would be the output of this reasoning:
>
> dfr$score <-
> factor(c("high","mid","low","low","high","mid","mid","low","high","mid","low","low","high","mid","low","low"))
> dfr
>
> The question is how I do this programmatically in R (i.e. if I have 2000
> rows in my dfr)?
>
> I appreciate your help!
>
> Cheers,
> Lauri