# [R] max / pmax

Tony Plate tplate at acm.org
Tue May 30 21:24:16 CEST 2006

Here's an example of how I think you can do what you want.  To pick one
of several tied maxima at random instead of returning NA, modify the
definition of the function highest.use().

```
> drug.names <- c("marijuana", "crack", "cocaine", "heroin")
> drugs <- factor(drug.names, levels=drug.names)
> drugs
[1] marijuana crack     cocaine   heroin
Levels: marijuana crack cocaine heroin
> as.numeric(drugs)
[1] 1 2 3 4
> N <- 20
> set.seed(1)
> primary.drug <- sample(drugs, N, replace=TRUE)
> primary.drug[sample(1:N, 10)] <- NA
> primary.drug
 [1] <NA>      crack     <NA>      <NA>      <NA>      <NA>      heroin
 [8] cocaine   cocaine   marijuana <NA>      <NA>      cocaine   crack
[15] heroin    <NA>      cocaine   heroin    <NA>      <NA>
Levels: marijuana crack cocaine heroin
> # usage frequencies
> marijuana <- sample(1:3, N, replace=TRUE)
> crack <- sample(1:3, N, replace=TRUE)
> cocaine <- sample(1:3, N, replace=TRUE)
> heroin <- sample(1:3, N, replace=TRUE)
> cbind(marijuana, crack, cocaine, heroin)
      marijuana crack cocaine heroin
 [1,]         2     2       2      1
 [2,]         2     3       3      1
 [3,]         2     2       2      2
 [4,]         1     1       2      3
 [5,]         3     1       2      3
 [6,]         3     1       3      3
 [7,]         3     1       3      2
 [8,]         1     2       2      2
 [9,]         3     2       3      3
[10,]         2     2       3      2
[11,]         3     3       2      2
[12,]         2     1       3      2
[13,]         3     2       2      1
[14,]         2     1       1      3
[15,]         2     2       3      2
[16,]         3     1       1      1
[17,]         1     2       3      1
[18,]         2     3       1      2
[19,]         3     1       1      3
[20,]         3     3       1      2
> highest.use <- function(x) {
+     y <- which(x == max(x, na.rm=TRUE))  # positions of all maxima
+     if (length(y) == 1) y else NA        # NA when the maximum is tied
+ }
> apply(cbind(marijuana, crack, cocaine, heroin), 1, highest.use)
 [1] NA NA NA  4 NA NA NA NA NA  3 NA  3  1  4  3  1  3  2 NA NA
> impute.primary.drug <- drugs[ifelse(is.na(primary.drug),
+     apply(cbind(marijuana, crack, cocaine, heroin), 1, highest.use),
+     as.numeric(primary.drug))]
> data.frame(primary.drug, impute.primary.drug)
   primary.drug impute.primary.drug
1          <NA>                <NA>
2         crack               crack
3          <NA>                <NA>
4          <NA>              heroin
5          <NA>                <NA>
6          <NA>                <NA>
7        heroin              heroin
8       cocaine             cocaine
9       cocaine             cocaine
10    marijuana           marijuana
11         <NA>                <NA>
12         <NA>             cocaine
13      cocaine             cocaine
14        crack               crack
15       heroin              heroin
16         <NA>           marijuana
17      cocaine             cocaine
18       heroin              heroin
19         <NA>                <NA>
20         <NA>                <NA>
```
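
For the random-selection variant mentioned at the top, a minimal sketch
might look like the following.  The name highest.use.random and the small
matrix `use` are hypothetical and not part of the session above, and the
function assumes every row has at least one non-missing value.

```r
# Hypothetical variant of highest.use(): break ties at random
# instead of returning NA.
highest.use.random <- function(x) {
  y <- which(x == max(x, na.rm = TRUE))  # positions of all maxima; which() drops NAs
  if (length(y) == 1) return(y)
  # Index with sample.int() rather than calling sample(y, 1), because
  # sample() treats a single number y as the range 1:y.
  y[sample.int(length(y), 1)]
}

# Made-up usage frequencies: rows are subjects, columns are drugs.
use <- cbind(marijuana = c(2, 1, 3),
             crack     = c(2, 1, 1),
             cocaine   = c(2, 2, 3),
             heroin    = c(1, 3, 3))

set.seed(1)
idx <- apply(use, 1, highest.use.random)  # one column index per subject
colnames(use)[idx]                        # drug name for each subject
```

Swapping this function for highest.use in the apply() call above should
fill tied rows with one of the tied drugs rather than NA.  Base R's
max.col() also breaks ties at random by default, but its handling of
missing values differs, so check its help page before relying on it.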

Brian Perron wrote:
> Hello R users,
>
> I am relatively new to R and cannot seem to crack a coding problem.  I
> am working with substance abuse data, and I have a variable called
> "primary.drug" which is considered the drug of choice for each
> subject.   I have just a few missing values on that variable.  Instead
> of using a multiple imputation method like chained equations, I would
> prefer to derive these values from other survey responses.
> Specifically, I have a frequency of use (in days) for each of the major
> drugs, so I would like the missing values to be replaced by that drug
> with the highest level of use.  I am starting with the "ifelse" and
> "max" statements, but I know it is wrong:
>
> impute.primary.drug <-   ifelse(is.na(primary.drug), max(marijuana,
> crack, cocaine, heroin), primary.drug)
>
> Here are the problems.  First, the max statement (should it be "pmax"?)
> returns the highest numeric quantity rather than the variable itself.
> In other words, I want to test which drug has the highest value, but
> return the variable name rather than the observed value.   Second, if
> ties are observed, how can I specify the value to be NA?  Or, how can I
> specify one of the values to be randomly selected?
>
>
> Regards,
> Brian
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
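
On the max versus pmax question in the quoted message, a small sketch
with made-up data may help.  The vectors below are illustrative only, and
the names overall, per.row and top.drug are hypothetical.

```r
# Made-up usage frequencies (in days) for three subjects.
marijuana <- c(2, 1, 3)
crack     <- c(3, 1, 1)
cocaine   <- c(1, 2, 2)
heroin    <- c(1, 5, 1)

overall <- max(marijuana, crack, cocaine, heroin)   # a single number for all the data
per.row <- pmax(marijuana, crack, cocaine, heroin)  # one maximum per subject

# Both functions return values, not names.  To get the drug name, find the
# column index of each row maximum and look it up in the column names.
# which.max() returns the first maximum when there are ties.
use <- cbind(marijuana, crack, cocaine, heroin)
top.drug <- colnames(use)[apply(use, 1, which.max)]
```

Because which.max() silently picks the first of several tied maxima, it
does not by itself answer the tie question; a function like highest.use()
is still needed for that.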