[R] max / pmax

Brian Perron beperron at wustl.edu
Tue May 30 19:40:06 CEST 2006


Hello R users,

I am relatively new to R and cannot seem to crack a coding problem. I 
am working with substance abuse data, and I have a variable called 
"primary.drug", which records the drug of choice for each 
subject. A few values of that variable are missing. Instead 
of using a multiple imputation method like chained equations, I would 
prefer to derive these values from other survey responses. 
Specifically, I have a frequency of use (in days) for each of the major 
drugs, so I would like each missing value to be replaced by the drug 
with the highest level of use. I am starting with the "ifelse" and 
"max" functions, but I know this is wrong:

impute.primary.drug <-   ifelse(is.na(primary.drug), max(marijuana, 
crack, cocaine, heroin), primary.drug)

Here are the problems. First, the max function (should it be "pmax"?) 
returns the highest numeric quantity rather than the variable itself. 
In other words, I want to test which drug has the highest value, but 
return the variable name rather than the observed value. Second, if 
ties are observed, how can I set the value to NA? Or, how can I 
have one of the tied drugs selected at random?
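
To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive solution. The data frame "d" and all of its values are invented for illustration; only the column names come from the description above. It assumes the frequency columns contain no missing values and that primary.drug is stored as character rather than as a factor.

```r
# Hypothetical toy data: column names follow the post, values are invented.
d <- data.frame(
  primary.drug = c("heroin", NA, NA, NA),
  marijuana    = c(0, 10, 5, 0),
  crack        = c(0,  2, 5, 0),
  cocaine      = c(3,  0, 1, 7),
  heroin       = c(20, 1, 0, 7),
  stringsAsFactors = FALSE
)
drugs <- c("marijuana", "crack", "cocaine", "heroin")
use   <- as.matrix(d[, drugs])

# max() would collapse all four vectors into a single number;
# pmax() returns one maximum per subject (row).
row.max <- pmax(d$marijuana, d$crack, d$cocaine, d$heroin)

# Number of drugs that reach the row maximum; more than one means a tie.
n.top <- rowSums(use == row.max)

# Option 1: name of the top drug, with ties set to NA.
top.na <- ifelse(n.top == 1, drugs[max.col(use, ties.method = "first")], NA)

# Option 2: name of the top drug, with ties broken at random.
top.random <- drugs[max.col(use, ties.method = "random")]

# Fill in only the missing values of primary.drug.
impute.na     <- ifelse(is.na(d$primary.drug), top.na,     d$primary.drug)
impute.random <- ifelse(is.na(d$primary.drug), top.random, d$primary.drug)
```

With this toy data, subjects 3 and 4 have tied maxima, so impute.na is NA for them, while impute.random picks one of the tied drugs. If primary.drug were a factor, it would presumably need as.character() first, since ifelse() would otherwise return the underlying integer codes.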

Thanks in advance for your assistance.

Regards,
Brian


