[R] max / pmax
Brian Perron
beperron at wustl.edu
Tue May 30 19:40:06 CEST 2006
Hello R users,
I am relatively new to R and cannot seem to crack a coding problem. I
am working with substance abuse data, and I have a variable called
"primary.drug" which is considered the drug of choice for each
subject. I have just a few missing values on that variable. Instead
of using a multiple imputation method like chained equations, I would
prefer to derive these values from other survey responses.
Specifically, I have a frequency of use (in days) for each of the major
drugs, so I would like each missing value to be replaced by the drug
with the highest frequency of use. I started with "ifelse" and "max",
but I know the following is wrong:
impute.primary.drug <- ifelse(is.na(primary.drug), max(marijuana,
crack, cocaine, heroin), primary.drug)
Here are the problems. First, "max" (or should it be "pmax"?) returns
the highest numeric value rather than the name of the variable that
holds it. In other words, I want to test which drug has the highest
value, but return the variable name rather than the observed value.
Second, if ties occur, how can I set the result to NA? Alternatively,
how can I have one of the tied drugs selected at random?
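To make the two problems concrete, here is a minimal sketch of one possible approach, not a definitive answer. The data frame "drugs", its values, and the helper "pick.one" are hypothetical and made up for illustration; only the four column names and "primary.drug" come from the description above. The helper looks at one subject's row, finds the column name(s) that hold the row maximum, and then either returns NA or samples one name when there is a tie.

```r
# Hypothetical example data: column names follow the post, values are invented.
drugs <- data.frame(
  marijuana = c(10, 0, 5, 3),
  crack     = c(2, 7, 6, 0),
  cocaine   = c(0, 7, 1, 9),
  heroin    = c(1, 0, 0, 4)
)
primary.drug <- c(NA, NA, NA, "heroin")

use <- as.matrix(drugs)

# Hypothetical helper: name of the most-used drug in one row.
# ties = "na" returns NA on a tie; ties = "random" samples one tied name.
pick.one <- function(x, ties = c("na", "random")) {
  ties <- match.arg(ties)
  if (all(is.na(x))) return(NA_character_)   # nothing to compare
  top <- names(x)[!is.na(x) & x == max(x, na.rm = TRUE)]
  if (length(top) == 1) return(top)          # unique maximum
  if (ties == "na") NA_character_ else sample(top, 1)
}

# Apply the helper to every row; use[i, ] keeps the column names.
top.na <- vapply(seq_len(nrow(use)),
                 function(i) pick.one(use[i, ], "na"), character(1))
top.random <- vapply(seq_len(nrow(use)),
                     function(i) pick.one(use[i, ], "random"), character(1))

# Fill in only the missing entries. as.character() guards against
# primary.drug being a factor, in which case ifelse() would return codes.
imputed.na <- ifelse(is.na(primary.drug), top.na,
                     as.character(primary.drug))
imputed.random <- ifelse(is.na(primary.drug), top.random,
                         as.character(primary.drug))
```

In this made-up data the second subject has a tie between crack and cocaine, so "imputed.na" stays NA there while "imputed.random" picks one of the two. Note that "max" over several vectors collapses everything to a single number, and "pmax" gives the row-wise maximum value, so neither returns a name on its own; that is why the sketch compares each row against its own maximum.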
Thanks in advance for your assistance.
Regards,
Brian