[R] Multiple Imputation in mice/norm

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sat Apr 25 15:25:19 CEST 2009


Emmanuel Charpentier wrote:
> On Friday 24 April 2009 at 14:11 -0700, ToddPW wrote:
>> I'm trying to use either mice or norm to perform multiple imputation to fill
>> in some missing values in my data.  The data has some missing values because
>> of a chemical detection limit (so they are left censored).  I'd like to use
>> MI because I have several variables that are highly correlated.  In SAS's
>> proc MI, there is an option with which you can limit the imputed values that
>> are returned to some range of specified values.  Is there a way to limit the
>> values in mice?  
> 
> You may do that by writing your own imputation function and assigning it
> to the imputation of a particular variable (see the argument
> "imputationMethod" and the details in the man page for "mice").
> 
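
A minimal sketch of such a user-written imputation function, in base R. The name mice.impute.clamped and its lower/upper arguments are made up for illustration; the only convention assumed from mice is that a method called "clamped" is looked up as a function named mice.impute.clamped and called with y (the target), ry (TRUE where y is observed) and x (the predictors). The sketch draws from a normal linear model fitted to the observed cases and ignores the uncertainty in the regression coefficients, so it is not a proper imputation method, and clamping will pile values up at the bounds.

```r
# Hypothetical custom method: predict from a linear model fitted to the
# observed cases, add normal noise, then clamp to [lower, upper].
mice.impute.clamped <- function(y, ry, x, wy = NULL,
                                lower = 0, upper = Inf, ...) {
  if (is.null(wy)) wy <- !ry                 # impute the unobserved cases
  x     <- cbind(1, as.matrix(x))            # add an intercept column
  fit   <- lm.fit(x[ry, , drop = FALSE], y[ry])
  df    <- max(sum(ry) - ncol(x), 1)         # residual degrees of freedom
  sigma <- sqrt(sum(fit$residuals^2) / df)   # residual standard deviation
  pred  <- drop(x[wy, , drop = FALSE] %*% fit$coefficients)
  draw  <- pred + rnorm(sum(wy), sd = sigma)
  pmin(pmax(draw, lower), upper)             # clamp to the allowed range
}
```

For values missing because of a detection limit, upper would be set to that limit. The function would then be selected by giving "clamped" as the imputation method for the variable in question.
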
>>                  If not, is there another MI tool in R that will allow me to
>> specify a range of acceptable values for my imputed data?
> 
> In the function amelia (package "Amelia"), you can specify a "bounds"
> argument, which allows for such a limitation. However, be aware that
> this might violate the basic assumption of Amelia, which is that your
> data are multivariate normal. Maybe a change of variable is in order
> (e.g. log(concentration) usually has much better statistical properties
> than concentration).
> 
> Frank Harrell's aregImpute (package Hmisc) has the "curtail" argument
> (TRUE by default) which limits imputations to the range of observed
> values.
> 
> But if your left-censored variables are your dependent variables (not
> covariates), may I suggest analyzing these data as censored data, as
> allowed by Terry Therneau's "survreg" function (package "survival")? Code
> your "missing" data as such a variable; use
> survreg(Surv(ifelse(is.na(x),<yourlimit>,x),
>              !is.na(x),type="left")~<Yourmodel>) to do this on the fly.
> 
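
A minimal sketch of the censored-data analysis on simulated data. The variable names, the detection limit of 1 and the lognormal distribution are assumptions made for illustration only. It uses survreg from the survival package, which accepts type = "left" responses; coxph, as far as I know, does not. Note that the recoding has to be element-wise: observed values are kept, and values below the limit are replaced by the limit and flagged as censored.

```r
library(survival)

set.seed(42)
n     <- 200
z     <- rnorm(n)                                  # a covariate
conc  <- exp(0.5 + 0.8 * z + rnorm(n, sd = 0.5))   # simulated concentrations
limit <- 1                                         # hypothetical detection limit
x     <- ifelse(conc < limit, NA, conc)            # below-limit values are missing

# Element-wise recoding: observed value if present, detection limit otherwise.
value    <- ifelse(is.na(x), limit, x)
observed <- !is.na(x)                              # FALSE marks a left-censored value

# Tobit-type regression on the log scale with left censoring.
fit <- survreg(Surv(value, observed, type = "left") ~ z, dist = "lognormal")
coef(fit)
```

The coefficients are on the log(concentration) scale, which ties in with the change of variable suggested for the imputation approach.
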
> Another possible idea is to split your variable (supposedly x) in two:
> observed (logical), and value (observed value if observed, <detection
> limit> if not), and include both in your model. You will probably
> run into numerical difficulties due to the built-in total separation.
> 
> HTH,
> 
> 					Emmanuel Charpentier
> 
>> Thanks for the help,
>> Todd
>>
>>
> 

Also see

@Article{zha09non,
   author =       {Zhang, Donghui and Fan, Chunpeng and Zhang, Juan and Zhang, {Cun-Hui}},
   title =        {Nonparametric methods for measurements below detection limit},
   journal =      {Stat in Med},
   year =         2009,
   volume =       28,
   pages =        {700-715},
   annote =       {lower limit of detection; left censoring; Tobit model; Gehan test; Peto-Peto test; log-rank test; Wilcoxon test; location shift model; superiority of nonparametric methods}
}



-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



