[R] conservative robust estimation in (nonlinear) mixed models

Fri Mar 24 02:32:36 CET 2006

	  Bert raised an issue I had overlooked.  Ideally, we would like to be 
able to specify a different "family" for the observations and for each 
random effect, with Student's t and contaminated normal as valid options 
in both places.

	  If I were allowed to specify a family (or a robust family) for either 
observations or for random effects but not both, I think I'd pick the 
observations.  I don't know, but I wonder if misspecification of the 
observation distribution might create more problems with estimation and 
inference than misspecification of the distribution of a random effect. 
  As Bert indicated, there may be identifiability issues here, and the 
choice of a model may depend on one's hypotheses about the situation 
being modeled.
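
	  To make the idea of a per-level "family" concrete, here is a rough sketch in Python (standard library only) of the three log-densities under discussion.  The function names, the FAMILIES table and the default settings (5% contamination, standard deviation ratio of 3) are my own illustrative assumptions, not part of any existing package.

```python
import math

def normal_logpdf(x, sd=1.0):
    """Log-density of N(0, sd^2)."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * (x / sd) ** 2

def student_t_logpdf(x, df, scale=1.0):
    """Log-density of a scaled Student's t with df degrees of freedom."""
    z = x / scale
    return (math.lgamma((df + 1.0) / 2.0) - math.lgamma(df / 2.0)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1.0) / 2.0 * math.log1p(z * z / df))

def contaminated_normal_logpdf(x, p=0.05, ratio=3.0, sd=1.0):
    """Log-density of (1 - p) N(0, sd^2) + p N(0, (ratio * sd)^2), 0 < p < 1."""
    a = math.log1p(-p) + normal_logpdf(x, sd)
    b = math.log(p) + normal_logpdf(x, ratio * sd)
    m = max(a, b)  # log-sum-exp keeps the far tails from underflowing
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Hypothetical lookup table: one entry per "family" that could be attached
# either to the observations or to a random effect.
FAMILIES = {
    "normal": normal_logpdf,
    "t": student_t_logpdf,
    "cnormal": contaminated_normal_logpdf,
}

def total_logpdf(values, family, **params):
    """Sum of log-densities of `values` under the named family."""
    logpdf = FAMILIES[family]
    return sum(logpdf(v, **params) for v in values)
```

	  For example, total_logpdf(resid, "t", df=4) and total_logpdf(resid, "cnormal") could be compared on the same residuals; the identifiability question is which level of the model gets which entry.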

	  spencer graves

Berton Gunter wrote:

> Ok, since Spencer has dived in, I'll go public (I made some prior private
> remarks to David because I didn't think they were worth wasting the list's
> bandwidth on. Heck, they may still not be...)
> 
> My question: isn't the difficult issue which levels of the (co)variance
> hierarchy get longer tailed distributions rather than which distributions
> are used to model long tails? Seems to me that there is an inherent
> identifiability issue here, and even more so with nonlinear models. It's
> easy to construct examples where it all essentially depends on your priors.
> 
> Cheers,
> Bert
> 
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
>   
>  
> 
> 
>>-----Original Message-----
>>From: r-help-bounces at stat.math.ethz.ch 
>>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Spencer Graves
>>Sent: Thursday, March 23, 2006 12:34 PM
>>To: otter at otter-rsch.com
>>Cc: r-help at stat.math.ethz.ch
>>Subject: Re: [R] conservative robust estimation in 
>>(nonlinear) mixed models
>>
>>	  I know of two fairly common models for robust methods.  One is the
>>contaminated normal that you mentioned.  The other is Student's t.  A
>>normal plot of the data or of residuals will often indicate whether the
>>assumption of normality is plausible or not;  when the plot indicates
>>problems, it will often also indicate whether a contaminated normal or
>>Student's t would be better.
>>
>>	  Using Student's t introduces one additional parameter.  A
>>contaminated normal would introduce 2;  however, in many applications,
>>the contamination proportion (or its logit) will be highly
>>correlated with the ratio of the contamination standard deviation to
>>that of the central portion of the distribution.  Thus, it's often
>>wise to fix the ratio of the standard deviations and estimate
>>only the contamination proportion.
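
A minimal sketch of that last suggestion, in Python with the standard library only: the standard deviation ratio is held fixed (3 here, an assumed value) and only the contamination proportion is estimated, on the logit scale, by maximum likelihood.  The function names, the golden-section optimizer and the search bounds are illustrative assumptions, and the central standard deviation is taken as known to keep the example short.

```python
import math

def mixture_loglik(data, p, ratio=3.0, sd=1.0):
    """Log-likelihood of (1 - p) N(0, sd^2) + p N(0, (ratio * sd)^2)."""
    c = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for x in data:
        core = c / sd * math.exp(-0.5 * (x / sd) ** 2)
        wide = c / (ratio * sd) * math.exp(-0.5 * (x / (ratio * sd)) ** 2)
        total += math.log((1.0 - p) * core + p * wide)
    return total

def golden_section_max(f, lo, hi, tol=1e-6):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:                  # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def fit_contamination_proportion(data, ratio=3.0, sd=1.0):
    """Estimate p with the sd ratio fixed, searching over logit(p) in [-10, 10]."""
    def objective(logit_p):
        p = 1.0 / (1.0 + math.exp(-logit_p))
        return mixture_loglik(data, p, ratio, sd)
    logit_hat = golden_section_max(objective, -10.0, 10.0)
    return 1.0 / (1.0 + math.exp(-logit_hat))
```

With the component densities fixed, the log-likelihood is concave in p, so it stays unimodal on the logit scale and a one-dimensional search is enough.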
>>
>>	  hope this helps.
>>	  spencer graves
>>
>>dave fournier wrote:
>>
>>
>>>Conservative robust estimation methods do not appear to be
>>>currently available in the standard mixed model methods for R,
>>>where by conservative robust estimation I mean methods which
>>>work almost as well as the methods based on assumptions of
>>>normality when the assumption of normality *IS* satisfied.
>>>
>>>We are considering adding such a conservative robust estimation option
>>>for the random effects to our AD Model Builder mixed model package,
>>>glmmADMB, for R, and perhaps extending it to do robust estimation for
>>>linear mixed models at the same time.
>>>
>>>An obvious candidate is to assume something like a mixture of
>>>normals. I have tested this in a simple linear mixed model
>>>using 5% contamination with a normal with 3 times the standard
>>>deviation, which seems to be a common assumption. Simulation results
>>>indicate that when the random effects are normally distributed this
>>>estimator is about 3% less efficient, while when the random effects are
>>>contaminated with 5% outliers the estimator is about 23% more efficient,
>>>where by 23% more efficient I mean that one would have to use a sample
>>>size about 23% larger with the normal-based estimator to obtain the
>>>same size confidence limits for the parameters.
>>>
>>>Question:
>>>
>>>I wonder if there are other distributions besides a mixture of normals
>>>which might be preferable. Three things to keep in mind are:
>>>
>>>    1.)  It should be likelihood based so that the standard likelihood
>>>         based tests are applicable.
>>>
>>>    2.)  It should work well when the random effects are normally
>>>         distributed so that things that are already fixed don't get
>>>         broke.
>>>
>>>    3.)  In order to implement the method efficiently it is necessary to
>>>         be able to produce code for calculating the inverse of the
>>>         cumulative distribution function. This enables one to extend
>>>         methods based on the Laplace approximation for the random
>>>         effects (i.e. the Laplace approximation itself, adaptive
>>>         Gaussian integration, adaptive importance sampling) to the new
>>>         distribution.
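
On point 3, the contaminated normal has no closed-form inverse CDF, but its CDF is a weighted sum of two normal CDFs, so the inverse can be found numerically.  The sketch below (Python, standard library only) uses plain bisection; the function names, defaults and tolerance are illustrative assumptions, and a production version would likely want a faster, differentiable root finder.  The last function shows one plausible use, which is my assumption about how the inverse would enter: mapping a standard normal variate onto the new distribution.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF; erfc keeps accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def contaminated_normal_cdf(x, p=0.05, ratio=3.0, sd=1.0):
    """CDF of (1 - p) N(0, sd^2) + p N(0, (ratio * sd)^2)."""
    return ((1.0 - p) * std_normal_cdf(x / sd)
            + p * std_normal_cdf(x / (ratio * sd)))

def contaminated_normal_quantile(u, p=0.05, ratio=3.0, sd=1.0, tol=1e-10):
    """Inverse CDF by bisection on an expanding bracket."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    lo, hi = -sd, sd
    while contaminated_normal_cdf(lo, p, ratio, sd) > u:
        lo *= 2.0
    while contaminated_normal_cdf(hi, p, ratio, sd) < u:
        hi *= 2.0
    for _ in range(200):  # hard cap so the loop always terminates
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if contaminated_normal_cdf(mid, p, ratio, sd) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normal_to_contaminated(z, p=0.05, ratio=3.0, sd=1.0):
    """Map a standard normal variate z to a contaminated normal variate."""
    return contaminated_normal_quantile(std_normal_cdf(z), p, ratio, sd)
```

Setting p=0 recovers the ordinary normal quantile function, which gives a quick sanity check on the code.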
>>>
>>>      Dave
>>>
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide! 
>>http://www.R-project.org/posting-guide.html
>>