[RsR] Questions about interpreting lmRob output

Kjell Konis
Wed Nov 14 17:08:46 CET 2007


The basic idea underlying the robust linear model is that some
fraction (1-alpha > 0.5) of the data is conditionally normally
distributed given the predictors, and the remaining fraction (alpha)
comes from some arbitrary distribution (i.e., the outliers).  The goal
of a robust method is to estimate the parameters (beta and sigma^2) of
this conditional normal distribution without giving the outliers too
much influence.  If the bulk of the data (aka the good data) is not
conditionally normal, then a linear model is not appropriate,
regardless of whether it is fit robustly or not.  Of course you can
still use all of the standard linear modeling tricks.  For instance, a
log transformation of the response sometimes helps with
heteroskedasticity.
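
To make the contamination idea concrete, here is a minimal sketch in
plain Python (standard library only).  Everything in it is an
assumption chosen for illustration: the true parameters, the 20%
outlier fraction, the size of the outlier shift, and the helper names
weighted_line_fit and huber_fit.  A simple Huber M-estimator fit by
iteratively reweighted least squares stands in for the robust fit; it
is not the estimator that lmRob computes, and it only guards against
outliers in the response, not against leverage points.

```python
import math
import random
import statistics

def weighted_line_fit(x, y, w):
    """Weighted least squares for y = b0 + b1*x; returns (b0, b1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def huber_fit(x, y, k=1.345, iterations=50):
    """Huber M-estimate via IRLS with a MAD scale; returns (b0, b1, scale)."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        b0, b1 = weighted_line_fit(x, y, w)
        r = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        med = statistics.median(r)
        # MAD rescaled to be consistent for sigma under normality
        scale = statistics.median([abs(ri - med) for ri in r]) / 0.6745
        # Huber weights: 1 for small residuals, downweighted beyond k*scale
        w = [1.0 if abs(ri) <= k * scale else k * scale / abs(ri) for ri in r]
    return b0, b1, scale

# Hypothetical setup: the good data follow y = 2 + 3x + N(0, 1).
random.seed(1)
n, beta0, beta1, sigma = 200, 2.0, 3.0, 1.0
x = [10.0 * i / (n - 1) for i in range(n)]
y = []
for i, xi in enumerate(x):
    if i % 5 == 0:
        # every fifth point (alpha = 0.2) is an outlier shifted far upward
        y.append(beta0 + beta1 * xi + random.uniform(45.0, 55.0))
    else:
        y.append(beta0 + beta1 * xi + random.gauss(0.0, sigma))

# Ordinary least squares is the same fit with all weights equal to 1.
ols_b0, ols_b1 = weighted_line_fit(x, y, [1.0] * n)
ols_res = [yi - ols_b0 - ols_b1 * xi for xi, yi in zip(x, y)]
ols_sigma = math.sqrt(sum(ri ** 2 for ri in ols_res) / (n - 2))

rob_b0, rob_b1, rob_scale = huber_fit(x, y)

print("true   :", beta0, beta1, sigma)
print("LS     :", round(ols_b0, 2), round(ols_b1, 2), round(ols_sigma, 2))
print("robust :", round(rob_b0, 2), round(rob_b1, 2), round(rob_scale, 2))
```

In this setup the least-squares intercept and residual scale are
pulled toward the outliers, while the robust estimates stay much
closer to the parameters of the good data.  The robust fit is still
not exact: with 20% of the points contaminated in one direction, some
bias in the intercept and scale remains.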

Kjell

On 14 Nov 2007, at 15:24, Jenifer Larson-Hall wrote:

> Thanks so much Kjell. Your response answers most of my questions.
> Actually, I figured out the overlaid plots (and the cool
> fit.models function) by looking through the archives and finding
> your pdf presentation that showed it
> (www.stats.ox.ac.uk/~konis/robust/ROBCLA2006-konis.pdf).
> That was very helpful!
>
> The documentation you sent me privately (Robust.pdf, documentation  
> for S-PLUS library) was helpful in clearing up a few more lingering  
> questions (I guess if others want it they can email you).

The Robust Library Users Guide (Robust.pdf) is included in the source  
version of the Robust Library.

> Just one more question now:
>
> My sense of robust methods was that they returned values which did
> not rely on strict normality and homogeneity of variance assumptions.
> In the data set I gave in my previous email, there is
> heteroskedasticity and a non-normal distribution of the data. So from
> what I understand from my reading, robust methods will give me a
> better sense of what's going on in the bulk of my data than
> least-squares estimates. If this is true, then what is the reason for
> looking at diagnostic plots? If I find the data is still
> heteroskedastic and non-normal in the plots after the robust
> analysis, is this cause for worry?
>
>
> Dr. Jenifer Larson-Hall
> Assistant Professor of Linguistics
> University of North Texas
> (940)369-8950