[R] Robust linear models and unequal variance

Bert Gunter gunter.berton at gene.com
Wed Sep 5 00:23:34 CEST 2007


Let me try a reply, although I wish others wiser than I had responded.

1. How do you know the variances are unequal? 

2. If you somehow know what the variances are (or at least their relative
sizes), you can use the "weights" argument of the functions you mention to
weight inversely proportional to variance (except not for the "MM" method in
rlm(), according to the docs).

3. That "ranked regression" is robust is a myth. It also does not deal with
the unequal variance situation. It is not a panacea for anything. If you
need "robust" regression use robust regression.

4. If group sizes are not too dissimilar, then whether you case weight or
not may not make much difference, especially to estimation (alas, hard to
tell a priori).
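
As a rough illustration of points 1 and 2, here is a minimal sketch on
simulated data. Everything in it is an assumption made for the example: the
names g, x, y, sd_g and w are hypothetical, and the group standard deviations
are treated as known, which in practice they rarely are. It first looks at
the residual spread by group, then weights inversely proportional to variance
in both lm() and rlm() (default "M" method).

```r
library(MASS)  # provides rlm()

set.seed(1)
n <- 60
g <- factor(rep(c("a", "b", "c"), each = n / 3))  # hypothetical grouping factor
x <- runif(n)                                     # hypothetical covariate
sd_g <- c(a = 0.5, b = 1, c = 2)                  # assumed relative SDs per group
sd_i <- unname(sd_g[as.character(g)])             # SD attached to each observation
y <- 1 + 2 * x + rnorm(n, sd = sd_i)

# Point 1: look at the spread of the residuals by group before assuming anything.
fit0 <- rlm(y ~ g + x)
spread <- tapply(resid(fit0), g, mad)
print(spread)

# Point 2: weight inversely proportional to the (assumed known) variance.
w <- 1 / sd_i^2
fit_lm  <- lm(y ~ g + x, weights = w)
fit_rlm <- rlm(y ~ g + x, weights = w, wt.method = "inv.var")  # method = "M"
print(cbind(lm = coef(fit_lm), rlm = coef(fit_rlm)))
```

With well-behaved data like these, the two sets of coefficients should be
close; they would be expected to diverge only once gross outliers are added.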

The fundamental issue is that "outliers" and "unequal variances" must be
operationalized, otherwise they are confounded: "outlier" only has meaning
compared to what is expected from a specified distribution. Outliers are no
longer out when the variance is "large." 
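
A toy calculation makes the confounding concrete. The numbers are invented
for illustration, and the cutoff of 3 is just a common rule of thumb, not
something prescribed here: the same residual is "out" against a small
assumed standard deviation and unremarkable against a large one.

```r
resid_val <- 5   # a hypothetical residual
sd_small  <- 1   # assumed SD if the group variance is small
sd_large  <- 4   # assumed SD if the group variance is large

z_small <- resid_val / sd_small   # 5
z_large <- resid_val / sd_large   # 1.25

# Flag as "outlier" when the standardized residual exceeds 3 in absolute value.
flagged <- abs(c(small = z_small, large = z_large)) > 3
print(flagged)
```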

Also look at glm() with the "quasi" option if you wish to consider fitting a
heterogeneous variance structure to initialize a robust method (which could,
of course, be distorted by your "outliers").
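
One possible reading of that suggestion, sketched on simulated data: fit a
quasi-likelihood model whose variance function describes the assumed
heterogeneity, then pass the implied inverse variances as weights to rlm().
The data, the choice variance = "mu^2" (standard deviation proportional to
the mean), and the names fit_q, w0 and fit_r are all assumptions for this
example, not part of the original advice.

```r
library(MASS)  # provides rlm()

set.seed(2)
x  <- runif(100, 1, 10)                  # hypothetical covariate
mu <- 2 + 3 * x
y  <- mu + rnorm(100, sd = 0.2 * mu)     # SD proportional to the mean

# Quasi-likelihood fit with identity link and variance proportional to mu^2.
fit_q <- glm(y ~ x, family = quasi(link = "identity", variance = "mu^2"))

# Inverse of the fitted variance function, used as prior weights.
w0 <- 1 / fitted(fit_q)^2

# Robust M-estimation with those weights treated as inverse variances.
fit_r <- rlm(y ~ x, weights = w0, wt.method = "inv.var")
print(cbind(quasi = coef(fit_q), rlm = coef(fit_r)))
```

The weights here come from a non-robust first fit, so real outliers could
distort them, as noted above.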


Bert Gunter
Genentech Nonclinical Statistics

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Geertje Van der
Heijden
Sent: Tuesday, September 04, 2007 10:55 AM
To: r-help at stat.math.ethz.ch
Subject: [R] Robust linear models and unequal variance

Hi all,

I have probably a basic question, but I can't seem to find the answer in
the literature or in the R-archives. 

I would like to do a robust ANCOVA (using either rlm or lmRob of the
MASS and robust packages) - my response variable deviates slightly from
normal and I have some "outliers". The data consist of 2 factor
variables and 3-5 covariates (depending on the model). However, the
variance between my groups is not equal and I am not sure if it is
therefore appropriate to use a robust statistical method or if a
non-parametric analysis (i.e. ranked regression) might be better. If I
can still use a robust statistical method, which estimator is best to
use to deal with unequal variance? And if it is better to use a
non-parametric analysis, could anyone put me in the direction of the
right non-parametric method to use (the relationship between my response
variable and the covariates is linear)?

Any help on this would be greatly appreciated!

Many thanks,
Geertje

~~~~
Geertje van der Heijden
PhD student
Tropical Ecology
School of Geography
University of Leeds
Leeds LS2 9JT

Tel: (+44)(0)113 3433345 
Email: g.m.f.vanderheijden04 at leeds.ac.uk


