[R] residual standard error in rlm (MASS package)

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Dec 8 09:31:38 CET 2008


The 'r' in rlm is for 'robust', so it does not compute a residual sum of 
squares (which is not robust), but rather a robust estimate of the scale.
That *is* what the help page ?summary.rlm says:

    sigma: The scale estimate.

   stddev: A scale estimate used for the standard errors.

As to exactly what those scale esimates are, see the references or the 
code, but e.g. stddev is based on downweighting larger residuals.


On Mon, 8 Dec 2008, Kari Ruohonen wrote:

> Hi,
> I would appreciate of someone could explain how the residual standard
> error is computed for rlm models (MASS package). Usually, one would
> expect to get the residual standard error by
>
>> sqrt(sum((y-fitted(fm))^2)/(n-2))
>
> where y is the response, fm a linear model with an intercept and slope
> for x and n the number of observations. This does not seem to work for
> rlm models and I am wondering what obvious am I missing here? Here is an
> example:
>
> x<-1:100
> y <-
> c(2.37156056743079, 1.66644749462933, 6.33155723966817,
> 12.7709430358167,
> 11.6666124950273, 19.7839679181322, 15.4923741347280, 18.702397068068,
> 18.7599963836891, 16.5916430986993, 16.0653054434192, 25.4517287910774,
> 19.9306544701024, 25.3581170063305, 35.6823980984208, 25.8293557856092,
> 34.7021243077337, 31.5336533511445, 36.3599764020412, 44.6000402205419,
> 41.9899219097128, 45.4564141342995, 43.6061038794823, 48.7566542867736,
> 47.5504015095432, 54.8120780105412, 55.2620894365424, 53.223516997263,
> 59.5477081631011, 61.2390445046623, 62.3106323086734, 68.1104058608567,
> 62.399184797047, 73.9413640517595, 70.6710955288097, 74.5456476513766,
> 64.968260562374, 73.2318014155102, 73.7335636549196, 76.9362454490887,
> 80.2579421621043, 80.945827481932, 87.7805234941603, 90.0909966936097,
> 86.0620664696943, 90.3640690887434, 98.0965832886435, 96.789139334781,
> 102.114606626867, 98.3302535449148, 103.107825932103, 109.942412367491,
> 106.868253017023, 109.808738425258, 110.136050155862, 108.846488332796,
> 118.442973085485, 117.276921857816, 118.640871017018, 119.263784892266,
> 123.100214564588, 123.860590728955, 128.712228721465, 131.297848895423,
> 123.283516322512, 134.012585073241, 132.665302554315, 138.673423711638,
> 143.687124396642, 139.159598404340, 142.012045172451, 146.480644634549,
> 145.429104228138, 144.503524323636, 152.348091257061, 149.237135977337,
> 159.803973361884, 153.195835890301, 158.921034703569, 163.479578254736,
> 159.591944778941, 163.185119145309, 165.890510577093, 164.573471319534,
> 173.549321320816, 169.520130741843, 170.439532597426, 174.477604263110,
> 178.059609946662, 177.828073866105, 185.005760822296, 184.280998437732,
> 196.085419590290, 187.125508176825, 190.524627542992, 196.849299652848,
> 197.830377226055, 197.973198490102, 198.59328678419, 199.450725602621
> )
> # y originally generated with y<-2*x+rnorm(100,0,2)
>
> fm<-lm(y~x)
> rm<-rlm(y~x)
> fm.r<-sqrt(sum((y-fitted(fm))^2)/(n-2))
> rm.r<-sqrt(sum((y-fitted(rm))^2)/(n-2))
> print(matrix(c(fm.r,summary(fm)$sigma,rm.r,summary(rm)$sigma),
>             ncol=2,byrow=T))
>
> Output of this is:
>         [,1]     [,2]
> [1,] 1.900033 1.900033
> [2,] 1.905847 1.595128
>
> I.e. for the lm model the residual standard error from the summary.lm
> method matches exactly sqrt(sum((y-fitted(fm))^2)/(n-2)) but that for
> the summary.rlm model is somewhat smaller than
> sqrt(sum((y-fitted(rm))^2)/(n-2)). I am curious what causes this
> difference?
>
> My sessionInfo()
> R version 2.7.1 (2008-06-23)
> x86_64-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods
> base
>
> other attached packages:
> [1] MASS_7.2-44
>
> regards, Kari Ruohonen

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list