[R] residual standard error in rlm (MASS package)

Kari Ruohonen kari.ruohonen at utu.fi
Mon Dec 8 07:23:03 CET 2008


Hi,
I would appreciate of someone could explain how the residual standard
error is computed for rlm models (MASS package). Usually, one would
expect to get the residual standard error by

> sqrt(sum((y-fitted(fm))^2)/(n-2))

where y is the response, fm a linear model with an intercept and slope
for x and n the number of observations. This does not seem to work for
rlm models and I am wondering what obvious am I missing here? Here is an
example:

x<-1:100
y <-
c(2.37156056743079, 1.66644749462933, 6.33155723966817,
12.7709430358167, 
11.6666124950273, 19.7839679181322, 15.4923741347280, 18.702397068068, 
18.7599963836891, 16.5916430986993, 16.0653054434192, 25.4517287910774, 
19.9306544701024, 25.3581170063305, 35.6823980984208, 25.8293557856092, 
34.7021243077337, 31.5336533511445, 36.3599764020412, 44.6000402205419, 
41.9899219097128, 45.4564141342995, 43.6061038794823, 48.7566542867736, 
47.5504015095432, 54.8120780105412, 55.2620894365424, 53.223516997263, 
59.5477081631011, 61.2390445046623, 62.3106323086734, 68.1104058608567, 
62.399184797047, 73.9413640517595, 70.6710955288097, 74.5456476513766, 
64.968260562374, 73.2318014155102, 73.7335636549196, 76.9362454490887, 
80.2579421621043, 80.945827481932, 87.7805234941603, 90.0909966936097, 
86.0620664696943, 90.3640690887434, 98.0965832886435, 96.789139334781, 
102.114606626867, 98.3302535449148, 103.107825932103, 109.942412367491, 
106.868253017023, 109.808738425258, 110.136050155862, 108.846488332796, 
118.442973085485, 117.276921857816, 118.640871017018, 119.263784892266, 
123.100214564588, 123.860590728955, 128.712228721465, 131.297848895423, 
123.283516322512, 134.012585073241, 132.665302554315, 138.673423711638, 
143.687124396642, 139.159598404340, 142.012045172451, 146.480644634549, 
145.429104228138, 144.503524323636, 152.348091257061, 149.237135977337, 
159.803973361884, 153.195835890301, 158.921034703569, 163.479578254736, 
159.591944778941, 163.185119145309, 165.890510577093, 164.573471319534, 
173.549321320816, 169.520130741843, 170.439532597426, 174.477604263110, 
178.059609946662, 177.828073866105, 185.005760822296, 184.280998437732, 
196.085419590290, 187.125508176825, 190.524627542992, 196.849299652848, 
197.830377226055, 197.973198490102, 198.59328678419, 199.450725602621
)
# y originally generated with y<-2*x+rnorm(100,0,2)

fm<-lm(y~x)
rm<-rlm(y~x)
fm.r<-sqrt(sum((y-fitted(fm))^2)/(n-2))
rm.r<-sqrt(sum((y-fitted(rm))^2)/(n-2))
print(matrix(c(fm.r,summary(fm)$sigma,rm.r,summary(rm)$sigma),
             ncol=2,byrow=T))

Output of this is:
         [,1]     [,2]
[1,] 1.900033 1.900033
[2,] 1.905847 1.595128

I.e. for the lm model the residual standard error from the summary.lm
method matches exactly sqrt(sum((y-fitted(fm))^2)/(n-2)) but that for
the summary.rlm model is somewhat smaller than
sqrt(sum((y-fitted(rm))^2)/(n-2)). I am curious what causes this
difference?

My sessionInfo()
R version 2.7.1 (2008-06-23) 
x86_64-pc-linux-gnu 

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
base     

other attached packages:
[1] MASS_7.2-44

regards, Kari Ruohonen



More information about the R-help mailing list