[R] residual standard error in rlm (MASS package)
Kari Ruohonen
kari.ruohonen at utu.fi
Mon Dec 8 07:23:03 CET 2008
Hi,
I would appreciate of someone could explain how the residual standard
error is computed for rlm models (MASS package). Usually, one would
expect to get the residual standard error by
> sqrt(sum((y-fitted(fm))^2)/(n-2))
where y is the response, fm a linear model with an intercept and slope
for x and n the number of observations. This does not seem to work for
rlm models and I am wondering what obvious am I missing here? Here is an
example:
x<-1:100
y <-
c(2.37156056743079, 1.66644749462933, 6.33155723966817,
12.7709430358167,
11.6666124950273, 19.7839679181322, 15.4923741347280, 18.702397068068,
18.7599963836891, 16.5916430986993, 16.0653054434192, 25.4517287910774,
19.9306544701024, 25.3581170063305, 35.6823980984208, 25.8293557856092,
34.7021243077337, 31.5336533511445, 36.3599764020412, 44.6000402205419,
41.9899219097128, 45.4564141342995, 43.6061038794823, 48.7566542867736,
47.5504015095432, 54.8120780105412, 55.2620894365424, 53.223516997263,
59.5477081631011, 61.2390445046623, 62.3106323086734, 68.1104058608567,
62.399184797047, 73.9413640517595, 70.6710955288097, 74.5456476513766,
64.968260562374, 73.2318014155102, 73.7335636549196, 76.9362454490887,
80.2579421621043, 80.945827481932, 87.7805234941603, 90.0909966936097,
86.0620664696943, 90.3640690887434, 98.0965832886435, 96.789139334781,
102.114606626867, 98.3302535449148, 103.107825932103, 109.942412367491,
106.868253017023, 109.808738425258, 110.136050155862, 108.846488332796,
118.442973085485, 117.276921857816, 118.640871017018, 119.263784892266,
123.100214564588, 123.860590728955, 128.712228721465, 131.297848895423,
123.283516322512, 134.012585073241, 132.665302554315, 138.673423711638,
143.687124396642, 139.159598404340, 142.012045172451, 146.480644634549,
145.429104228138, 144.503524323636, 152.348091257061, 149.237135977337,
159.803973361884, 153.195835890301, 158.921034703569, 163.479578254736,
159.591944778941, 163.185119145309, 165.890510577093, 164.573471319534,
173.549321320816, 169.520130741843, 170.439532597426, 174.477604263110,
178.059609946662, 177.828073866105, 185.005760822296, 184.280998437732,
196.085419590290, 187.125508176825, 190.524627542992, 196.849299652848,
197.830377226055, 197.973198490102, 198.59328678419, 199.450725602621
)
# y originally generated with y<-2*x+rnorm(100,0,2)
fm<-lm(y~x)
rm<-rlm(y~x)
fm.r<-sqrt(sum((y-fitted(fm))^2)/(n-2))
rm.r<-sqrt(sum((y-fitted(rm))^2)/(n-2))
print(matrix(c(fm.r,summary(fm)$sigma,rm.r,summary(rm)$sigma),
ncol=2,byrow=T))
Output of this is:
[,1] [,2]
[1,] 1.900033 1.900033
[2,] 1.905847 1.595128
I.e. for the lm model the residual standard error from the summary.lm
method matches exactly sqrt(sum((y-fitted(fm))^2)/(n-2)) but that for
the summary.rlm model is somewhat smaller than
sqrt(sum((y-fitted(rm))^2)/(n-2)). I am curious what causes this
difference?
My sessionInfo()
R version 2.7.1 (2008-06-23)
x86_64-pc-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
base
other attached packages:
[1] MASS_7.2-44
regards, Kari Ruohonen
More information about the R-help
mailing list