[R] Diag "Hat" matrix
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Thu Jun 7 18:11:03 CEST 2001
Kenneth Cabrera <krcabrer at perseus.unalmed.edu.co> writes:
> Hi R users:
>
> What is the difference in the computation of the diag of the
> "hat" matrix between "lm.influence" and the matrix operations
> with "solve()" and "t()"?
>
> I mean, this is my X matrix
>
> x1 x2 x3 x4 x5
> [1,] 0.297 0.310 0.290 0.220 0.1560
...
> [16,] 0.378 0.420 0.380 0.281 0.2000
>
> If I use:
> diag(X%*%solve(t(X)%*%X)%*%t(X))
> I obtain:
> [1] 0.15248181 0.27102872 0.11476375 0.12941386 0.90455886 0.32246292
> [7] 0.43858581 0.16533854 0.37415984 0.19100227 0.17023090 0.15125134
> [13] 0.17855019 0.06023773 0.52137996 0.85455350
>
> But when I use the lm.influence() function
> lm.influence(mt)$hat
> I obtain:
> [1] 0.1735989 0.2999146 0.2334095 0.1455117 0.9216644 0.7553856 0.4486403
> [8] 0.2755802 0.4188349 0.1914242 0.1790093 0.1573939 0.1787553 0.1975511
> [15] 0.5664988 0.8568274
> mt is a model of the type y~x1+x2+x3+x4+x5, where y is:
> y
> [1] 17 17 35 69 69 173 173 17 17 73 17 35 69 35 35 52
>
> As you can see, the differences are not small.
> Where is the problem? Is it only a numerical stability problem?
>
> Thank you very much for your help
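The intercept? lm() adds an intercept column to the design matrix by
default, and your X has none. What if you use X2 <- cbind(1,X) in the
matrix formula, or fit y~x1+x2+x3+x4+x5-1 instead?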
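
A minimal sketch of that check, assuming simulated data, since most of the 16 x 5 matrix is elided in the post. The helper name hat_diag and the objects d and mt0 are made up for illustration; only X, y, mt and the two formulas come from the thread. Note that the two vectors quoted above sum to about 5 and about 6, which is what one would expect if the designs have 5 and 6 columns.

```r
# Simulated stand-in for the poster's data (an assumption, not the original values)
set.seed(1)
n <- 16
X <- matrix(runif(n * 5), n, 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- rnorm(n)
d <- data.frame(y = y, X)

# Diagonal of M (M'M)^{-1} M', as in the question (hypothetical helper name)
hat_diag <- function(M) diag(M %*% solve(t(M) %*% M) %*% t(M))

h_noint <- hat_diag(X)            # design without an intercept column
h_int   <- hat_diag(cbind(1, X))  # design with a column of ones prepended

mt  <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = d)      # intercept included by default
mt0 <- lm(y ~ x1 + x2 + x3 + x4 + x5 - 1, data = d)  # intercept removed

# Each lm fit should agree with the matrix formula for the matching design
all.equal(unname(lm.influence(mt)$hat),  h_int)
all.equal(unname(lm.influence(mt0)$hat), h_noint)

# The leverages sum to the number of columns in the design
c(sum(h_noint), sum(h_int))
```

If both comparisons agree on the real data as well, the discrepancy is due to the intercept rather than to numerical instability.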
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907