[R] Diag "Hat" matrix

Thu Jun 7 18:11:03 CEST 2001

Kenneth Cabrera <krcabrer at perseus.unalmed.edu.co> writes:

> Hi R users:
> 
> What is the difference between the computation of the diagonal of the
> "hat" matrix in "lm.influence" and the same computation done with the
> matrix operations "solve()" and "t()"?
> 
> I mean, this is my X matrix
> 
>         x1    x2    x3    x4     x5
>  [1,] 0.297 0.310 0.290 0.220 0.1560
...
> [16,] 0.378 0.420 0.380 0.281 0.2000
> 
> If I use:
> diag(X%*%solve(t(X)%*%X)%*%t(X))
> I obtain:
>  [1] 0.15248181 0.27102872 0.11476375 0.12941386 0.90455886 0.32246292
>  [7] 0.43858581 0.16533854 0.37415984 0.19100227 0.17023090 0.15125134
> [13] 0.17855019 0.06023773 0.52137996 0.85455350
> 
> But when I use the lm.influence() function
> lm.influence(mt)$hat
> I obtain:
>  [1] 0.1735989 0.2999146 0.2334095 0.1455117 0.9216644 0.7553856 0.4486403
>  [8] 0.2755802 0.4188349 0.1914242 0.1790093 0.1573939 0.1787553 0.1975511
> [15] 0.5664988 0.8568274
> mt is a model of the type y~x1+x2+x3+x4+x5, where y is:
> y
> [1]  17  17  35  69  69 173 173  17  17  73  17  35  69  35  35  52
> 
> As you see, the differences are not small.
> Where is the problem? Is it only a numerical stability problem?
> 
> Thank you very much for your help

The intercept? lm() includes one by default, so mt is fitted on
cbind(1,X) rather than on X. What if you use X2<-cbind(1,X) or
y~x1+x2+x3+x4+x5-1?
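
A minimal sketch of the intercept suggestion above. The original X is
elided in the post, so the data here are simulated and purely
hypothetical; the names dat, h_raw, h_int, h_lm, mt0 and h_lm0 are
illustrative and not from the original message. The posted values are
consistent with the intercept explanation: the first set sums to about 5
and the second to about 6, which matches the trace of a hat matrix built
from 5 and 6 columns respectively.

```r
# Hypothetical data: the real X and its rows are not fully shown in the post.
set.seed(1)
n <- 16
X <- matrix(runif(n * 5), nrow = n,
            dimnames = list(NULL, paste0("x", 1:5)))
y <- rnorm(n)
dat <- data.frame(y = y, X)

# Hat diagonal from the raw X, i.e. without an intercept column.
h_raw <- diag(X %*% solve(t(X) %*% X) %*% t(X))

# lm() adds an intercept by default, so its design matrix is cbind(1, X).
mt <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = dat)
h_lm <- lm.influence(mt)$hat

# Suggestion 1: add the column of ones to the matrix computation.
X2 <- cbind(1, X)
h_int <- diag(X2 %*% solve(t(X2) %*% X2) %*% t(X2))

# Suggestion 2: drop the intercept from the model instead.
mt0 <- lm(y ~ x1 + x2 + x3 + x4 + x5 - 1, data = dat)
h_lm0 <- lm.influence(mt0)$hat

# Each pair should agree up to floating-point tolerance.
print(all.equal(unname(h_lm), h_int))
print(all.equal(unname(h_lm0), h_raw))

# The hat diagonal sums to the number of fitted coefficients.
print(c(with_intercept = sum(h_lm), without_intercept = sum(h_lm0)))
```

Comparing the sums of the two hat diagonals is a quick way to spot this
kind of mismatch before suspecting numerical instability.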

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907