[R] problem with generalized least square, and lm.gls() bug
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Thu Jun 21 20:06:22 CEST 2001
Naoki Takebayashi <ntakebay at bio.indiana.edu> writes:
> I'm very confused about generalized least square method (GLS), so
> please help me! I'm trying to use GLS with R to estimate an intercept and
> slope; vector, b = (intercept, slope).
>
> My understanding of GLS is like this:
> y = X %*% b + e, where Var(e) is symmetric dispersion matrix
> (covariance-variance matrix); V (which is theoretically known in my
> case). We can decompose V into t(R) %*% R, where R is an upper
> triangular, square-root matrix of V, and t(R) is tranpose of R. Then
> if we call T = Inverse(t(R)), we can multply both side of equation to
> make the problem as a ordinary least square.
>
> T %*% y = T %*% X %*% b + T * e
>
> To find T, "solve(t(chol(V))" should do it. However, when I looked at
> lm.gls() from MASS library, it uses other method of obtaining T. From
> my understanding, when I give V (my dispersion matrix) to the argument
> "W=" of lm.gls, I should also give "inverse = TRUE" (can anyone
> confirm that I should be using this option?). Then it uses the
> following method to obtain T.
>
> eW <- eigen(V, TRUE)
> T <- diag(eW$values ^ (-0.5)) %*% t(eW$vector)
>
> I was checking if these two methods give same answer with a small
> matrix; so I tried this:
>
> > tempSqRt <- matrix(c(7,0,0, 19, 76, 0, 31, 94, 51), c(3,3))
> > temp <- t(tempSqRt) %*% tempSqRt # this is a symmetric dispersion
> matrix
> > temp.e <- eigen(temp, TRUE)
> > temp.T <- diag(temp.e$values ^ (-0.5)) %*% t(temp.e$vector)
> > temp.T
> [,1] [,2] [,3]
> [1,] 0.0001090669 0.0042107823 0.006247486
> [2,] 0.0004108533 -0.0272660081 0.018370016
> [3,] 0.1487442294 0.0003382315 -0.002824703
>
> > solve(t(chol(temp)))
> [,1] [,2] [,3]
> [1,] 0.14285714 -5.569319e-18 3.101497e-17
> [2,] -0.03571429 1.315789e-02 -6.217046e-18
> [3,] -0.02100840 -2.425181e-02 1.960784e-02
>
> So these two methods do NOT give the same answer. I guess I'm
> missunderstanding something. Could someone point out what's wrong?
The T matrix is not uniquely defined. Any matrix for which
crossprod(T) = solve(temp) will do. It doesn't have to be the Choleski
triangular factor. (And since R/S aren't good at using the upper-lower
diagonal matrix properties, any speed advantage of a Choleski-based
method is lost anyway.)
Check:
> solve(temp)
[,1] [,2] [,3]
[1,] 2.212503e-02 3.956691e-05 -0.0004119295
[2,] 3.956691e-05 7.612803e-04 -0.0004755256
[3,] -4.119295e-04 -4.755256e-04 0.0003844675
> crossprod(temp.T)
[,1] [,2] [,3]
[1,] 2.212503e-02 3.956691e-05 -0.0004119295
[2,] 3.956691e-05 7.612803e-04 -0.0004755256
[3,] -4.119295e-04 -4.755256e-04 0.0003844675
> temp.C <- solve(t(chol(temp)))
> crossprod(temp.C)
[,1] [,2] [,3]
[1,] 2.212503e-02 3.956691e-05 -0.0004119295
[2,] 3.956691e-05 7.612803e-04 -0.0004755256
[3,] -4.119295e-04 -4.755256e-04 0.0003844675
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
More information about the R-help
mailing list