[R] Matrix algebra in R to compute coefficients of a linear regression.

Sat Feb 18 14:59:32 CET 2012

Mark
Thank you!
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Mark Leeds <markleeds2 at gmail.com> 2/18/2012 8:55 AM >>>
Hi John: I don't understand what you're doing (I'm not saying that it's
wrong; I just don't follow it). Below is code for computing the coefficients
using the matrix approach I follow. Others may understand what you're doing
and be able to fix it, so I wouldn't just use the code below immediately.

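# Inverse of X'X, where X is the "int" and "x" columns of data.
xprimex <- solve(t(data[,1:2]) %*% data[,1:2])
# X'y, where y is the third column of data.
xprimey <- t(data[,1:2]) %*% data[,3]

# Least-squares coefficients: (X'X)^-1 X'y.
betas <- xprimex %*% xprimey
print(betas)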
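
For reference, here is a minimal end-to-end sketch of the same calculation. It assumes the data are built as in the original post (columns int, x and y); the seed and the names X, y_vec and beta_hat are illustrative choices, not part of either message. It solves the normal equations (X'X) b = X'y directly rather than forming the inverse, then compares the result with lm().

```r
# Rebuild the data from the original post. The seed is an assumption,
# added only so that the run is reproducible.
set.seed(1)
int  <- rep(1, 100)
x    <- 1:100
y    <- 2 * x + rnorm(100)
data <- cbind(int = int, x = x, y = y)

# Design matrix (intercept and x only) and response vector.
X     <- data[, c("int", "x")]
y_vec <- data[, "y"]

# Solve (X'X) b = X'y; crossprod(X) is t(X) %*% X.
beta_hat <- solve(crossprod(X), crossprod(X, y_vec))
print(beta_hat)

# Coefficients from lm() for comparison; the two should agree
# up to numerical tolerance.
fit0 <- lm(y ~ x)
print(coef(fit0))
print(all.equal(as.vector(beta_hat), unname(coef(fit0))))
```

Note that only the int and x columns go into X. Including the y column, as t(data) %*% data does, gives a 3 x 3 matrix rather than the two coefficients.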

On Sat, Feb 18, 2012 at 8:36 AM, John Sorkin <JSorkin at grecc.umaryland.edu> wrote:

> I am trying to use matrix algebra to get the beta coefficients from a
> simple bivariate linear regression, y=f(x).
> The coefficients should be computable using the following matrix algebra:
> t(X)Y / t(X)X
>
> I have pasted the code I wrote below. It clearly does not work, both because
> it returns a matrix rather than a vector containing two elements (the beta
> for the intercept and the beta for x), and because the values produced by
> the matrix algebra are not the same as those returned by the linear
> regression. Can someone tell me where I have gone wrong, either in my use
> of matrix algebra in R, or perhaps at a more fundamental theoretical level?
> Thanks,
> John
>
> # Define intercept, x and y.
> int <- rep(1,100)
> x   <- 1:100
> y   <- 2*x + rnorm(100)
>
> # Create a matrix to hold values.
> data           <- matrix(nrow=100,ncol=3)
> dimnames(data) <- list(NULL,c("int","x","y"))
> data[,"int"] <- int
> data[,"x"]   <- x
> data[,"y"]   <- y
> data
>
> # Compute numerator.
> num <-  cov(data)
> num
>
> # Compute denominator
> denom <- solve(t(data) %*% data)
> denom
>
> # Compute betas, [t(X)Y]/[t(X)X]
> betaRon <-    num %*% denom
> betaRon
>
> # Get betas from regression so we can check
> # values obtained by matrix algebra.
> fit0 <- lm(y~x)