[R] regression analyses using a vector of means anda variance-covariance matrix
John Fox
jfox at mcmaster.ca
Sat Oct 14 23:15:06 CEST 2006
Dear John,
I'm not aware of an existing function that will do what you want
conveniently, but it's not hard to write one:
reg <- function(V, means, y, x, n){
# V: covariance matrix of variables
# means: vector of means
# y: index of response in V
# x: indices of regressors in V
SSP <- (n - 1)*V + n*outer(means, means)
SSP <- cbind(n*means, SSP)
SSP <- rbind(c(n, n*means), SSP)
XtXinv <- solve(SSP[c(1,x+1), c(1,x+1)])
b <- as.vector(XtXinv %*% SSP[c(1,x+1), y+1])
R2 <- as.vector(V[y,x] %*% solve(V[x,x]) %*% V[x,y])/V[y,y]
s2 <- (1 - R2)*V[y,y]*(n - 1)/(n - length(x) - 1)
Vb <- s2*XtXinv
list(coef=b, vcov=Vb, R2=R2, s2=s2, SSP=SSP)
}
For example,
> library(car)
> data <- Duncan[,c("prestige", "income", "education")]
> reg(var(data), colMeans(data), 1, 2:3, nrow(data))
$coef
[1] -6.0646629 0.5987328 0.5458339
$vcov
income education
18.2494814 -0.15184501 -0.150706025
income -0.1518450 0.01432027 -0.008518551
education -0.1507060 -0.00851855 0.009653582
$R2
[1] 0.8281734
$s2
[1] 178.7309
$SSP
prestige income education
45 2146 1884 2365
prestige 2146 146028 118229 147936
income 1884 118229 105148 122197
education 2365 147936 122197 163265
> summary(lm(prestige ~ income + education, data=Duncan)) # check
Call:
lm(formula = prestige ~ income + education, data = Duncan)
Residuals:
Min 1Q Median 3Q Max
-29.5380 -6.4174 0.6546 6.6051 34.6412
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.06466 4.27194 -1.420 0.163
income 0.59873 0.11967 5.003 1.05e-05 ***
education 0.54583 0.09825 5.555 1.73e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 13.37 on 42 degrees of freedom
Multiple R-Squared: 0.8282, Adjusted R-squared: 0.82
F-statistic: 101.2 on 2 and 42 DF, p-value: < 2.2e-16
Some caveats: This function will work OK for well conditioned data, since
I'm inverting the matrix of sums of squares and products; one could take a
more sophisticated approach. As well, there's a bit of redundancy in the
computation of R2, but I didn't have time to figure out how to avoid it.
Finally, there's an obvious advantage to computing on the original data --
e.g., one can get residuals.
I hope this helps,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of John Sorkin
> Sent: Saturday, October 14, 2006 3:27 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] regression analyses using a vector of means anda
> variance-covariance matrix
>
> R 2.2.0
> windows XP
>
> How can I perform a regression analyses using a vector of
> means, a variance-covariance matrix? I looked at the help
> screen for lm and did not see any option for using the afore
> mentioned structures as input to lm.
> Thanks,
> John
>
> John Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> Baltimore VA Medical Center GRECC,
> University of Maryland School of Medicine Claude D. Pepper
> OAIC, University of Maryland Clinical Nutrition Research
> Unit, and Baltimore VA Center Stroke of Excellence
>
> University of Maryland School of Medicine Division of
> Gerontology Baltimore VA Medical Center 10 North Greene
> Street GRECC (BT/18/GR) Baltimore, MD 21201-1524
>
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to
> faxing) jsorkin at grecc.umaryland.edu
>
> Confidentiality Statement:
> This email message, including any attachments, is for the\...{{dropped}}
More information about the R-help
mailing list