[R] variable selection in linear regression
Syaiba Balqish
syaibabalqish at gmail.com
Tue Jun 7 10:40:01 CEST 2011
Hello,
I hope you are well. I would like to ask about some commands in R
for variable selection in linear regression.
R has a built-in function called "step", which
selects variables according to AIC.
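For reference, a minimal sketch of how step() is typically called for forward selection by AIC. The simulated mydata (its size, coefficients and seed) is an assumption for illustration only and is not the data from this post.

```r
# Hypothetical stand-in for mydata: y in column 1, then x1..x4
set.seed(1)
n <- 50
mydata <- data.frame(y = 0, x1 = rnorm(n), x2 = rnorm(n),
                     x3 = rnorm(n), x4 = rnorm(n))
mydata$y <- 1 + 0.5 * mydata$x1 + 2 * mydata$x4 + rnorm(n)

fit0 <- lm(y ~ 1, data = mydata)          # intercept-only starting model
fitAIC <- step(fit0,
               scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3 + x4),
               direction = "forward", trace = 0)
formula(fitAIC)                           # terms that AIC chose to add
```

Note that step() ranks candidates by AIC, not by the partial F statistic, so it may not reproduce an F-based selection exactly.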
Let's say I have data [y, x1, x2, x3, x4].
We start with the model y ~ b0.
I compute the partial F test for each candidate and choose the variable
with the maximum partial F to enter the model, say
x4, with a maximum partial F of 58.02377.
Therefore, our next model is y ~ b0 + b4*x4.
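The first step described above could be sketched with add1(), which reports a partial F statistic for each candidate term. The simulated mydata is a hypothetical stand-in, so the F values it produces will not match the 58.02377 quoted above.

```r
# Hypothetical stand-in for mydata: y in column 1, then x1..x4
set.seed(1)
n <- 50
mydata <- data.frame(y = 0, x1 = rnorm(n), x2 = rnorm(n),
                     x3 = rnorm(n), x4 = rnorm(n))
mydata$y <- 1 + 0.5 * mydata$x1 + 2 * mydata$x4 + rnorm(n)

fit0 <- lm(y ~ 1, data = mydata)          # start: y ~ b0
step1 <- add1(fit0, scope = ~ x1 + x2 + x3 + x4, test = "F")
print(step1)

# Row 1 of the table is "<none>", so drop it before looking for the maximum
Fvals <- step1[["F value"]][-1]
names(Fvals) <- rownames(step1)[-1]
best <- names(which.max(Fvals))           # candidate with the largest partial F
best
```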
My questions:
1. How should I write the code so that x4 is added to the model in the next step?
2. The formula for the partial F test is
F* = ((SSE(reduced model) - SSE(full model)) / (dfR - dfF)) / (SSE(full model) / dfF)
which can be written more simply as
F* = MSR(xi | x1,x2,...,xi-1,xi+1) / MSE(x1,x2,...,xi-1,xi,xi+1)
If I would like to use the simplified formula, how can I compute it
for every xi (not yet in the model) that is a candidate for selection, conditional
on the other x's (already in the model)?
For example, say I want to choose among the other variables (x1, x2, x3) after x4 has been
selected. For x3 this would be
F* = MSR(x3 | x4) / MSE(x3, x4)
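Both questions above can be illustrated with a short sketch: update() adds a term to the current model, and the partial F for x3 given x4 follows from the SSEs of the two nested fits. The simulated mydata is a hypothetical stand-in for the real data, and numeric predictors (one degree of freedom each) are assumed.

```r
# Hypothetical stand-in for mydata: y in column 1, then x1..x4
set.seed(2)
n <- 50
mydata <- data.frame(y = 0, x1 = rnorm(n), x2 = rnorm(n),
                     x3 = rnorm(n), x4 = rnorm(n))
mydata$y <- 1 + 0.5 * mydata$x3 + 2 * mydata$x4 + rnorm(n)

fit4  <- lm(y ~ x4, data = mydata)        # reduced model: x4 already entered
fit34 <- update(fit4, . ~ . + x3)         # full model: x3 added to it

sseR <- deviance(fit4);  dfR <- df.residual(fit4)    # SSE(x4),     n - 2
sseF <- deviance(fit34); dfF <- df.residual(fit34)   # SSE(x3, x4), n - 3

MSR_x3_given_x4 <- (sseR - sseF) / (dfR - dfF)       # MSR(x3 | x4)
MSE_x3_x4       <- sseF / dfF                        # MSE(x3, x4)
F_x3 <- MSR_x3_given_x4 / MSE_x3_x4

F_x3
anova(fit4, fit34)$F[2]                   # the same statistic from anova()

# Partial F for every remaining candidate, each conditional on x4
add1(fit4, scope = ~ x1 + x2 + x3 + x4, test = "F")
```

The add1() table gives one row per variable not yet in the model, so the maximum of its "F value" column identifies the next variable to enter.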
Below, I attach my simple code:
library(MASS)                       # provides ginv()
p <- dim(mydata)[2]                 # number of columns: y plus d predictors
d <- p-1
n <- dim(mydata)[1]
x <- as.matrix(mydata[,2:p])
y <- as.matrix(mydata[,1])
# reduced model: intercept only (y ~ b0)
X <- as.matrix(rep(1,n))
b <- lm(y~1,data=mydata)$coefficients
yhat <- X%*%b
res <- y-yhat
sigma.hat <- sqrt(sum(res^2)/(n-ncol(X)))
cv <- sigma.hat^2*ginv(t(X)%*%X)
se <- sqrt(diag(cv))
# the reduced-model quantities do not depend on j, so compute them once
sseR <- sum(res^2)
dfR <- n-1
dfF <- n-2
pc <- matrix(0,nrow=1,ncol=d)
resF <- matrix(0, nrow=n, ncol=d)
pf <- matrix(0, nrow=1, ncol=d)
for(j in 1:d){
  pc[,j] <- cor(x=(x[,j]), y=(mydata[,1]))
  resF[,j] <- lsfit(x[,j], y)$residuals
  sseF <- sum(resF[,j]^2)           # SSE of the model y ~ b0 + bj*xj
  pf[,j] <- ((sseR-sseF)/(dfR-dfF))/(sseF/dfF)
}
max.pf <- max(pf)
max.pc <- max(pc)
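A minimal sketch of how the loop above might be extended to further steps, refitting with lm() instead of the matrix code. The function name forward_F, the alpha = 0.05 entry threshold and the simulated mydata are all assumptions introduced for illustration; they do not come from the code above.

```r
# Hypothetical helper: forward selection by partial F, one variable per step
forward_F <- function(data, response, alpha = 0.05) {
  candidates <- setdiff(names(data), response)
  selected <- character(0)
  fit <- lm(reformulate("1", response), data = data)   # y ~ b0
  while (length(candidates) > 0) {
    # partial F of each candidate, conditional on the variables already selected
    Fs <- sapply(candidates, function(v) {
      full <- lm(reformulate(c(selected, v), response), data = data)
      ((deviance(fit) - deviance(full)) / (df.residual(fit) - df.residual(full))) /
        (deviance(full) / df.residual(full))
    })
    best <- names(which.max(Fs))
    # assumes numeric predictors, so each candidate costs 1 degree of freedom
    pval <- stats::pf(max(Fs), 1, df.residual(fit) - 1, lower.tail = FALSE)
    if (pval > alpha) break                  # assumed stopping rule
    selected <- c(selected, best)
    candidates <- setdiff(candidates, best)
    fit <- lm(reformulate(selected, response), data = data)
  }
  list(selected = selected, fit = fit)
}

# Hypothetical stand-in for mydata: y in column 1, then x1..x4
set.seed(3)
n <- 50
mydata <- data.frame(y = 0, x1 = rnorm(n), x2 = rnorm(n),
                     x3 = rnorm(n), x4 = rnorm(n))
mydata$y <- 1 + 0.5 * mydata$x1 + 2 * mydata$x4 + rnorm(n)

res <- forward_F(mydata, "y")
res$selected                                 # variables in order of entry
```

Stopping on a p-value is only one possible rule; a fixed F-to-enter cutoff could replace the pval check without changing the rest of the loop.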
Thank you, and I look forward to hearing your replies.
Sincerely,
Iba
Universiti Putra Malaysia
--
View this message in context: http://r.789695.n4.nabble.com/variable-selection-in-linear-regression-tp3578967p3578967.html
Sent from the R help mailing list archive at Nabble.com.