[R] regression analysis in R
arun
smartpink111 at yahoo.com
Fri Oct 26 23:47:00 CEST 2012
Hi,
Maybe this helps.
set.seed(8)
mat1<-matrix(sample(150,90,replace=FALSE),ncol=9,nrow=10)
dat1<-data.frame(mat1)
set.seed(10)
B<-sample(150:190,10,replace=FALSE)
res1<-lapply(dat1,function(x) lm(B~as.matrix(x)))
#or, equivalently (the slope row is then labelled "x" instead of "as.matrix(x)")
res1<-lapply(dat1,function(x) lm(B~x))
res1Summary<-lapply(res1,summary)
#to get the coefficients
res1SummaryCoef<-lapply(res1,function(x) summary(x)$coefficients)
res1SummaryCoef[1:3]
#$X1
#                Estimate Std. Error   t value     Pr(>|t|)
#(Intercept)  150.1303702 8.45536736 17.755630 1.035959e-07
#as.matrix(x)   0.2126583 0.09304937  2.285436 5.163141e-02
#
#$X2
#                  Estimate Std. Error     t value     Pr(>|t|)
#(Intercept)  168.219302287  6.9904434 24.06418202 9.479720e-09
#as.matrix(x)  -0.002386046  0.1146838 -0.02080544 9.839104e-01
#
#$X3
#               Estimate Std. Error   t value     Pr(>|t|)
#(Intercept)  180.303999  8.6675156 20.802270 2.990115e-08
#as.matrix(x)  -0.157268  0.1021179 -1.540064 1.621101e-01
#to get the p-value of the F-statistic (the summary is computed once per model)
res1pvalueF<-lapply(res1,function(x) {f<-summary(x)$fstatistic; pf(f[1],f[2],f[3],lower.tail=FALSE)})
#to get r.squared value
res1rSquare<-lapply(res1,function(x) summary(x)$r.squared)
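The per-descriptor results above can also be gathered into a single table, which makes the descriptors easier to compare. The following is only a minimal sketch: it rebuilds the same example data so that it runs on its own, and the names fitStats and ranked are illustrative, not part of the code above. For a model with one predictor, the p-value of the slope's t-test equals the p-value of the F-statistic, so one p-value column is enough here.

```r
# Rebuild the example data so this block is self-contained
set.seed(8)
dat1 <- data.frame(matrix(sample(150, 90, replace = FALSE), ncol = 9, nrow = 10))
set.seed(10)
B <- sample(150:190, 10, replace = FALSE)

# One row per descriptor: slope, p-value of the slope, and R-squared
fitStats <- t(sapply(dat1, function(x) {
  s <- summary(lm(B ~ x))
  c(slope     = s$coefficients[2, 1],
    pvalue    = s$coefficients[2, 4],
    r.squared = s$r.squared)
}))
fitStats <- as.data.frame(fitStats)

# Descriptors ordered from the strongest to the weakest single-variable fit
ranked <- fitStats[order(-fitStats$r.squared), ]
ranked
```

With the 27 descriptors from the question, head(ranked, 10) would list the ten best individual fits.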
#2nd part
#Create some new datasets using random combination of columns from dat1
dat2<-dat1[,sample(names(dat1),4)]
dat3<-dat1[,sample(names(dat1),4)]
dat4<-dat1[,sample(names(dat1),4)]
dat5<-dat1[,sample(names(dat1),4)]
dat6<-dat1[,sample(names(dat1),4)]
head(dat2)
#  X7  X3  X8  X5
#1 85  30 113 100
#2 89  53 115  32
#3 74  79  63  54
#4 57  28  52  94
#5  6  84 135 132
#6  5 123 146 127
head(dat3)
#   X8  X2  X6  X3
#1 113  64  14  30
#2 115  13   7  53
#3  63  60  15  79
#4  52  75  34  28
#5 135  19 107  84
#6 146 126  27 123
#create a list of dataframes
list1<-list(dat2,dat3,dat4,dat5,dat6)
res2<-lapply(list1,function(x) lm(B~as.matrix(x)))
res2rSquare<-lapply(res2,function(x) summary(x)$r.squared)
unlist(res2rSquare)
#[1] 0.8444332 0.6316695 0.6971695 0.7322519 0.4328805
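Instead of drawing a few random sets, it may be simpler to fit every possible set of 4 descriptors and keep the 10 with the highest R-squared. The sketch below is one way to do that with combn(); it rebuilds the example data so that it runs on its own, and the names full, combos, rsq and top are illustrative. With 27 descriptors there are choose(27, 4) = 17550 sets, which lm() should handle in reasonable time.

```r
# Rebuild the example data so this block is self-contained
set.seed(8)
dat1 <- data.frame(matrix(sample(150, 90, replace = FALSE), ncol = 9, nrow = 10))
set.seed(10)
B <- sample(150:190, 10, replace = FALSE)
full <- cbind(B = B, dat1)   # response and descriptors in one data frame

# Every 4-column subset of the descriptors, one subset per column of 'combos'
combos <- combn(names(dat1), 4)

# R-squared of B regressed on each subset
rsq <- apply(combos, 2, function(v) {
  summary(lm(reformulate(v, response = "B"), data = full))$r.squared
})

# The 10 subsets with the highest R-squared
top <- data.frame(descriptors = apply(combos, 2, paste, collapse = " + "),
                  r.squared   = rsq)
top <- head(top[order(-top$r.squared), ], 10)
top
```

Because every model here has the same number of predictors and the same number of rows, ranking by R-squared gives the same order as ranking by adjusted R-squared, AIC or BIC.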
To select the best model from combinations of descriptors, you can also look at stepwise selection (for example with step()), or compare candidate models by their AIC or BIC values.
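As a rough illustration of the AIC/BIC route, here is a sketch of forward selection with step(). It uses simulated data, not the data above: the 100 rows, the 9 descriptors and the assumption that B depends on X2 and X7 are all made up for the example, because the 10-row toy data is too small for a meaningful stepwise search. R-squared never decreases when a predictor is added, so it cannot compare models of different sizes; AIC and BIC penalise model size.

```r
# Simulated data (an assumption for illustration only)
set.seed(1)
n <- 100
X <- data.frame(matrix(rnorm(n * 9), ncol = 9))       # columns X1..X9
B <- 170 + 3 * X$X2 - 2 * X$X7 + rnorm(n)             # B depends on X2 and X7
full <- cbind(B = B, X)

# Start from the intercept-only model and allow any descriptor to enter
nullFit <- lm(B ~ 1, data = full)
upper   <- reformulate(names(X), response = "B")

# Forward selection by AIC (the default penalty, k = 2)
fwdAIC <- step(nullFit, scope = list(lower = ~1, upper = upper),
               direction = "forward", trace = 0)

# Forward selection by BIC (penalty k = log(number of observations))
fwdBIC <- step(nullFit, scope = list(lower = ~1, upper = upper),
               direction = "forward", k = log(n), trace = 0)

formula(fwdAIC)
formula(fwdBIC)
c(AIC = AIC(fwdAIC), BIC = BIC(fwdBIC))
```

BIC penalises extra terms more heavily than AIC once there are more than about 7 observations, so it tends to pick the smaller model.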
A.K.
----- Original Message -----
From: eliza botto <eliza_botto at hotmail.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Cc:
Sent: Friday, October 26, 2012 4:00 PM
Subject: [R] regression analysis in R
Dear useRs,
I have about 27 descriptor vectors, each with 703 elements. What I want to do is the following:
1. Regress a dependent vector, say B, which has the same number of elements, on each of these 27 vectors individually.
2. Find the best 10 regression results when the dependent vector is regressed on random combinations of any 4 descriptors. More precisely, in the first step we regressed the dependent vector on each descriptor individually; now we want R to randomly combine the descriptors in sets of 4, regress B on each set, and show the top 10 combinations of descriptors that give good regression results with B.
I hope I am clear. I know the 2nd part is trickier, but I will be extremely happy if you can answer either of the above questions.
Thanks in advance,
Eliza