[R] Regression Column names instead of numbers
arun
smartpink111 at yahoo.com
Fri Aug 2 19:18:54 CEST 2013
You could try:
set.seed(25)
mt1<- matrix(sample(c(NA,1:40),20*200,replace=TRUE),ncol=200)
colnames(mt1)<- paste0("X",1:200)
set.seed(487)
mt2<- matrix(sample(c(NA,1:80),20*200,replace=TRUE),ncol=200)
colnames(mt2)<- colnames(mt1)
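res<- lapply(colnames(mt1), function(x) {
  # pair column x of mt1 with column x of mt2
  x1<- data.frame(mt1[,x], mt2[,x])
  # name the two columns, e.g. "mt1X1" and "mt2X1"
  colnames(x1)<- paste0(c("mt1","mt2"), x)
  # regress the mt1 column on the mt2 column, using the names in the formula
  summary(lm(as.formula(paste(colnames(x1)[1],"~",colnames(x1)[2],sep="")),data=x1))
})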
res
[[1]]
Call:
lm(formula = as.formula(paste(colnames(x1)[1], "~", colnames(x1)[2],
sep = "")), data = x1)
Residuals:
    Min      1Q  Median      3Q     Max
-16.799  -8.821  -1.059   8.414  19.544
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  14.7292     6.2952    2.34    0.031 *
mt2X1         0.1302     0.1342    0.97    0.345
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 11.53 on 18 degrees of freedom
Multiple R-squared: 0.04967, Adjusted R-squared: -0.003127
F-statistic: 0.9408 on 1 and 18 DF, p-value: 0.3449
[[2]]
Call:
lm(formula = as.formula(paste(colnames(x1)[1], "~", colnames(x1)[2],
sep = "")), data = x1)
Residuals:
    Min      1Q  Median      3Q     Max
-17.641  -6.809  -2.255   5.235  19.684
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.7745     5.1715   0.730   0.4754
mt2X2         0.2635     0.1155   2.283   0.0356 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 10.15 on 17 degrees of freedom
(1 observation deleted due to missingness)
Multiple R-squared: 0.2346, Adjusted R-squared: 0.1896
F-statistic: 5.21 on 1 and 17 DF, p-value: 0.0356
A.K.
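Note that in the output above the coefficient is labelled mt2X1, but the Call line still shows the as.formula(paste(...)) expression, because lm() records its call unevaluated. If the column names should show up in the Call line as well, one possible approach (a sketch only, not part of the reply above) is to build the formula first and splice it into the call with bquote(). The matrices `cities` and `stocks`, their column names, and the random data are made-up stand-ins for the poster's matrix1 and matrix2; the sketch also assumes the column names are syntactically valid R names.

```r
# Hypothetical stand-in data: 20 observations of 3 city series and 3 stock series.
set.seed(1)
cities <- matrix(rnorm(60), ncol = 3,
                 dimnames = list(NULL, c("Zurich", "Geneva", "Basel")))
stocks <- matrix(rnorm(60), ncol = 3,
                 dimnames = list(NULL, c("UBS", "Nestle", "Roche")))

fits <- lapply(seq_len(ncol(cities)), function(i) {
  # Two-column data frame named after the i-th city and the i-th stock
  d <- data.frame(cities[, i], stocks[, i])
  names(d) <- c(colnames(cities)[i], colnames(stocks)[i])
  # Build the formula from the names, e.g. Zurich ~ UBS
  fml <- as.formula(paste(names(d)[1], "~", names(d)[2]))
  # Splice the formula object into the call, so the stored call
  # contains the formula itself rather than the code that built it
  eval(bquote(lm(.(fml), data = d)))
})

# Label each list element with its city/stock pair instead of [[1]], [[2]], ...
names(fits) <- paste(colnames(cities), colnames(stocks), sep = " ~ ")

res <- lapply(fits, summary)
```

With this, printing res[[1]] should give a Call line along the lines of lm(formula = Zurich ~ UBS, data = d), and the list elements are labelled by the city/stock pair. Missing values are still dropped per regression by lm()'s default NA handling.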
----- Original Message -----
From: TMiller <thomas.mueller at student.unisg.ch>
To: r-help at r-project.org
Cc:
Sent: Friday, August 2, 2013 11:16 AM
Subject: [R] Regression Column names instead of numbers
Hi guys
I am new to R and I am currently trying to do a regression:
I have two matrices with 200 time series each.
In order to achieve a loop, I used the following command:
sapply(1:200, function(x) summary(lm(formula=matrix1[,x]~matrix2[,x])))
Each column (time series) has a unique name: the columns of matrix1 are
200 cities, and the columns of matrix2 are 200 stocks.
However, if I run the command I get the following result:
[[1]]
Call:
lm(formula = matrix1[, x] ~ matrix2[, x])
Residuals:
   Min     1Q Median     3Q    Max
-134.9  -68.6  -32.8   33.2  261.2
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  525.2356    69.8059    7.52  9.1e-10 ***
matrix2[, x]   0.0640     0.0161    3.98  0.00023 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 113 on 50 degrees of freedom
(41 observations deleted due to missingness)
Multiple R-squared: 0.24, Adjusted R-squared: 0.225
F-statistic: 15.8 on 1 and 50 DF, p-value: 0.000226
[[2]]
Call:
lm(formula = matrix1[, x] ~ matrix2[, x])
Residuals:
   Min     1Q Median     3Q    Max
-914.9 -393.3  -76.9  243.3 1304.7
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.33e+03   1.88e+02   17.70  < 2e-16 ***
matrix2[, x] 4.10e-01   7.87e-02    5.21  3.4e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 531 on 51 degrees of freedom
(40 observations deleted due to missingness)
Multiple R-squared: 0.348, Adjusted R-squared: 0.335
F-statistic: 27.2 on 1 and 51 DF, p-value: 3.4e-06
Instead of the x's (matrix1[, x] and matrix2[, x]) in the call output, I'd
like to see the column names (city and stock).
Is this by any means possible?
Thanks in advance for your help.
Best
Tom