[R] Automatic saving of many regression's output

arun smartpink111 at yahoo.com
Thu Nov 28 01:26:36 CET 2013


Hi,

Just tried ncvTest() and durbinWatsonTest() from library(car)


library(car)

# f4() fits the model from objects temporarily placed in the global
# environment, runs ncvTest() on it, then removes those objects again.
f4 <- function(meanmod, dta, varmod) {
  assign(".dta", dta, envir=.GlobalEnv)
  assign(".meanmod", meanmod, envir=.GlobalEnv)
  m1 <- lm(.meanmod, .dta)
  ans <- ncvTest(m1, varmod)
  remove(".dta", envir=.GlobalEnv)
  remove(".meanmod", envir=.GlobalEnv)
  ans
}
lst3 <- lapply(lst1[sapply(lst1,function(x) !(all(rowSums(is.na(x))>0)))],function(x) f4(rate~., x))
lst4 <- lapply(lst1[sapply(lst1,function(x) !(all(rowSums(is.na(x))>0)))],function(x) durbinWatsonTest(lm(rate~., x)))

jarque.bera.test() from library(tseries) is applied to a numeric vector or time series (see ?jarque.bera.test), not to a fitted model.
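
One way to use a Jarque-Bera test with a regression is to pass it the residuals of the fitted model; with tseries installed that would be jarque.bera.test(residuals(m1)). The sketch below computes the textbook Jarque-Bera statistic in base R so that it runs without extra packages. The helper name jb_resid is hypothetical, and the built-in cars data is only a stand-in for one element of lst1.

```r
# Hypothetical helper: textbook Jarque-Bera statistic on model residuals.
jb_resid <- function(fit) {
  e <- residuals(fit)
  n <- length(e)
  d <- e - mean(e)
  m2 <- sum(d^2) / n                      # second central moment
  skew <- (sum(d^3) / n) / m2^1.5         # sample skewness
  kurt <- (sum(d^4) / n) / m2^2           # sample kurtosis
  stat <- n * (skew^2 / 6 + (kurt - 3)^2 / 24)
  c(statistic = stat,
    p.value = pchisq(stat, df = 2, lower.tail = FALSE))
}

fit <- lm(dist ~ speed, data = cars)      # stand-in for lm(rate ~ ., x)
jb <- jb_resid(fit)
print(jb)
```

The same function could be passed to lapply() over the filtered list, in the same way as durbinWatsonTest() above.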

A.K.





On Wednesday, November 27, 2013 6:38 PM, arun <smartpink111 at yahoo.com> wrote:
Hi,

2. You need to tell which package you are using.

3. Does this work for you?
capture.output(lst2,file="nooldor.txt")

4. 

lst2 <- lapply(lst1[sapply(lst1,function(x) !(all(rowSums(is.na(x))>0)))],function(x) print(summary(lm(rate~.,data=x))))  ###prints the output on R console
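
As a small illustration of point 3, the sketch below writes a list of regression summaries to a text file with capture.output() and reads it back. It is only a sketch: lst2_demo is a made-up stand-in for lst2 built from the built-in mtcars data, and a temporary file replaces "nooldor.txt".

```r
# Stand-in for lst2: one regression summary per group of mtcars.
lst2_demo <- lapply(split(mtcars, mtcars$am),
                    function(x) summary(lm(mpg ~ wt, data = x)))

# Assumption: a temporary file; in practice use a path such as "nooldor.txt".
out_file <- tempfile(fileext = ".txt")
capture.output(lst2_demo, file = out_file)

# Read the report back to check what was written.
report <- readLines(out_file)
head(report)
```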

A.K.


Hi,

Thank you for your patience and help :-)

Now the code looks like this:


data<-read.table("reg3-dane.csv", head=T, sep=";", dec=",")
data$indx <- as.numeric(gl(334*123,123,334*123))
lst1 <- split(data[,-16],data[,16]) # 1. by changing the "16" parameter I can add or remove variables (also by modifying the "reg3-dane.csv" file), right?
any(sapply(lst1,nrow)!=123)
#[1] FALSE
lst2 <- lapply(lst1[sapply(lst1,function(x) !(all(rowSums(is.na(x))>0)))],function(x) summary(lm(rate~cap.log+liqamih.log+pbv,data=x)) )
length(lst2)
#[1] 334
# 2. where can I place the tests for each (of the 123) regressions, like jarque.bera.test(), vif(), ncvTest() and durbinWatsonTest(), to have them saved with the regression summary? and 3. how can I make the list of results more user-friendly? I would like to get the report

Is it OK?

Could you help me with the questions in remarks above?

And could you modify the script to also print the summary (and tests) of each regression (each of 123) in console?


Best wishes!
T.S.
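
On question 3 above (a more user-friendly result), one possible approach is to stack the coefficient tables of all regressions into a single data frame and write that to a csv file. This is a sketch under assumptions: mtcars split by cyl stands in for lst1, and the names fits, coef_tab and csv_file are made up for the example.

```r
# Stand-in for the list of fitted models (one per group).
fits <- lapply(split(mtcars, mtcars$cyl),
               function(x) lm(mpg ~ wt, data = x))

# One row per (group, term): estimate, std. error, t value, p value.
coef_tab <- do.call(rbind, lapply(names(fits), function(nm) {
  cf <- coef(summary(fits[[nm]]))
  data.frame(group = nm, term = rownames(cf), cf,
             row.names = NULL, check.names = FALSE)
}))

# Assumption: a temporary csv file; any path would do.
csv_file <- tempfile(fileext = ".csv")
write.csv(coef_tab, csv_file, row.names = FALSE)
print(coef_tab)
```

A table like this opens directly in a spreadsheet, which is often easier to scan than 123 printed summaries.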




On Wednesday, November 27, 2013 5:49 PM, arun <smartpink111 at yahoo.com> wrote:



Hi,

lst1[[1]][,2] <- NA
lst2 <- lapply(lst1,function(x) summary(lm(rate~.,data=x)))
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  0 (non-NA) cases



lst2 <- lapply(lst1[sapply(lst1,function(x) !(all(rowSums(is.na(x))>0)))],function(x) summary(lm(rate~.,data=x)) )
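
The filter above keeps a data frame only if at least one of its rows has no NA. A minimal sketch of the same condition written with complete.cases() follows; the helper name has_complete_row and the two toy data frames are made up for illustration.

```r
# Hypothetical helper: TRUE if at least one row has no missing values.
has_complete_row <- function(x) any(complete.cases(x))

d_ok  <- data.frame(rate = c(1, 2, NA), a = c(3, NA, 5))  # row 1 is complete
d_bad <- data.frame(rate = c(1, 2, 3),  a = NA_real_)     # no complete row
demo <- list(ok = d_ok, bad = d_bad)

# Filter as written in the thread, and the complete.cases() version.
keep_orig <- sapply(demo, function(x) !(all(rowSums(is.na(x)) > 0)))
keep_new  <- sapply(demo, has_complete_row)
print(rbind(keep_orig, keep_new))
```

Note that a group with only a few complete rows passes this filter but may still give NA coefficients, so a stricter threshold on sum(complete.cases(x)) may be worth considering.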
A.K.



Hi,

Thank you for your help. :-)

I applied your script to the data, but I got this error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  0 (non-NA) cases

I forgot to mention that some of the data are NA.

I executed this code:

lst1 <- split(data[,-16],data[,16])
any(sapply(lst1,nrow)!=123)
#[1] FALSE
lst2 <- lapply(lst1,function(x) summary(lm(rate~cap.log+liqamih.log+pbv,data=x))) # here I can set the dependent variables if I want to test different versions of the model (e.g with only e dependent variables), right?
length(lst2)
#[1] 334





On Wednesday, November 27, 2013 5:27 PM, arun <smartpink111 at yahoo.com> wrote:
Hi,
Try:
set.seed(49)
dat1 <- as.data.frame(matrix(sample(c(NA,1:50),41082*15,replace=TRUE),ncol=15))
dat1$indx <- as.numeric(gl(334*123,123,334*123))
names(dat1)[1] <- "rate"
lst1 <- split(dat1[,-16],dat1[,16])
any(sapply(lst1,nrow)!=123)
#[1] FALSE
lst2 <- lapply(lst1,function(x) summary(lm(rate~.,data=x)))
length(lst2)
#[1] 334

A.K.

Hi all! 

I am a complete beginner in R, so please excuse some naive questions. I am learning. 
Here is a description of my problem: 

I have a database (in a single csv file) 
                   characteristic_1    characteristic_2               ...          characteristic_49 
subject_1     |      c1_1_t=1             |   c2_1_t=1             ... |     c49_1_t=1 
subject_2     |      c1_2_t=1             |   c2_2_t=1             ... |     c49_2_t=1 
subject_3     |      c1_3_t=1             |   c2_3_t=1             ... |     c49_3_t=1 
... 
subject_334  |      c1_334_t=1         |   c2_334_t=1          ... |     c49_334_t=1 
subject_1     |      c1_1_t=2            |   c2_1_t=2              ... |     c49_1_t=2 
subject_2     |      c1_2_t=2            |   c2_2_t=2              ... |     c49_2_t=2 
subject_3     |      c1_3_t=2            |   c2_3_t=2              ... |     c49_3_t=2 
... 
subject_334  |      c1_334_t=2         |   c2_334_t=2          ... |     c49_334_t=2 

and so on ... till t (time) = 123 

so I have 334 subjects with 49 characteristics measured in 123 points of time. 

I would like to run 123 regressions (three kinds: lm, rlm and 
lmrob, for comparison), each one on 334 subjects and 49 
dependent variables. After each regression (that is, after each of 
the three fits: lm, rlm and lmrob) I would like to save a 
txt (or csv) file with the results (summary) and some tests* for that 
regression (the regressions can be named reg_1, reg_2 ... reg_123). 

To make things more clear: 
regressions would look like that: 

summary(lm(rate~cap.log+liqamih.log+liqwol.log+pbv.log+mom.log
             +beta.wig+beta.wig.eq 
           +beta.sp 
           +beta.wig.macro 
           +beta.sp.macro 
           +beta.sentim.pl+beta.sentim.pl.ort 
           +beta.sentim.usa+beta.sentim.usa.ort, data=data)) 

The problem is how to run the lm() above over a "rolling window", 
i.e. for the first 334 observations (total observations: 123*334 = 41082), 
then for the next 334, and so on. 
I need to run regression_1 for the first 334 observations, regression_2 
for the next 334 obs (from 335 to 668) and so on till regression_123 (from 
40749 till 41082). 
And each time I run such a regression I would like to save the results (summary and mentioned tests). 
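
A minimal sketch of the block structure described above, assuming the rows are ordered by time so that each consecutive run of 334 rows is one cross-section. The data are simulated and the names dat, x1, block, blocks and fits are made up. gl(123, 334) gives 123 levels of 334 rows each, which differs from the gl(334*123, 123, 334*123) index used in the replies (334 groups of 123 rows).

```r
n_subj <- 334   # observations per regression
n_time <- 123   # number of regressions

# Simulated stand-in for the real data (assumption: one predictor x1).
set.seed(1)
dat <- data.frame(rate = rnorm(n_subj * n_time),
                  x1   = rnorm(n_subj * n_time))

# Block index: 123 levels, each repeated for 334 consecutive rows.
block  <- gl(n_time, n_subj)
blocks <- split(dat, block)

# One lm() per block: regression_1 ... regression_123.
fits <- lapply(blocks, function(x) lm(rate ~ x1, data = x))

range(which(block == 2))      # rows used by regression_2
range(which(block == 123))    # rows used by regression_123
```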

Then I would like to repeat the same procedure but for rlm() and lmrob() if possible. 

I think I can write the "tests" part of the script alone (could you 
write me some comments on where exactly I should put it in the script to have 
the tests automatically repeated and the results saved), but the 'saving' and 
'repeating 123 times' procedures are quite complicated for me, at least 
for now. So here I am asking for help with it. 

In the end I would like to have three txt (or csv) files: 
one containing 123 "summaries" and test results of lm, 
one containing 123 "summaries" and test results of rlm 
and one containing 123 "summaries" and test results of lmrob. 
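
One possible way to get one file per regression type is to loop over a named list of fitting functions and write each set of summaries with capture.output(). This is a sketch under assumptions: mtcars split by am stands in for the 123 blocks, files go to a temporary directory, rlm is added only if the MASS package is available, and lmrob (from robustbase) is left out but could presumably be added to the list in the same way.

```r
# Named list of fitting functions; names become part of the file names.
fitters <- list(lm = lm)
if (requireNamespace("MASS", quietly = TRUE)) fitters$rlm <- MASS::rlm

blocks  <- split(mtcars, mtcars$am)   # stand-in for the 123 blocks
out_dir <- tempdir()                  # assumption: replace with a real folder

files <- sapply(names(fitters), function(nm) {
  path <- file.path(out_dir, paste0("summaries_", nm, ".txt"))
  # Fit every block with the current fitter and keep the summaries.
  res <- lapply(blocks, function(x) summary(fitters[[nm]](mpg ~ wt, data = x)))
  capture.output(res, file = path)
  path
})
print(files)
```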

Could someone help me with this task? 
I am grateful for your help and support. 

________________ 
*like: 
jarque.bera.test() 
vif() 
ncvTest() 
durbinWatsonTest() 

---some of them are not applicable for rlm and lmrob - so in 
this case I would like to have "test NA" in the three output txt (or 
csv) files 
Some of them are also not applicable to cross-sectional regressions 
... but still I would like to keep them in script for later 
modifications
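
For tests that are not applicable to a given model, a minimal sketch is to wrap each test in tryCatch() and return NA on error. Everything here is illustrative: safe_test is a hypothetical helper, shapiro.test() on the residuals is a base-R substitute for the tests listed above, and the second entry fails on purpose to show the NA result. Functions such as function(fit) car::ncvTest(fit) could be placed in the same list.

```r
# Hypothetical helper: run a test on a fit, return NA if the test fails.
safe_test <- function(test_fun, fit) {
  tryCatch(test_fun(fit), error = function(e) NA)
}

tests <- list(
  # Base-R stand-in for a normality test on the residuals.
  shapiro = function(fit) shapiro.test(residuals(fit))$p.value,
  # Deliberately failing entry, mimicking a test with no method for the model.
  not_applicable = function(fit) stop("not defined for this model class")
)

fit <- lm(dist ~ speed, data = cars)   # stand-in for one regression
res <- lapply(tests, safe_test, fit = fit)
print(res)
```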


