[R] how to obtain lm statistics for multiple subsets

Cecilia Carmo cecilia.carmo at ua.pt
Mon Jun 15 15:26:09 CEST 2009


Hi everyone, my data is in a dataframe similar to the one 
below, but with more firms, more industries, more years, and 
variables that correspond to financial information:

> firm <- c(rep(1,4), rep(2,4), rep(3,4), rep(4,4))
> year <- c(rep(2000:2003, 4))
> industry <- c(rep(10,4), rep(20,4), rep(10,4), rep(30,4))
> X1 <- c(10,14,18,16,20,45,23,54,24,67,98,58,76,34,23,89)
> X2 <- c(11,46,89,36,72,78,55,44,22,78,53,25,12,45,87,23)
> Y <- c(12,45,32,69,87,54,33,22,89,66,35,23,15,54,67,87)
> data <- data.frame(firm, industry, year, Y, X1, X2)
> data
firm industry year  Y X1 X2
   1       10 2000 12 10 11
   1       10 2001 45 14 46
   1       10 2002 32 18 89
   1       10 2003 69 16 36
   2       20 2000 87 20 72
   2       20 2001 54 45 78
   2       20 2002 33 23 55
   2       20 2003 22 54 44
   3       10 2000 89 24 22
   3       10 2001 66 67 78
   3       10 2002 35 98 53
   3       10 2003 23 58 25
   4       30 2000 15 76 12
   4       30 2001 54 34 45
   4       30 2002 67 23 87
   4       30 2003 87 89 23

I need to obtain the coefficients and their statistics by 
year and by industry, for the model Y = b1 + b2*X1 + b3*X2, 
which I pass to the lm function as:
ff <- Y ~ X1 + X2

What I’ve done so far is subset the dataframe by year, so I 
have one dataframe for each year (dataframe2000, 
dataframe2001, ...), and then I applied a function that I 
found in the R-help mails:
coef2000 <- as.data.frame(t(sapply(
  split(dataframe2000, dataframe2000$industry),
  function(x) { coef(lm(ff, data = x)) }
)))
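
For reference, the steps described above can be reproduced end to end with the example data from this post. This is only a sketch: building the per-year dataframe with subset() is an assumption about how the subsets were made, and the formula is written as Y ~ X1 + X2. With just 16 rows, no industry has enough observations within a single year to estimate all three coefficients, so several entries come back as NA.

```r
# Example data from the post
firm     <- c(rep(1, 4), rep(2, 4), rep(3, 4), rep(4, 4))
year     <- rep(2000:2003, 4)
industry <- c(rep(10, 4), rep(20, 4), rep(10, 4), rep(30, 4))
X1 <- c(10, 14, 18, 16, 20, 45, 23, 54, 24, 67, 98, 58, 76, 34, 23, 89)
X2 <- c(11, 46, 89, 36, 72, 78, 55, 44, 22, 78, 53, 25, 12, 45, 87, 23)
Y  <- c(12, 45, 32, 69, 87, 54, 33, 22, 89, 66, 35, 23, 15, 54, 67, 87)
data <- data.frame(firm, industry, year, Y, X1, X2)

# Model formula: intercept plus X1 and X2
ff <- Y ~ X1 + X2

# Assumption: the per-year dataframes were built with subset()
dataframe2000 <- subset(data, year == 2000)

# One row of coefficients per industry within that year
coef2000 <- as.data.frame(t(sapply(
  split(dataframe2000, dataframe2000$industry),
  function(x) coef(lm(ff, data = x))
)))
```

The result has one row per industry (10, 20, 30) and one column per coefficient, which is the layout the two questions below refer to.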

I need help in two ways:
First: I’d like the dataframe of coefficients to include 
more statistical information, to help me judge the 
significance of the coefficients; and
Second: Is it possible to obtain this output for all 
years at once?
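
A minimal sketch of one way both requests might be handled together, not a definitive solution. It assumes simulated data (the `sim` dataframe and its values are hypothetical, not the financial data) so that every year/industry group has enough rows to fit the model. It splits on year and industry at once and keeps the whole coefficient table from summary() instead of only coef().

```r
# Hypothetical simulated data: 10 firms x 3 industries x 4 years,
# so each year/industry group has 10 rows
set.seed(1)
sim <- expand.grid(firm = 1:10, industry = c(10, 20, 30), year = 2000:2003)
sim$X1 <- rnorm(nrow(sim))
sim$X2 <- rnorm(nrow(sim))
sim$Y  <- 1 + 2 * sim$X1 - sim$X2 + rnorm(nrow(sim))

ff <- Y ~ X1 + X2

# Split on both grouping variables at once
groups <- split(sim, list(sim$year, sim$industry), drop = TRUE)

# For each group, keep the estimate, standard error, t value and
# p-value of every coefficient, labelled with year and industry
stats <- do.call(rbind, lapply(names(groups), function(g) {
  tab <- coef(summary(lm(ff, data = groups[[g]])))
  data.frame(
    year     = groups[[g]]$year[1],
    industry = groups[[g]]$industry[1],
    term     = rownames(tab),
    tab,
    row.names   = NULL,
    check.names = FALSE
  )
}))
```

Here `stats` holds one row per coefficient per year/industry group, so all years end up in a single dataframe. Groups with too few observations would produce NA or NaN entries, or fewer rows, so real data may need a check on group size first.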
Thanks in advance,
Cecília Carmo (Universidade de Aveiro – Portugal)



