[R] Manual two-way demeaning of unbalanced panel data (Wansbeek/Kapteyn transformation)

Philipp Grueber philipp.grueber at ebs.edu
Fri Jan 11 03:20:11 CET 2013


Dear R users,

I wish to manually demean a panel over time and entities. I tried to code
the Wansbeek and Kapteyn (1989) transformation (from Baltagi's book Ch. 9).

As a benchmark I use both the pmodel.response() and model.matrix() functions
in package plm and the results from using dummy variables. As far as I
understood the transformation (Ch.3), Q%*%y (with y being the dependent
variable) should yield the demeaned series.

However, ...

		...I find that the results do not match, if I do so.
		...that if I am looking at a balanced panel, I get the correct results
when multiplying Q with the already demeaned series y_it, Q%*%y_it.
		...that if I am looking at an unbalanced panel, I receive results which
differ (even though being close). 

I guess I am missing something. Every comment pointing me to the right
solution would be of great value to me. Also, comments on how to increase
the efficiency of my function would help!

Please find an example based on the Grunfeld data below.

Thank you very much!
Philipp Grueber



##########################
library(MASS)
getQ<-function(object,t.index,i.index){

	t.ind<-object[,t.index]
	i.ind<-object[,i.index]
	nam<-unique(i.ind)
	tim<-unique(t.ind)
	n<-nrow(object)
	N<-length(nam)
	T<-length(tim)
	I<-matrix(0,n,n)
	I[row(I)==col(I)] <- 1
	I_N<-matrix(0,N,N)
	I_N[row(I_N)==col(I_N)] <- 1
	I_T<-matrix(0,T,T)
	I_T[row(I_T)==col(I_T)] <- 1

	D1<-data.frame()
	for(t in tim){
		D1<-rbind(D1,I_N)
	}
	D1<-data.matrix(D1[as.vector(table(i.ind,t.ind))==1,])

	D2<-data.frame()
	for(i in nam){
		D2<-rbind(D2,I_T)
	}
	D2<-data.matrix(D2[as.vector(table(t.ind,i.ind))==1,])
	D<-data.matrix(cbind(D1,D2))

Q<-I-D%*%ginv(crossprod(D))%*%t(D)#fQ(D1)-fQ(D1)%*%D2%*%ginv(t(D2)%*%fQ(D1)%*%D2)%*%t(D2)%*%fQ(D1)
	Q
}
##############################

  
library(plm)
library(lmtest)
data(Grunfeld)


y_i<-Grunfeld$inv-ave(Grunfeld$inv,index=Grunfeld$firm)
y_t<-Grunfeld$inv-ave(Grunfeld$inv,index=Grunfeld$year)
y_it<-(Grunfeld$inv-ave(Grunfeld$inv,index=Grunfeld$firm)-ave(Grunfeld$inv,index=Grunfeld$year)+rep(mean(Grunfeld$inv),length(Grunfeld$inv)))
x1_it<-(Grunfeld$value-ave(Grunfeld$value,index=Grunfeld$firm)-ave(Grunfeld$value,index=Grunfeld$year)+rep(mean(Grunfeld$value),length(Grunfeld$inv)))

dem_y_i<-pmodel.response(plm(formula=inv~value,data=Grunfeld,index=c("firm","year"),model="within",effect="individual"))
dem_y_t<-pmodel.response(plm(formula=inv~value,data=Grunfeld,index=c("firm","year"),model="within",effect="time"))
dem_y_it<-pmodel.response(plm(formula=inv~value,data=Grunfeld,index=c("firm","year"),model="within",effect="twoways"))
dem_X_it<-model.matrix(plm(formula=inv~value,data=Grunfeld,index=c("firm","year"),model="within",effect="twoways"))

sum(y_i!=dem_y_i)
#y_i[1:10]
#dem_y_i[1:10]

sum(y_t!=dem_y_t)
#y_t[1:10]
#dem_y_t[1:10]

sum(y_it!=dem_y_it)
#y_it[1:10]
#dem_y_it[1:10]
sum(x1_it!=dem_X_it)
#x1_it[1:10]
#dem_X_it[1:10,]

(getQ(Grunfeld,t.index="year",i.index="firm")%*%Grunfeld$inv)[1:10]
(getQ(Grunfeld,t.index="year",i.index="firm")%*%y_it)[1:10]
dem_y_it[1:10]

############################

Grunfeld1<-Grunfeld[-1,]

sum(ave(Grunfeld1$inv,index=Grunfeld1$firm)!=(tapply(Grunfeld1$inv,Grunfeld1$firm,function(x){mean(x)})[Grunfeld1$firm]))==0

y_i<-Grunfeld1$inv-ave(Grunfeld1$inv,index=Grunfeld1$firm)
y_t<-Grunfeld1$inv-ave(Grunfeld1$inv,index=Grunfeld1$year)
y_it<-(Grunfeld1$inv-ave(Grunfeld1$inv,index=Grunfeld1$firm)-ave(Grunfeld1$inv,index=Grunfeld1$year)+rep(mean(Grunfeld1$inv),length(Grunfeld1$inv)))
x1_it<-(Grunfeld1$value-ave(Grunfeld1$value,index=Grunfeld1$firm)-ave(Grunfeld1$value,index=Grunfeld1$year)+rep(mean(Grunfeld1$value),length(Grunfeld1$inv)))

dem_y_i<-pmodel.response(plm(formula=inv~value,data=Grunfeld1,index=c("firm","year"),model="within",effect="individual"))
dem_y_t<-pmodel.response(plm(formula=inv~value,data=Grunfeld1,index=c("firm","year"),model="within",effect="time"))
dem_y_it<-pmodel.response(plm(formula=inv~value,data=Grunfeld1,index=c("firm","year"),model="within",effect="twoways"))
dem_X_it<-model.matrix(plm(formula=inv~value,data=Grunfeld1,index=c("firm","year"),model="within",effect="twoways"))

sum(y_i!=dem_y_i)
#y_i[1:10]
#dem_y_i[1:10]

sum(y_t!=dem_y_t)
#y_t[1:10]
#dem_y_t[1:10]

sum(y_it!=dem_y_it)
#y_it[1:10]
#dem_y_it[1:10]
#y_it[1:10]-dem_y_it[1:10]

sum(x1_it!=dem_X_it)
#x1_it[1:10]
#dem_X_it[1:10,]
#x1_it[1:10]-dem_X_it[1:10,]

(getQ(Grunfeld1,t.index="year",i.index="firm")%*%Grunfeld1$inv)[1:10]
(getQ(Grunfeld1,t.index="year",i.index="firm")%*%y_it)[1:10]
dem_y_it[1:10]








-----
____________________________________
EBS Universitaet fuer Wirtschaft und Recht
FARE Department
Wiesbaden/ Germany
http://www.ebs.edu/index.php?id=finacc&L=0
--
View this message in context: http://r.789695.n4.nabble.com/Manual-two-way-demeaning-of-unbalanced-panel-data-Wansbeek-Kapteyn-transformation-tp4655202.html
Sent from the R help mailing list archive at Nabble.com.




More information about the R-help mailing list