Dear R users,

I wish to manually demean a panel over time and entities. I tried to code
the Wansbeek and Kapteyn (1989) transformation (from Baltagi's book Ch. 9).

As a benchmark I use both the pmodel.response() and model.matrix() functions
in package plm and the results from using dummy variables. As far as I
understood the transformation (Ch.3), Q%*%y (with y being the dependent
variable) should yield the demeaned series.

However, ...

                ...I find that the results do not match, if I do so.
                ...that if I am looking at a balanced panel, I get the correct 
results
when multiplying Q with the already demeaned series y_it, Q%*%y_it.
                ...that if I am looking at an unbalanced panel, I receive 
results which
differ (even though being close). 

I guess I am missing something. Every comment pointing me to the right
solution would be of great value to me. Also, comments on how to increase
the efficiency of my function would help!

Please find an example based on the Grunfeld data below.

Thank you very much!
Philipp Grueber



##########################
library(MASS)
getQ<-function(object,t.index,i.index){

        t.ind<-object[,t.index]
        i.ind<-object[,i.index]
        nam<-unique(i.ind)
        tim<-unique(t.ind)
        n<-nrow(object)
        N<-length(nam)
        T<-length(tim)
        I<-matrix(0,n,n)
        I[row(I)==col(I)] <- 1
        I_N<-matrix(0,N,N)
        I_N[row(I_N)==col(I_N)] <- 1
        I_T<-matrix(0,T,T)
        I_T[row(I_T)==col(I_T)] <- 1

        D1<-data.frame()
        for(t in tim){
                D1<-rbind(D1,I_N)
        }
        D1<-data.matrix(D1[as.vector(table(i.ind,t.ind))==1,])

        D2<-data.frame()
        for(i in nam){
                D2<-rbind(D2,I_T)
        }
        D2<-data.matrix(D2[as.vector(table(t.ind,i.ind))==1,])
        D<-data.matrix(cbind(D1,D2))

Q<-I-D%*%ginv(crossprod(D))%*%t(D)#fQ(D1)-fQ(D1)%*%D2%*%ginv(t(D2)%*%fQ(D1)%*%D2)%*%t(D2)%*%fQ(D1)
        Q
}
##############################

  
library(plm)
library(lmtest)
data(Grunfeld)


y_i<-Grunfeld$inv-ave(Grunfeld$inv,index=Grunfeld$firm)
y_t<-Grunfeld$inv-ave(Grunfeld$inv,index=Grunfeld$year)
y_it<-(Grunfeld$inv-ave(Grunfeld$inv,index=Grunfeld$firm)-ave(Grunfeld$inv,index=Grunfeld$year)+rep(mean(Grunfeld$inv),length(Grunfeld$inv)))
x1_it<-(Grunfeld$value-ave(Grunfeld$value,index=Grunfeld$firm)-ave(Grunfeld$value,index=Grunfeld$year)+rep(mean(Grunfeld$value),length(Grunfeld$inv)))

dem_y_i<-pmodel.response(plm(formula=inv~value,data=Grunfeld,index=c("firm","year"),model="within",effect="individual"))
dem_y_t<-pmodel.response(plm(formula=inv~value,data=Grunfeld,index=c("firm","year"),model="within",effect="time"))
dem_y_it<-pmodel.response(plm(formula=inv~value,data=Grunfeld,index=c("firm","year"),model="within",effect="twoways"))
dem_X_it<-model.matrix(plm(formula=inv~value,data=Grunfeld,index=c("firm","year"),model="within",effect="twoways"))

sum(y_i!=dem_y_i)
#y_i[1:10]
#dem_y_i[1:10]

sum(y_t!=dem_y_t)
#y_t[1:10]
#dem_y_t[1:10]

sum(y_it!=dem_y_it)
#y_it[1:10]
#dem_y_it[1:10]
sum(x1_it!=dem_X_it)
#x1_it[1:10]
#dem_X_it[1:10,]

(getQ(Grunfeld,t.index="year",i.index="firm")%*%Grunfeld$inv)[1:10]
(getQ(Grunfeld,t.index="year",i.index="firm")%*%y_it)[1:10]
dem_y_it[1:10]

############################

Grunfeld1<-Grunfeld[-1,]

sum(ave(Grunfeld1$inv,index=Grunfeld1$firm)!=(tapply(Grunfeld1$inv,Grunfeld1$firm,function(x){mean(x)})[Grunfeld1$firm]))==0

y_i<-Grunfeld1$inv-ave(Grunfeld1$inv,index=Grunfeld1$firm)
y_t<-Grunfeld1$inv-ave(Grunfeld1$inv,index=Grunfeld1$year)
y_it<-(Grunfeld1$inv-ave(Grunfeld1$inv,index=Grunfeld1$firm)-ave(Grunfeld1$inv,index=Grunfeld1$year)+rep(mean(Grunfeld1$inv),length(Grunfeld1$inv)))
x1_it<-(Grunfeld1$value-ave(Grunfeld1$value,index=Grunfeld1$firm)-ave(Grunfeld1$value,index=Grunfeld1$year)+rep(mean(Grunfeld1$value),length(Grunfeld1$inv)))

dem_y_i<-pmodel.response(plm(formula=inv~value,data=Grunfeld1,index=c("firm","year"),model="within",effect="individual"))
dem_y_t<-pmodel.response(plm(formula=inv~value,data=Grunfeld1,index=c("firm","year"),model="within",effect="time"))
dem_y_it<-pmodel.response(plm(formula=inv~value,data=Grunfeld1,index=c("firm","year"),model="within",effect="twoways"))
dem_X_it<-model.matrix(plm(formula=inv~value,data=Grunfeld1,index=c("firm","year"),model="within",effect="twoways"))

sum(y_i!=dem_y_i)
#y_i[1:10]
#dem_y_i[1:10]

sum(y_t!=dem_y_t)
#y_t[1:10]
#dem_y_t[1:10]

sum(y_it!=dem_y_it)
#y_it[1:10]
#dem_y_it[1:10]
#y_it[1:10]-dem_y_it[1:10]

sum(x1_it!=dem_X_it)
#x1_it[1:10]
#dem_X_it[1:10,]
#x1_it[1:10]-dem_X_it[1:10,]

(getQ(Grunfeld1,t.index="year",i.index="firm")%*%Grunfeld1$inv)[1:10]
(getQ(Grunfeld1,t.index="year",i.index="firm")%*%y_it)[1:10]
dem_y_it[1:10]








-----
____________________________________
EBS Universitaet fuer Wirtschaft und Recht
FARE Department
Wiesbaden/ Germany
http://www.ebs.edu/index.php?id=finacc&L=0
--
View this message in context: 
http://r.789695.n4.nabble.com/Manual-two-way-demeaning-of-unbalanced-panel-data-Wansbeek-Kapteyn-transformation-tp4655202.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to