On 3/17/10, Achim Zeileis <achim.zeil...@uibk.ac.at> wrote:
> Hmm, that sounds strange. Maybe something about the data pre-processing
> went wrong?
>
I traced plm() in step-by-step mode, and the process stalls on plm.fit(),
apparently after all the pre-processing.

> Depending on how unbalanced the data is, there might not be
> enough observations.
>
It is very unbalanced.
> length(unique(kldall.sync$cusip6[]))  ##nr of individuals
[1] 3079
> summary(x1); sum(x1==1)  ##distribution of T
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00    2.00    4.00    4.32    5.00   16.00
[1] 560

But this doesn't seem to be an issue for a heavily unbalanced Grunfeld:
> data("Grunfeld", package = "AER")
> gr <- subset(Grunfeld[-c(1:19), ], firm %in% c("General Electric",
+     "General Motors", "IBM"))
> dim(gr); head(gr)
[1] 41  5
   invest  value capital             firm year
20 1486.7 5593.6  2226.3   General Motors 1954
41   33.1 1170.6    97.8 General Electric 1935
42   45.0 2015.8   104.4 General Electric 1936
43   77.2 2803.3   118.0 General Electric 1937
44   44.6 2039.7   156.2 General Electric 1938
45   48.1 2256.2   172.6 General Electric 1939
> pgr <- plm.data(gr, index = c("firm", "year"))
> gr_fe <- plm(invest ~ value + capital, data = pgr, model = "within",
+     effect = "individual")
> gr_fe <- plm(invest ~ value + capital, data = pgr, model = "within",
+     effect = "time")
> gr_fe <- plm(invest ~ value + capital, data = pgr, model = "within",
+     effect = "twoways")
> summary(gr_fe)
Twoways effects Within Model

Call:
plm(formula = invest ~ value + capital, data = pgr, effect = "twoways",
    model = "within")

Unbalanced Panel: n=3, T=1-20, N=41

Residuals :
     Min.   1st Qu.    Median   3rd Qu.      Max.
-3.01e+01 -7.64e+00 -3.60e-15  7.64e+00  3.01e+01

Coefficients :
        Estimate Std. Error t-value Pr(>|t|)
value     0.0167     0.0165    1.01     0.32
capital   0.0468     0.0367    1.27     0.22

Total Sum of Squares:    6850
Residual Sum of Squares: 6150
F-statistic: 0.959136 on 2 and 17 DF, p-value: 0.403

> Does the lm() version of the "twoways" model work ok?
>
It works when controlling only for time, but doesn't work when
controlling only for individuals or for both (I actually kill the
process after 10-15 min).
I guess that here the following applies: "in cases where there are many
individuals in the sample and we are not interested in the value of
their fixed effects, the lm() results are awkward to deal with and the
estimation of a large number of ui coefficients could render the problem
numerically intractable." [1]

[1] http://cran.r-project.org/doc/contrib/Farnsworth-EconometricsInR.pdf

The alternative to dummy controls is time-demeaning of the data, and
according to the "plm" vignette this is the implementation for the
"individual" and "time" cases. I am wondering, though, if "twoways" uses
(or can use?) the same implementation.

One way to work around the failing lm() "twoways" call is to use
plm(..., effect="individual") and manually include the time effect.
> gr_fe2 <- plm(invest ~ value + capital + year, data = pgr,
+     model = "within", effect="individual")
> summary(gr_fe2)
Oneway (individual) effect Within Model

Call:
plm(formula = invest ~ value + capital + year, data = pgr,
    effect = "individual", model = "within")

Unbalanced Panel: n=3, T=1-20, N=41

Residuals :
     Min.   1st Qu.    Median   3rd Qu.      Max.
-3.01e+01 -7.64e+00 -4.06e-15  7.64e+00  3.01e+01

Coefficients :
         Estimate Std. Error t-value Pr(>|t|)
value      0.0167     0.0165    1.01    0.325
capital    0.0468     0.0367    1.27    0.220
year1936   1.2031    20.3416    0.06    0.954
[..]
year1954  92.5546    37.0794    2.50    0.023 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    68100
Residual Sum of Squares: 6150
F-statistic: 8.14666 on 21 and 17 DF, p-value: 0.0000286

> If so, I guess you will have to try to find out whether it's your
> preparation of the data or the fault of plm() that it does not work.
>
This mix-up replicates the coefficients of plm(..., effect="twoways")
fine, but reports an unexpected F-statistic.
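
For what it's worth, the coefficient equivalence can be checked without
"plm" at all. Below is a minimal base-R sketch on simulated data; the
panel and all object names (panel, demean, lsdv, ...) are made up for
illustration, and this is not meant to describe how plm() works
internally. It demeans the data within individuals, keeps the time
dummies as ordinary regressors, and compares the slope with the full
dummy-variable lm() fit. It also shows why the overall F-statistic
differs from the F test of the main regressor alone: the overall test
includes the time dummies.

```r
set.seed(1)

# Hypothetical unbalanced panel: 50 individuals, each observed in
# a random subset (1 to 6) of 6 time periods.
n_id <- 50
panel <- do.call(rbind, lapply(seq_len(n_id), function(i) {
  data.frame(id = i, time = sort(sample(1:6, size = sample(1:6, 1))))
}))
panel$x <- rnorm(nrow(panel))
panel$y <- 2 * panel$x + rnorm(n_id)[panel$id] + rnorm(6)[panel$time] +
  rnorm(nrow(panel))

# (a) Dummy-variable regression: one dummy per individual and per period.
lsdv <- lm(y ~ x + factor(id) + factor(time), data = panel)

# (b) Demean y and the regressors (x plus time dummies) within each
#     individual, then regress without an intercept.
demean <- function(v, g) v - ave(v, g)
Z  <- model.matrix(~ x + factor(time), data = panel)[, -1, drop = FALSE]
Zw <- apply(Z, 2, demean, g = panel$id)
yw <- demean(panel$y, panel$id)
within_fit <- lm.fit(Zw, yw)

b_lsdv   <- unname(coef(lsdv)["x"])
b_within <- unname(within_fit$coefficients["x"])
print(all.equal(b_lsdv, b_within))

# F test for x alone: compare with the model containing only the effects.
lsdv_null <- lm(y ~ factor(id) + factor(time), data = panel)
f_partial <- anova(lsdv_null, lsdv)$F[2]

# Overall F from summary(): tests x and all the dummies jointly.
f_overall <- unname(summary(lsdv)$fstatistic["value"])
print(c(partial = f_partial, overall = f_overall))
```

With a single regressor of interest, the partial F is simply the squared
t-value of x, which gives a quick sanity check on the comparison.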
Since the mixed-up specification (plm(..., effect="individual") plus a
year regressor) works fine on my data, I would suspect that my data is
OK and that the plm(..., effect="twoways") implementation falters
somewhere.

> And you want fixed effects for all >2000 individuals?
>
No, I don't think so. I am not sure how orthodox this is, but we are
only looking at the coefficients of the "main" regressors, while
controlling for time and individual variation.

> waldtest() from "lmtest" does work in this context. Furthermore, "plm"
> provides various specialized tests for certain test problems.
>
Finally, it works!
> gr_fe1 <- plm(invest ~ value + capital, data = pgr,
+     model = "within", effect="twoways")
> summary(gr_fe1)$fstatistic

        F test

data:  invest ~ value + capital
F = 0.9591, df1 = 2, df2 = 17, p-value = 0.403

> gr_fe2 <- plm(invest ~ value + capital + year, data = pgr,
+     model = "within", effect="individual")
> summary(gr_fe2)$fstatistic  ##"incorrect"

        F test

data:  invest ~ value + capital + year
F = 8.1467, df1 = 21, df2 = 17, p-value = 0.00002857

> gr_fe2_null <- plm(invest ~ year, data = pgr, model = "within")
> waldtest(gr_fe2_null, gr_fe2, test="F")  ##works!
Wald test

Model 1: invest ~ year
Model 2: invest ~ value + capital + year
  Res.Df Df    F Pr(>F)
1     19
2     17  2 0.96    0.4

Thanks again
Liviu

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.