As an addendum/erratum to my original post, the second block of code should, for completeness, read as follows:
set.seed(02102020)
N=500
M=10
rater=rep(1:M, each = N)
lead_n=as.factor(rep(1:N,M))
a=rep(rnorm(N),M)
z=rep(round(25+2*rnorm(N)+.2*a))
x=a+rnorm(N*M)
y=.5*x+5*a-.5*z+2*rnorm(N*M)
x_cl=rep(aggregate(x, list(lead_n), mean)[,2],M)
model=lm(y~x+x_cl+z)
summary(model)
y=1+1.5*x+4.6*x_cl-0.5*z
x.mat=cbind(rep(1,length(y)),x,x_cl,z)
ls.print(lsfit(x.mat,y,intercept=FALSE))
M=list(y=y,
       w=rep(1, length(y)),
       X=x.mat,
       C=matrix(0,0,0),
       p=rep(1, ncol(x.mat)),
       off=array(0,0),
       S=list(),
       sp=array(0,0),
       Ain=diag(ncol(x.mat)),
       bin=rep(0, ncol(x.mat)) )
pcls(M)

However, all my questions stand.

Ta,
Clive

On Tue, 3 Nov 2020 at 01:14, Clive Nicholas <cliveli...@googlemail.com> wrote:

> Hello all,
>
> I'll level with you: I'm puzzled!
>
> How is it that this constrained regression routine using -pcls- runs
> satisfactorily (courtesy of Tian Zheng):
>
> library(mgcv)
> options(digits=3)
> x.1=rnorm(100, 0, 1)
> x.2=rnorm(100, 0, 1)
> x.3=rnorm(100, 0, 1)
> x.4=rnorm(100, 0, 1)
> y=1+0.5*x.1-0.2*x.2+0.3*x.3+0.1*x.4+rnorm(100, 0, 0.01)
> x.mat=cbind(rep(1, length(y)), x.1, x.2, x.3, x.4)
> ls.print(lsfit(x.mat, y, intercept=FALSE))
> M=list(y=y,
>        w=rep(1, length(y)),
>        X=x.mat,
>        C=matrix(0,0,0),
>        p=rep(1, ncol(x.mat)),
>        off=array(0,0),
>        S=list(),
>        sp=array(0,0),
>        Ain=diag(ncol(x.mat)),
>        bin=rep(0, ncol(x.mat)) )
> pcls(M)
>
> Residual Standard Error=0.0095
> R-Square=1
> F-statistic (df=5, 95)=314735
> p-value=0
>
>          Estimate Std.Err t-value Pr(>|t|)
>             1.000  0.0010  1043.9        0
> x.1         0.501  0.0010   512.6        0
> x.2        -0.202  0.0009  -231.6        0
> x.3         0.298  0.0010   297.8        0
> x.4         0.103  0.0011    94.8        0
>
> but this one does not for a panel dataset:
>
> set.seed(02102020)
> N=500
> M=10
> rater=rep(1:M, each = N)
> lead_n=as.factor(rep(1:N,M))
> a=rep(rnorm(N),M)
> z=rep(round(25+2*rnorm(N)+.2*a))
> x=a+rnorm(N*M)
> y=.5*x+5*a-.5*z+2*rnorm(N*M)
> x_cl=rep(aggregate(x, list(lead_n), mean)[,2],M)
> model=lm(y~x+x_cl+z)
> summary(model)
> y=1+1.5*x+4.6*x_cl-0.5*z
> x.mat=cbind(rep(1,length(y)),x,x_cl,z)
> ls.print(lsfit(x.mat,y,intercept=FALSE))
>
> Residual Standard Error=0
> R-Square=1
> F-statistic (df=4, 4996)=5.06e+30
> p-value=0
>
>      Estimate Std.Err   t-value Pr(>|t|)
>           1.0       0  2.89e+13        0
> x           0.8       0  2.71e+14        0
> x_cl        4.6       0  1.18e+15        0
> z          -0.5       0 -3.63e+14        0
>
> ?
>
> There shouldn't be anything wrong with the second set of data, unless I've
> missed something obvious (that constraints don't work for panel data? Seems
> unlikely to me)!
>
> Also:
>
> (1) I'm ultimately looking just to constrain ONE coefficient whilst
> allowing the other coefficients to be unconstrained (I tried this with the
> first dataset by setting
>
> y=1+0.5*x.1-x.2+x.3+x.4
>
> in the call, but got similar-looking output to what I got in the second
> dataset); and
>
> (2) it would be really useful to have the call to -pcls(M)- produce more
> informative output (SEs, t-values, fit stats, etc).
>
> Many thanks in anticipation of your expert help and being told what a
> clueless berk I am,
> Clive
>
> --
> Clive Nicholas
>
> "My colleagues in the social sciences talk a great deal about methodology.
> I prefer to call it style." -- Freeman J. Dyson
>

--
Clive Nicholas

"My colleagues in the social sciences talk a great deal about methodology.
I prefer to call it style." -- Freeman J. Dyson
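PS For (1), here is the sort of thing I have been attempting, following straight on from the second block above. It is a minimal sketch only, assuming I have read ?pcls correctly that the constant c in the equality constraint Cp = c is taken implicitly from the starting values supplied in p (the names C1, p0 and M2 are just ones I have made up):

library(mgcv)
## aim: pin the coefficient on x_cl at 4.6 and leave the other three free
## columns of x.mat are (intercept, x, x_cl, z), so x_cl is column 3
C1=matrix(0, 1, ncol(x.mat))
C1[1,3]=1                        # one equality-constraint row picking out x_cl
p0=c(0, 0, 4.6, 0)               # feasible start: C1 %*% p0 = 4.6 supplies c
M2=list(y=y,
        w=rep(1, length(y)),
        X=x.mat,
        C=C1,
        p=p0,
        off=array(0,0),
        S=list(),
        sp=array(0,0),
        Ain=diag(ncol(x.mat)),           # keep the inequality slots filled, but
        bin=rep(-1e6, ncol(x.mat)))      # so loose that they should never bind
pcls(M2)

If that is not the intended use of C and p, corrections are very welcome.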
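PPS On (1) and (2) together: I did also wonder whether, for a single pinned coefficient, simply moving that term into an offset in lm() amounts to the same thing and gives me the SEs, t-values and fit statistics directly (model_c is just a name I have made up):

model_c=lm(y ~ x + z + offset(4.6*x_cl))   # x_cl enters with its coefficient fixed at 4.6
summary(model_c)                           # usual lm() output for the remaining free coefficients

but I would be glad to hear whether that really is equivalent to what pcls() does here, and what the recommended route is when more than one coefficient needs pinning.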