[R] Query on constrained regressions using -mgcv- and -pcls-
Clive Nicholas
Tue Nov 3 02:14:46 CET 2020
Hello all,
I'll level with you: I'm puzzled!
How is it that this constrained regression routine using -pcls- runs
satisfactorily (courtesy of Tian Zheng):
library(mgcv)
options(digits=3)
x.1=rnorm(100, 0, 1)
x.2=rnorm(100, 0, 1)
x.3=rnorm(100, 0, 1)
x.4=rnorm(100, 0, 1)
y=1+0.5*x.1-0.2*x.2+0.3*x.3+0.1*x.4+rnorm(100, 0, 0.01)
x.mat=cbind(rep(1, length(y)), x.1, x.2, x.3, x.4)
ls.print(lsfit(x.mat, y, intercept=FALSE))
M=list(y=y,
       w=rep(1, length(y)),
       X=x.mat,
       C=matrix(0,0,0),
       p=rep(1, ncol(x.mat)),
       off=array(0,0),
       S=list(),
       sp=array(0,0),
       Ain=diag(ncol(x.mat)),
       bin=rep(0, ncol(x.mat)))
pcls(M)
Residual Standard Error=0.0095
R-Square=1
F-statistic (df=5, 95)=314735
p-value=0
     Estimate Std.Err t-value Pr(>|t|)
        1.000  0.0010  1043.9        0
x.1     0.501  0.0010   512.6        0
x.2    -0.202  0.0009  -231.6        0
x.3     0.298  0.0010   297.8        0
x.4     0.103  0.0011    94.8        0
but this one does not for a panel dataset:
set.seed(02102020)
N=500
M=10
rater=rep(1:M, each = N)
lead_n=as.factor(rep(1:N,M))
a=rep(rnorm(N),M)
z=rep(round(25+2*rnorm(N)+.2*a))
x=a+rnorm(N*M)
y=.5*x+5*a-.5*z+2*rnorm(N*M)
x_cl=rep(aggregate(x,list(lead_n),mean)[,2],M)
model=lm(y~x+x_cl+z)
summary(model)
y=1+1.5*x+4.6*x_cl-0.5*z
x.mat=cbind(rep(1,length(y)),x,x_cl,z)
ls.print(lsfit(x.mat,y,intercept=FALSE))
Residual Standard Error=0
R-Square=1
F-statistic (df=4, 4996)=5.06e+30
p-value=0
     Estimate Std.Err   t-value Pr(>|t|)
          1.0       0  2.89e+13        0
x         0.8       0  2.71e+14        0
x_cl      4.6       0  1.18e+15        0
z        -0.5       0 -3.63e+14        0
?
There shouldn't be anything wrong with the second set of data, unless I've
missed something obvious (that constraints don't work for panel data? Seems
unlikely to me)!
Also:
(1) I'm ultimately looking just to constrain ONE coefficient whilst
allowing the other coefficients to be unconstrained (I tried this with the
first dataset by setting
y=1+0.5*x.1-x.2+x.3+x.4
in the call, but got similar-looking output to what I got in the second
dataset); and
(2) it would be really useful to have the call to -pcls(M)- produce more
informative output (SEs, t-values, fit stats, etc).
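
To make (1) and (2) concrete, here is a minimal sketch on data generated like the first dataset. It is an illustration under stated assumptions, not a definitive recipe. It assumes that -pcls- accepts a one-row Ain, so that only the x.2 coefficient is bounded below by zero and the others are left free. Separately, it assumes that if the coefficient is to be fixed at a known value rather than bounded, an offset() term in lm() is an acceptable substitute that also reports SEs and t-values for the remaining coefficients. The seed, the bound of zero, the fixed value -0.2 and the names Ain1, M1, b.con and fit.fix are all made up for the example.

```r
library(mgcv)

set.seed(1)  # assumed seed, only so the sketch is reproducible
n   <- 100
x.1 <- rnorm(n); x.2 <- rnorm(n); x.3 <- rnorm(n); x.4 <- rnorm(n)
y   <- 1 + 0.5*x.1 - 0.2*x.2 + 0.3*x.3 + 0.1*x.4 + rnorm(n, 0, 0.01)
x.mat <- cbind(1, x.1, x.2, x.3, x.4)

# (a) One inequality only: coefficient on x.2 (third column) bounded
#     below by 0. pcls() takes the constraints in the form Ain %*% p vs bin,
#     so a single row touches a single coefficient.
Ain1 <- matrix(c(0, 0, 1, 0, 0), nrow = 1)

M1 <- list(y = y,
           w = rep(1, n),
           X = x.mat,
           C = matrix(0, 0, 0),       # no equality constraints
           p = rep(1, ncol(x.mat)),   # feasible start: p[3] = 1 > 0
           off = array(0, 0),
           S = list(),
           sp = array(0, 0),
           Ain = Ain1,
           bin = 0)
b.con <- pcls(M1)   # returns the coefficient vector only, no SEs
print(b.con)

# (b) If the coefficient is instead FIXED at a known value (assumed -0.2),
#     move that term into an offset; summary() then gives SEs, t-values
#     and fit statistics for the coefficients that are still estimated.
fit.fix <- lm(y ~ x.1 + x.3 + x.4 + offset(-0.2 * x.2))
print(summary(fit.fix))
```

Note that y keeps its rnorm() error term here; with the error term dropped, the least-squares fit is exact and the standard errors collapse to zero.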
Many thanks in anticipation of your expert help and being told what a
clueless berk I am,
Clive
--
Clive Nicholas
"My colleagues in the social sciences talk a great deal about methodology.
I prefer to call it style." -- Freeman J. Dyson