[R] non-linear estimation with many firm-specific parameters
ivo welch
ivowel at gmail.com
Sun May 9 22:14:17 CEST 2010
Dear R experts---
I doubt that someone has already solved my problem, but I thought I
would ask quickly, just in case someone has.
Let's say I start with a (flattened panel) model that says
y[i] = x[i] + b*(T-x[i])
easy enough---this is just a linear model. I could also make this a
fixed-effects model if I change T to T[fmid], where fmid is the firm's
id. I know I can do this faster, but logically, what I want to
estimate is lm( y ~ as.factor(fmid) + x ). I have about 100,000
observations, and about 10,000 firm ids.
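As a minimal sketch of this linear case, on simulated data with made-up sizes (50 firms, 10 observations each) and made-up true values: since y = (1-b)*x + b*T[fmid], the coefficient on x estimates 1-b and each firm intercept estimates b*T[fmid]. The names n_firms, n_per, T_true, b_true, b_hat and T_hat are illustrative assumptions, not from the original data.

```r
set.seed(1)
n_firms <- 50                      # assumed size, far smaller than the real panel
n_per   <- 10                      # assumed observations per firm
fmid    <- rep(seq_len(n_firms), each = n_per)
T_true  <- rnorm(n_firms, mean = 5, sd = 1)   # made-up firm-specific targets
b_true  <- 0.3                                 # made-up adjustment speed
x <- rnorm(n_firms * n_per, mean = 5, sd = 2)
y <- x + b_true * (T_true[fmid] - x) + rnorm(length(x), sd = 0.1)

# "0 +" drops the global intercept so that every firm gets its own intercept
fit <- lm(y ~ 0 + as.factor(fmid) + x)

# y = (1 - b) * x + b * T[fmid], so back out b and the firm targets
b_hat <- 1 - unname(coef(fit)["x"])
T_hat <- unname(coef(fit)[seq_len(n_firms)]) / b_hat
```

With 10,000 firms the dummy expansion inside lm() becomes large, which is why demeaning by firm is the usual shortcut in the linear case.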
now, let me move to a world in which b is a function of the distance
between T and x, b= a+c*(T-x[i])^2
y[i] = x[i] + b(T,x[i]) * (T-x[i]) = x[i] + (a+c*(T-x[i])^2) * (T-x[i])
R solves this nicely with the nls() function in about 5 seconds. The
results are estimates for a, c, and T.
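A sketch of what such an nls() call could look like, on simulated data with made-up true values and starting values. The common target is called Tbar here only to avoid clashing with R's built-in T (an alias for TRUE); that name, the data frame dat, and the starting values are assumptions for illustration.

```r
set.seed(2)
n <- 2000                                   # assumed sample size for the sketch
a_true <- 0.3; c_true <- 0.01; T_true <- 5  # made-up true parameters
x <- rnorm(n, mean = 4, sd = 2)
d <- T_true - x
y <- x + (a_true + c_true * d^2) * d + rnorm(n, sd = 0.1)
dat <- data.frame(y = y, x = x)

# one common target Tbar for all observations; b = a + c * (Tbar - x)^2
fit <- nls(y ~ x + (a + c * (Tbar - x)^2) * (Tbar - x),
           data  = dat,
           start = list(a = 0.2, c = 0.005, Tbar = mean(dat$x)))
est <- coef(fit)   # named vector with elements a, c, Tbar
```

Starting values matter for nls(); the ones above are deliberately close to the simulated truth.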
here comes the hard part. I want to make the T again a function of
each firm, i.e., T[fmid]. in a sense, I want
y[i] = x[i] + (a+c*(T[fmid]-x[i])^2) * (T[fmid] - x[i])
where the firm-specific constants are supposed to be the same in the
two terms (i.e., not the permutative set). the usual trick to speed
up fixed-effects estimations (i.e., subtracting out the means) does
not work here, because the problem is non-linear. I am thinking about
expanding the dummies into an appropriate matrix, then coding my
problem into an objective function, and letting R optimize over my,
ahem, 10,000 or so T[fmid], a, and c. I fear that this would not only
overwhelm my CPU (taking a few days, which would be ok), but overwhelm
my memory, too. maybe it is just plain infeasible.
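One possible way to code the objective function without ever building the dummy matrix is to index the firm targets directly by fmid and to supply an analytic gradient that sums per-observation terms by firm with rowsum(). The sketch below does this on a small simulated panel (200 firms, 10 observations each; all sizes, true values, starting values and the names sse, sse_grad and start are assumptions). It is an illustration of the idea, not a tested solution for the full 10,000-firm problem.

```r
set.seed(3)
n_firms <- 200; n_per <- 10                  # assumed sizes for the sketch
fmid   <- rep(seq_len(n_firms), each = n_per)
T_true <- rnorm(n_firms, mean = 5, sd = 1)   # made-up firm targets
a_true <- 0.3; c_true <- 0.01                # made-up true parameters
x <- rnorm(n_firms * n_per, mean = 4, sd = 2)
d_true <- T_true[fmid] - x
y <- x + (a_true + c_true * d_true^2) * d_true + rnorm(length(x), sd = 0.1)

# parameter vector: par = c(a, c, T_1, ..., T_F)
sse <- function(par) {
  d <- par[-(1:2)][fmid] - x                 # T[fmid] - x[i], no dummy matrix
  r <- y - x - (par[1] + par[2] * d^2) * d   # residuals
  sum(r^2)
}

sse_grad <- function(par) {
  d  <- par[-(1:2)][fmid] - x
  r  <- y - x - (par[1] + par[2] * d^2) * d
  dT <- -2 * r * (par[1] + 3 * par[2] * d^2) # per-observation derivative w.r.t. T
  c(-2 * sum(r * d),                         # derivative w.r.t. a
    -2 * sum(r * d^3),                       # derivative w.r.t. c
    as.vector(rowsum(dT, fmid)))             # summed within each firm
}

# crude starting values: firm means of x for the targets
start <- c(0.2, 0, as.vector(tapply(x, fmid, mean)))

# L-BFGS-B is a limited-memory method, so no dense Hessian approximation is stored
fit <- optim(start, sse, sse_grad, method = "L-BFGS-B",
             control = list(maxit = 1000))
a_hat <- fit$par[1]; c_hat <- fit$par[2]; T_hat <- fit$par[-(1:2)]
```

Everything here is a vector of length n or n_firms, so memory grows linearly with the data. Whether the optimizer converges well at full scale, and how good the estimates are with only about 10 observations per firm, would need to be checked on the real data; fit$convergence and fit$message report what optim() thinks happened.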
has anyone seen someone else work on such a problem?
sincerely,
/ivo welch
----
Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)