[R] Dealing with large nominal predictor in sem package
John Fox
jfox at mcmaster.ca
Tue Apr 10 03:51:17 CEST 2007
Dear adschai,
> Thank you. I think (2) from your explanation hits the right
> point. The reason is that when I made my own dummy variables,
> and my original nominal variable has 10 possible values, each
> observed exogenous variable vector of mine had 9 zeros and a
> single one. And I have about 400000 observations, so the
> matrix is almost entirely zeros.
>
I'm afraid that I don't follow that, unless you're saying that some of the
levels of the factor have very few observations in them.
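
To make this concrete, here is a rough sketch of the checks I have in mind,
together with points (2) and (3) from my earlier message (quoted below). The
data frame d and the variables y, x, z, and g are invented for illustration;
they are assumptions, not your data. The sketch counts the observations in
each level of the factor, lets R build the dummy regressors from the factor,
and compares the number of instruments with the number of structural
coefficients. The tsls() call runs only if the sem package is installed.

```r
# Hypothetical data: a 10-level factor g, an endogenous predictor x,
# an instrument z, and a response y (all names invented for illustration).
set.seed(1)
n <- 1000
d <- data.frame(
  g = factor(sample(letters[1:10], n, replace = TRUE)),
  z = rnorm(n)
)
d$x <- d$z + rnorm(n)
d$y <- 1 + 0.5 * d$x + as.integer(d$g) / 10 + rnorm(n)

# Observations per level; very sparse levels would be a concern.
level_counts <- table(d$g)
print(level_counts)

# A factor in the formula expands to 9 dummy regressors automatically.
X       <- model.matrix(~ x + g, data = d)  # structural regressors
Z_ok    <- model.matrix(~ z + g, data = d)  # instruments, including the dummies
Z_short <- model.matrix(~ g, data = d)      # no instrument for x

# Order condition: at least as many instruments as structural coefficients.
order_ok    <- ncol(Z_ok) >= ncol(X)
order_short <- ncol(Z_short) >= ncol(X)
print(c(coefficients = ncol(X), instruments_ok = ncol(Z_ok),
        instruments_short = ncol(Z_short)))

# Fit by 2SLS only if the sem package is available.
if (requireNamespace("sem", quietly = TRUE)) {
  fit <- sem::tsls(y ~ x + g, ~ z + g, data = d)
  print(summary(fit))
}
```

With Z_short there are fewer instruments than coefficients, which is the
underidentified situation described in point (3).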
> One more question. If I have a nominal response, I guess
> tsls would no longer work. How can I get around this?
If the response is ordinal, then you can use sem() with
polyserial/polychoric correlations. Otherwise, the sem package won't handle
it.
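
As a rough illustration of the ordinal case, the sketch below computes a
matrix of Pearson, polyserial, and polychoric correlations with hetcor() from
the polycor package. The data and the variable names (x1, x2, y) are made up,
and the code assumes that polycor is installed; otherwise it only builds the
data.

```r
# Hypothetical data: two numeric variables and an ordinal response with
# three ordered categories (names invented for illustration).
set.seed(2)
n <- 500
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
latent <- 0.7 * x1 + rnorm(n)
d <- data.frame(
  x1 = x1,
  x2 = x2,
  y  = cut(latent, breaks = c(-Inf, -0.5, 0.5, Inf),
           labels = c("low", "mid", "high"), ordered_result = TRUE)
)

if (requireNamespace("polycor", quietly = TRUE)) {
  # hetcor() chooses the type of correlation from the variable types:
  # numeric-numeric is Pearson, numeric-ordinal is polyserial.
  cors <- polycor::hetcor(d)$correlations
  print(round(cors, 2))
}
```

The resulting correlation matrix could then be supplied to sem() as the input
moment matrix, along with the number of observations.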
> Say I have 3 equations in my structural model whose
> responses are continuous, whereas another one has a multinomial
> response. Thank you so much.
>
As I said, neither tsls() nor sem() will handle an unordered response.
John
> - adschai
>
> > It's not possible to know from your description exactly what you're
> > doing, but perhaps the following will help:
> >
> > (1) I presume that your nominal variable is exogenous, since otherwise
> > it wouldn't be sensible to use 2SLS.
> >
> > (2) You don't have to make your own dummy regressors for a nominal
> > variable; just represent it in the model as a factor as you would,
> > e.g., in lm().
> >
> > (3) Do you have at least as many instrumental variables (including the
> > dummy regressors) as there are structural coefficients to estimate? If
> > not, the structural equation is underidentified, which will produce the
> > error that you've encountered.
> >
> > I hope this helps,
> > John
> >
> > >
> > > I am using the tsls function from the sem package to estimate a
> > > model which includes a large amount of data. Among its predictors
> > > is a nominal variable which has about 10 possible values. So I
> > > expand this variable into 9 binary predictors, with the coefficient
> > > of the base value equal to 0. I also have another continuous
> > > predictor.
> > >
> > > The problem is that, whenever I run tsls, I get a 'System is
> > > computationally singular' error. I'm wondering if there is any
> > > way that I can overcome this problem? Please kindly suggest.
> > > Thank you so much in advance.
> > >
> > > - adschai
> > >
