[R] AER ivreg diagnostics: question on DF of Sargan test
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Thu Nov 7 19:07:21 CET 2013
Hélène,
thanks for spotting this! This is a bug in "AER". I had only tested the
new diagnostics for regressions with one endogenous variable and hence
never noticed the problem. But with more than one endogenous variable, the
df used for the Sargan test in ivreg() (and hence the associated p-values)
are too large. I've fixed the problem in AER's devel-version and will
release it on CRAN in the next few days.
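Until the fixed version is available, the p-value can be recomputed by hand
in base R. This is only a minimal sketch: it assumes the Sargan statistic in
your output (3.868) is correct and only the df are off, and the variable
names are made up for illustration.

```r
# Values taken from the quoted output and model description below
sargan_stat   <- 3.868  # Sargan statistic printed by summary()
n_instruments <- 6      # excluded instruments
n_endogenous  <- 3      # endogenous regressors: ed76, exp76, exp762

# Degrees of freedom = number of overidentifying restrictions
df_correct  <- n_instruments - n_endogenous  # 3
df_reported <- 5                             # df shown in the printed output

# Upper-tail chi-squared probabilities for both choices of df
p_correct  <- pchisq(sargan_stat, df = df_correct,  lower.tail = FALSE)
p_reported <- pchisq(sargan_stat, df = df_reported, lower.tail = FALSE)
round(c(reported = p_reported, corrected = p_correct), 3)
```

With 3 df the p-value comes out smaller than the 0.569 shown (roughly 0.28),
so in this example the test still does not reject.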
Thanks & best regards,
Z
On Thu, 7 Nov 2013, Hélène Huber-Yahi wrote:
> Hello,
> I'm new to R and I'm currently learning to use package AER, which is
> extremely comprehensive and useful. I have one question related to the
> diagnostics after ivreg: if I understood correctly, the Sargan test
> provided assumes that the statistic follows a chi-squared distribution with
> degrees of freedom equal to the number of excluded instruments minus one.
> But I have read many times that the degrees of freedom of this statistic
> should equal the number of overidentifying restrictions, i.e. the number of
> excluded instruments minus the number of endogenous variables tested. When
> comparing with Stata results (estat overid after ivreg, same with ivreg2
> output), the statistic is the same as the one provided by R; only the
> p-value changes because the reference distribution is different. Is this
> command using a different flavor of the Sargan test? I did not find the
> details in the AER pdf.
> I'm using Rstudio with R 3.0.2 (Windows 7) and AER is up to date. The
> output I get from R is the following, where the Sargan DF is equal to 5,
> while I thought it would be equal to 6-3=3. The data comes from Verbeek's
> econometrics textbook and the example replicates the one in the book.
> Dependent variable is log of wage, endogenous variables are education,
> experience and its square (3 of them), excluded instruments are parents'
> education etc. (6 of them).
>
> > ivmodel <- ivreg(lwage76 ~ ed76 + exp76 + exp762 + black + smsa76 + south76 | daded + momed + libcrd14 + age76 + age762 + nearc4 + black + smsa76 + south76,
> +   data = school)
> > summary(ivmodel, diagnostics=TRUE)
>
> Call:
> ivreg(formula = lwage76 ~ ed76 + exp76 + exp762 + black + smsa76 +
> south76 | daded + momed + libcrd14 + age76 + age762 + nearc4 +
> black + smsa76 + south76, data = school)
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -1.63375 -0.22253  0.02403  0.24350  1.32911
>
> Coefficients:
>               Estimate Std. Error t value Pr(>|t|)
> (Intercept)  4.6064811  0.1126195  40.903  < 2e-16 ***
> ed76         0.0848507  0.0066061  12.844  < 2e-16 ***
> exp76        0.0796432  0.0164406   4.844 1.34e-06 ***
> exp762      -0.0020376  0.0008257  -2.468   0.0136 *
> black       -0.1726723  0.0195231  -8.845  < 2e-16 ***
> smsa76       0.1521693  0.0165207   9.211  < 2e-16 ***
> south76     -0.1204765  0.0154904  -7.778 1.01e-14 ***
>
> Diagnostic tests:
>                  df1  df2 statistic p-value
> Weak instruments   6 2987   965.450  <2e-16 ***
> Wu-Hausman         2 2988     1.949   0.143
> Sargan             5   NA     3.868   0.569
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.3753 on 2990 degrees of freedom
> Multiple R-Squared: 0.2868, Adjusted R-squared: 0.2854
> Wald test: 178.6 on 6 and 2990 DF, p-value: < 2.2e-16
>
>
> Could this be caused by the fact that I'm using 2SLS and not GMM (at least
> I suppose so) to estimate the IV model? I apologize if this comes from a
> misunderstanding on my part, and I thank you in advance for your help.
>
> Best,
>
> H. Huber
>