[R] sandwich package: HAC estimators

Achim Zeileis Achim.Zeileis at uibk.ac.at
Wed Jun 1 13:45:14 CEST 2016


On Wed, 1 Jun 2016, T.Riedle wrote:

> Thank you very much. I have applied the example to my case and get 
> following results:
>
> crisis_bubble4<-glm(stock.market.crash~crash.MA+bubble.MA+MP.MA+UTS.MA+UPR.MA+PPI.MA+RV.MA,family=binomial("logit"),data=Data_logitregression_movingaverage)
>> summary(crisis_bubble4)
>
> Call:
> glm(formula = stock.market.crash ~ crash.MA + bubble.MA + MP.MA +
>    UTS.MA + UPR.MA + PPI.MA + RV.MA, family = binomial("logit"),
>    data = Data_logitregression_movingaverage)
>
> Deviance Residuals:
>    Min       1Q   Median       3Q      Max
> -1.7828  -0.6686  -0.3186   0.6497   2.4298
>
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)   -5.2609     0.8927  -5.893 3.79e-09 ***
> crash.MA       0.4922     0.4966   0.991  0.32165
> bubble.MA     12.1287     1.3736   8.830  < 2e-16 ***
> MP.MA        -20.0724    96.9576  -0.207  0.83599
> UTS.MA       -58.1814    19.3533  -3.006  0.00264 **
> UPR.MA      -337.5798    64.3078  -5.249 1.53e-07 ***
> PPI.MA       729.3769    73.0529   9.984  < 2e-16 ***
> RV.MA        116.0011    16.5456   7.011 2.37e-12 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>    Null deviance: 869.54  on 705  degrees of freedom
> Residual deviance: 606.91  on 698  degrees of freedom
> AIC: 622.91
>
> Number of Fisher Scoring iterations: 5
>
>> coeftest(crisis_bubble4)
>
> z test of coefficients:
>
>              Estimate Std. Error z value  Pr(>|z|)
> (Intercept)   -5.26088    0.89269 -5.8933 3.786e-09 ***
> crash.MA       0.49219    0.49662  0.9911  0.321652
> bubble.MA     12.12868    1.37357  8.8300 < 2.2e-16 ***
> MP.MA        -20.07238   96.95755 -0.2070  0.835992
> UTS.MA       -58.18142   19.35330 -3.0063  0.002645 **
> UPR.MA      -337.57985   64.30779 -5.2494 1.526e-07 ***
> PPI.MA       729.37693   73.05288  9.9842 < 2.2e-16 ***
> RV.MA        116.00106   16.54560  7.0110 2.366e-12 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>> coeftest(crisis_bubble4,vcov=NeweyWest)
>
> z test of coefficients:
>
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)   -5.26088    5.01706 -1.0486  0.29436
> crash.MA       0.49219    2.41688  0.2036  0.83863
> bubble.MA     12.12868    5.85228  2.0725  0.03822 *
> MP.MA        -20.07238  499.37589 -0.0402  0.96794
> UTS.MA       -58.18142   77.08409 -0.7548  0.45038
> UPR.MA      -337.57985  395.35639 -0.8539  0.39318
> PPI.MA       729.37693  358.60868  2.0339  0.04196 *
> RV.MA        116.00106   79.52421  1.4587  0.14465
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>> waldtest(crisis_bubble4, vcov = NeweyWest,test="F")
> Wald test
>
> Model 1: stock.market.crash ~ crash.MA + bubble.MA + MP.MA + UTS.MA +
>    UPR.MA + PPI.MA + RV.MA
> Model 2: stock.market.crash ~ 1
>  Res.Df Df      F  Pr(>F)
> 1    698
> 2    705 -7 2.3302 0.02351 *
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>> waldtest(crisis_bubble4, vcov = NeweyWest,test="Chisq")
> Wald test
>
> Model 1: stock.market.crash ~ crash.MA + bubble.MA + MP.MA + UTS.MA +
>    UPR.MA + PPI.MA + RV.MA
> Model 2: stock.market.crash ~ 1
>  Res.Df Df  Chisq Pr(>Chisq)
> 1    698
> 2    705 -7 16.311    0.02242 *
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Do you agree with the methodology?

Well, this is how you _can_ do what you _wanted_ to do. I have already 
expressed my doubts about several aspects. First, some coefficients and 
their standard errors are very large, which may (or may not) hint at a 
problem close to separation. Second, given how much the standard errors 
increase, the autocorrelation appears to be substantial, and it might be 
better to capture it explicitly rather than just correcting the standard 
errors.
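For illustration only (simulated data, not your series), one way to capture the serial dependence explicitly is a dynamic logit that includes the lagged response as an additional regressor:

```r
## sketch with simulated data (illustration only, not your series):
## a dynamic logit including the lagged response as a regressor
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + 2 * x))
d <- data.frame(y = y, x = x, y.lag1 = c(NA, head(y, -1)))
## glm() drops the one incomplete (first) row via na.omit
m_dyn <- glm(y ~ y.lag1 + x, family = binomial("logit"), data = d)
summary(m_dyn)
```

If the lag term soaks up most of the dependence, a subsequent HAC correction should change the standard errors far less.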

> I read in a book that it is also possible to use vcov=vcovHAC in the 
> coeftest() function.

Yes. (I also mentioned that in my e-mail yesterday, see below.)

> Nevertheless, I am not sure what kind of HAC I generate with this 
> command. Which weights does this command apply, which bandwidth and which 
> kernel?

Please consult vignette("sandwich", package = "sandwich") for the details. 
In short: both vcovHAC and kernHAC use the quadratic spectral kernel with 
Andrews' parametric bandwidth selection; the latter uses prewhitening by 
default while the former does not. In contrast, NeweyWest uses the 
Bartlett kernel with Newey & West's nonparametric lag/bandwidth selection 
and prewhitening by default.
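For reference, here is a self-contained sketch (simulated data, for illustration only) showing how those defaults can also be set explicitly; kernel, bandwidth selection, and prewhitening are all arguments of the respective functions:

```r
library("sandwich")
library("lmtest")

## simulated data, for illustration only
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-1 + 2 * x))
m <- glm(y ~ x, family = binomial("logit"))

## vcovHAC: quadratic spectral kernel, Andrews bandwidth, no prewhitening
coeftest(m, vcov = vcovHAC(m))
## kernHAC: same kernel/bandwidth selection, but prewhitened by default
coeftest(m, vcov = kernHAC(m, kernel = "Quadratic Spectral", prewhite = 1))
## NeweyWest: Bartlett kernel, Newey & West lag selection, prewhitened
coeftest(m, vcov = NeweyWest(m, prewhite = TRUE))
```

Comparing these three sets of standard errors on your own model is a quick check of how sensitive the inference is to the choice of kernel and bandwidth.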

> Kind regards
> ________________________________________
> From: Achim Zeileis <Achim.Zeileis at uibk.ac.at>
> Sent: 31 May 2016 17:19
> To: T.Riedle
> Cc: r-help at r-project.org
> Subject: Re: [R] sandwich package: HAC estimators
>
> On Tue, 31 May 2016, T.Riedle wrote:
>
>> Many thanks for your feedback.
>>
>> If I get the code for the waldtest right I can calculate the Chi2 and
>> the F statistic using waldtest().
>
> Yes. In a logit model you would usually use the chi-squared statistic.
>
>> Can I use the waldtest() without using bread()/ estfun()? That is, I
>> estimate the logit regression using glm() e.g. logit<-glm(...) and
>> insert logit into the waldtest() function.
>>
>> Does that work to get chi2 under HAC standard errors?
>
> I'm not sure what you mean here but I include a worked example. Caveat:
> The data I use are cross-section data with an overly simplified set of
> regressors. So none of this makes sense for the application - but it shows
> how to use the commands.
>
> ## load AER package which provides the example data
> ## and automatically loads "lmtest" and "sandwich"
> library("AER")
> data("PSID1976", package = "AER")
>
> ## fit a simple logit model and obtain marginal Wald tests
> ## for the coefficients and an overall chi-squared statistic
> m <- glm(participation ~ education, data = PSID1976, family = binomial)
> summary(m)
> anova(m, test = "Chisq")
>
> ## replicate the same statistics with coeftest() and lrtest()
> coeftest(m)
> lrtest(m)
>
> ## the likelihood ratio test is asymptotically equivalent
> ## to the Wald test leading to a similar chi-squared test here
> waldtest(m)
>
> ## obtain HAC-corrected (Newey-West) versions of the Wald tests
> coeftest(m, vcov = NeweyWest)
> waldtest(m, vcov = NeweyWest)
>
> Instead of NeweyWest other covariance estimators (e.g., vcovHAC, kernHAC,
> etc.) can also be plugged in.
>
> hth,
> Z
>
>> ________________________________________
>> From: Achim Zeileis <Achim.Zeileis at uibk.ac.at>
>> Sent: 31 May 2016 13:18
>> To: T.Riedle
>> Cc: r-help at r-project.org
>> Subject: Re: [R] sandwich package: HAC estimators
>>
>> On Tue, 31 May 2016, T.Riedle wrote:
>>
>>> I understood. But how do I get the R2 an Chi2 of my logistic regression
>>> under HAC standard errors? I would like to create a table with HAC SE
>>> via e.g. stargazer().
>>>
>>> Do I get these information by using the functions
>>>
>>> bread.lrm <- function(x, ...) vcov(x) * nobs(x)
>>> estfun.lrm <- function(x, ...) residuals(x, "score")?
>>>
>>> Do I need to use the coeftest() in this case?
>>
>> The bread()/estfun() methods enable application of vcovHAC(), kernHAC(),
>> NeweyWest(). This in turn enables the application of coeftest(),
>> waldtest(), or linearHypothesis() with a suitable vcov argument.
>>
>> All of these give you different kinds of Wald tests with HAC covariances
>> including marginal tests of individual coefficients (coeftest) or global
>> tests of nested models (waldtest/linearHypothesis). The latter can serve
>> as replacement for the "chi-squared test". For pseudo-R-squared values I'm
>> not familiar with HAC-adjusted variants.
>>
>> And I'm not sure whether there is a LaTeX export solution that encompasses
>> all of these aspects simultaneously.
>>
>>> ________________________________________
>>> From: R-help <r-help-bounces at r-project.org> on behalf of Achim Zeileis <Achim.Zeileis at uibk.ac.at>
>>> Sent: 31 May 2016 08:36
>>> To: Leonardo Ferreira Fontenelle
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] sandwich package: HAC estimators
>>>
>>> On Mon, 30 May 2016, Leonardo Ferreira Fontenelle wrote:
>>>
>>>> On Sat, 28 May 2016, at 15:50, Achim Zeileis wrote:
>>>>> On Sat, 28 May 2016, T.Riedle wrote:
>>>>>> I thought it would be useful to incorporate the HAC consistent
>>>>>> covariance matrix into the logistic regression directly and generate an
>>>>>> output of coefficients and the corresponding standard errors. Is there
>>>>>> such a function in R?
>>>>>
>>>>> Not with HAC standard errors, I think.
>>>>
>>>> Don't glmrob() and summary.glmrob(), from robustbase, do that?
>>>
>>> No, they implement a different concept of robustness. See also
>>> https://CRAN.R-project.org/view=Robust
>>>
>>> glmrob() implements GLMs that are "robust" or rather "resistant" to
>>> outliers and other observations that do not come from the main model
>>> equation. Instead of maximum likelihood (ML) estimation other estimation
>>> techniques (along with corresponding covariances/standard errors) are
>>> used.
>>>
>>> In contrast, the OP asked for HAC standard errors. The motivation for
>>> these is that the main model equation does hold for all observations but
>>> that the observations might be heteroskedastic and/or autocorrelated. In
>>> this situation, ML estimation is still consistent (albeit not efficient)
>>> but the covariance matrix estimate needs to be adjusted.
>>>
>>>>
>>>> Leonardo Ferreira Fontenelle, MD, MPH
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>
>

