[R-SIG-Finance] R: [Fwd: R-SIG-Finance Digest, Vol 60, Issue 18]
Steven Archambault
archstevej at gmail.com
Mon May 18 22:06:31 CEST 2009
I just realized I used Robust in my Stata 9.2 analysis. When I remove this,
the Chi-sq values are much closer to the values I get in R (but negative, as
the consistent model must be listed first in a chi-sq calculation). However,
with my own data I do get this positive definite error in Stata. Is this a
result of unbalanced data? R doesn't give an error, so I am inclined to
ignore it in Stata. I am posting my own results from R and Stata, and
attaching the data as a csv.
Thanks, hope I am not wasting too much of your time here.
-Steve
###R-Output###
> library("plm")
>
> fdi <- read.csv("C:/data/mydata.csv", na.strings=".")
> fdiplm<-plm.data(fdi, index = c("id_code_id", "year"))
series are constants and have been removed
>
> fdi_test<-(lfdi_2000~ lagdlfdi+ laglnstock2000+ lagtradegdp +lagdlgdp)
>
> fdi_test_fe <- plm(fdi_test, data=fdiplm, model="within")
> fdi_test_re <- plm(fdi_test, data=fdiplm, model="random")
>
> summary (fdi_test_fe)
Oneway (individual) effect Within Model
Call:
plm(formula = fdi_test, data = fdiplm, model = "within")
Unbalanced Panel: n=149, T=3-27, N=2697
Residuals :
Min. 1st Qu. Median 3rd Qu. Max.
-8.2100 -0.4760 0.0452 0.5670 4.8700
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
lagdlfdi 0.1564759 0.0180645 8.6621 < 2.2e-16 ***
laglnstock2000 0.7621350 0.0246798 30.8809 < 2.2e-16 ***
lagtradegdp 0.0178568 0.0025859 6.9055 5.003e-12 ***
lagdlgdp 0.2601477 0.0427744 6.0818 1.188e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Total Sum of Squares: 4606.7
Residual Sum of Squares: 2938
F-statistic: 361.237 on 4 and 2544 DF, p-value: < 2.22e-16
> summary (fdi_test_re)
Oneway (individual) effect Random Effect Model
(Swamy-Arora's transformation)
Call:
plm(formula = fdi_test, data = fdiplm, model = "random")
Unbalanced Panel: n=149, T=3-27, N=2697
Effects:
var std.dev share
idiosyncratic 1.15487 1.07465 0.6617
individual 0.59044 0.76840 0.3383
theta :
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.3718 0.6700 0.7081 0.6955 0.7355 0.7401
Residuals :
Min. 1st Qu. Median Mean 3rd Qu. Max.
-9.15000 -0.47900 0.07270 -0.00713 0.59800 3.95000
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 16.7744214 0.1552868 108.0222 < 2.2e-16 ***
lagdlfdi 0.1632388 0.0181005 9.0185 < 2.2e-16 ***
laglnstock2000 0.8314432 0.0196444 42.3247 < 2.2e-16 ***
lagtradegdp 0.0119453 0.0020737 5.7605 8.386e-09 ***
lagdlgdp 0.2558009 0.0424599 6.0245 1.696e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Total Sum of Squares: 9522.3
Residual Sum of Squares: 3140.8
F-statistic: 1367.42 on 4 and 2692 DF, p-value: < 2.22e-16
>
> phtest(fdi_test_re, fdi_test_fe)
Hausman Test
data: fdi_test
chisq = 23.7021, df = 4, p-value = 9.164e-05
alternative hypothesis: one model is inconsistent
###end R output###
###Stata 9.2 Output--canned###
xtreg lfdi_2000 lagdlfdi laglnstock2000 lagtradegdp lagdlgdp, fe;
Fixed-effects (within) regression Number of obs =
2697
Group variable (i): id_code_id Number of groups =
149
R-sq: within = 0.3622 Obs per group: min =
3
between = 0.8234 avg =
18.1
overall = 0.6998 max =
27
F(4,2544) =
361.24
corr(u_i, Xb) = 0.3536 Prob > F =
0.0000
------------------------------------------------------------------------------
lfdi_2000 | Coef. Std. Err. t P>|t| [95% Conf.
Interval]
-------------+----------------------------------------------------------------
lagdlfdi | .1564758 .0180645 8.66 0.000 .1210532
.1918985
laglnst~2000 | .762135 .0246798 30.88 0.000 .7137404
.8105295
lagtradegdp | .0178568 .0025859 6.91 0.000 .0127861
.0229274
lagdlgdp | .2601478 .0427744 6.08 0.000 .1762716
.3440241
_cons | 17.01131 .1701713 99.97 0.000 16.67762
17.345
-------------+----------------------------------------------------------------
sigma_u | .93048942
sigma_e | 1.0746505
rho | .42847396 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(148, 2544) = 10.73 Prob > F =
0.0000
. estimates store FIX, title(The FE) ;
. xtreg lfdi_2000 lagdlfdi laglnstock2000 lagtradegdp lagdlgdp, re;
Random-effects GLS regression Number of obs =
2697
Group variable (i): id_code_id Number of groups =
149
R-sq: within = 0.3606 Obs per group: min =
3
between = 0.8402 avg =
18.1
overall = 0.7128 max =
27
Random effects u_i ~ Gaussian Wald chi2(4) =
2225.46
corr(u_i, X) = 0 (assumed) Prob > chi2 =
0.0000
------------------------------------------------------------------------------
lfdi_2000 | Coef. Std. Err. z P>|z| [95% Conf.
Interval]
-------------+----------------------------------------------------------------
lagdlfdi | .1631662 .0180937 9.02 0.000 .1277032
.1986291
laglnst~2000 | .830845 .0196843 42.21 0.000 .7922645
.8694255
lagtradegdp | .011992 .0020779 5.77 0.000 .0079195
.0160645
lagdlgdp | .2558113 .0424486 6.03 0.000 .1726136
.3390091
_cons | 16.77702 .1556693 107.77 0.000 16.47191
17.08212
-------------+----------------------------------------------------------------
sigma_u | .77431228
sigma_e | 1.0746505
rho | .34173973 (fraction of variance due to u_i)
------------------------------------------------------------------------------
. estimates store RAND, title(The RE) ;
. hausman FIX RAND;
---- Coefficients ----
| (b) (B) (b-B)
sqrt(diag(V_b-V_B))
| FIX RAND Difference S.E.
-------------+----------------------------------------------------------------
lagdlfdi | .1564758 .1631662 -.0066903 .
laglnst~2000 | .762135 .830845 -.06871 .014887
lagtradegdp | .0178568 .011992 .0058648 .0015393
lagdlgdp | .2601478 .2558113 .0043365 .0052695
------------------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from
xtreg
B = inconsistent under Ha, efficient under Ho; obtained from
xtreg
Test: Ho: difference in coefficients not systematic
chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 22.94
Prob>chi2 = 0.0001
(V_b-V_B is not positive definite)
###End Stata 9.2####
On Mon, May 18, 2009 at 12:26 PM, Steven Archambault
<archstevej at gmail.com>wrote:
> Giovani,
>
> Thank you so much for your comments. I am a bit new to R, and to these
> mailing lists, so I apologize for being sparse on the details and examples.
> I am using Stata 9.2, which might be the answer to my problem, as you
> described. I have done quite a bit of internet searching, and did not read
> anywhere about the use of a different method for calculating the chi-sq
> value, so thanks for that.
>
> One more issue I have been thinking about. I am assuming your Plm package
> knows that the FE is the consistient model, as the same results arrive if
> the code is phtest(femod, remod) or phtest(remod, femod). The order does
> matter in Stata.
>
> For complteness I am going to post my results using the same Grumfeld
> dataset for both stata 9.2 (by hand calculation and canned procedure) and
> R. I am using the Plm package version 1 1-2.
>
> Regards,
> Steve
>
>
>
> ## begin Stata9.2 output##
> xtreg inv value capital, robust re;
>
> Random-effects GLS regression Number of obs =
> 200
> Group variable (i): firmid Number of groups =
> 10
>
> R-sq: within = 0.7668 Obs per group: min =
> 20
> between = 0.8196 avg =
> 20.0
> overall = 0.8061 max =
> 20
>
> Random effects u_i ~ Gaussian Wald chi2(3) =
> 77.70
>
> corr(u_i, X) = 0 (assumed) Prob > chi2 =
> 0.0000
>
>
> ------------------------------------------------------------------------------
> | Robust
> invest | Coef. Std. Err. z P>|z| [95% Conf.
> Interval]
>
> -------------+----------------------------------------------------------------
> value | .1097811 .0197587 5.56 0.000 .0710547
> .1485076
> capital | .308113 .0418387 7.36 0.000 .2261107
> .3901153
> _cons | -57.83441 24.67795 -2.34 0.019 -106.2023
> -9.466507
>
> -------------+----------------------------------------------------------------
>
> sigma_u | 84.20095
> sigma_e | 52.767964
> rho | .71800838 (fraction of variance due to u_i)
>
> ------------------------------------------------------------------------------
>
> . matrix bfe=e(b);
>
> . matrix vfe=e(V);
>
> . estimates store remod;
>
> . xtreg inv value capital, robust fe;
>
> Fixed-effects (within) regression Number of obs =
> 200
> Group variable (i): firmid Number of groups =
> 10
>
> R-sq: within = 0.7668 Obs per group: min =
> 20
> between = 0.8194 avg =
> 20.0
> overall = 0.8060 max =
> 20
>
> F(2,188) =
> 40.23
>
> corr(u_i, Xb) = -0.1517 Prob > F =
> 0.0000
>
> ------------------------------------------------------------------------------
> | Robust
> invest | Coef. Std. Err. t P>|t| [95% Conf.
> Interval]
>
> -------------+----------------------------------------------------------------
> value | .1101238 .019378 5.68 0.000 .0718975
> .1483501
> capital | .3100653 .042795 7.25 0.000 .2256452
> .3944854
> _cons | -58.74393 23.37422 -2.51 0.013 -104.8534
> -12.63449
>
> -------------+----------------------------------------------------------------
> sigma_u | 85.732501
> sigma_e | 52.767964
> rho | .72525012 (fraction of variance due to u_i)
>
> ------------------------------------------------------------------------------
>
> ###Hausman by hand###
>
> . estimates store femod;
>
> . matrix vre=e(V);
>
> . matrix bre=e(b);
>
> . matrix bdif=bfe-bre;
>
> . matrix list bdif;
>
> bdif[1,3]
> value capital _cons
> y1 -.00034265 -.00195236 .90952273
>
> . matrix bdifp=bdif';
>
> . matrix dv=vfe-vre;
>
> . matrix dvi=inv(dv);
>
> . matrix list bdif;
>
> bdif[1,3]
> value capital _cons
> y1 -.00034265 -.00195236 .90952273
>
> . matrix list bdifp;
>
> bdifp[3,1]
> y1
> value -.00034265
> capital -.00195236
> _cons .90952273
>
> . matrix list dvi;
>
> symmetric dvi[3,3]
> value capital _cons
> value -7739.3615
> capital 5808.2905 -5305.811
> _cons 3.6641311 .98569198 -.00051157
>
> . matrix chisq=bdif*dvi*bdifp;
>
> . matrix list chisq;
>
> symmetric chisq[1,1]
> y1
> y1 -.01956929
> ###Hausman canned###
> . hausman femod remod;
>
> ---- Coefficients ----
> | (b) (B) (b-B)
> sqrt(diag(V_b-V_B))
> | femod remod Difference S.E.
>
> -------------+----------------------------------------------------------------
> value | .1101238 .1097811 .0003427 .
> capital | .3100653 .308113 .0019524 .0089965
>
> ------------------------------------------------------------------------------
> b = consistent under Ho and Ha; obtained from
> xtreg
> B = inconsistent under Ha, efficient under Ho; obtained from
> xtreg
>
> Test: Ho: difference in coefficients not systematic
>
> chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
> = -0.01 chi2<0 ==> model fitted on these
> data fails to meet the asymptotic
> assumptions of the Hausman test;
> see suest for a generalized test ##
> end Stata9.2 output ##
>
> ##begin Output R, using PLM 1.1-2###
>
> > test<-data(Grunfeld, package="Ecdat")
> >
> > fm <- inv~value+capital
> > femod <- plm(fm, Grunfeld, model="within")
> > summary(femod)
> Oneway (individual) effect Within Model
>
> Call:
> plm(formula = fm, data = Grunfeld, model = "within")
>
> Balanced Panel: n=10, T=20, N=200
>
> Residuals :
> Min. 1st Qu. Median 3rd Qu. Max.
> -184.000 -17.600 0.563 19.200 251.000
>
> Coefficients :
> Estimate Std. Error t-value Pr(>|t|)
> value 0.110124 0.011857 9.2879 < 2.2e-16 ***
> capital 0.310065 0.017355 17.8666 < 2.2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Total Sum of Squares: 2244400
> Residual Sum of Squares: 523480
> F-statistic: 309.014 on 2 and 188 DF, p-value: < 2.22e-16
>
> > remod <- plm(fm, Grunfeld, model="random")
> > summary(remod)
> Oneway (individual) effect Random Effect Model
> (Swamy-Arora's transformation)
>
> Call:
> plm(formula = fm, data = Grunfeld, model = "random")
>
> Balanced Panel: n=10, T=20, N=200
>
> Effects:
> var std.dev share
> idiosyncratic 2784.458 52.768 0.282
> individual 7089.800 84.201 0.718
> theta: 0.86122
>
> Residuals :
> Min. 1st Qu. Median 3rd Qu. Max.
> -178.00 -19.70 4.69 19.50 253.00
>
> Coefficients :
> Estimate Std. Error t-value Pr(>|t|)
> (Intercept) -57.834415 28.898935 -2.0013 0.04536 *
> value 0.109781 0.010493 10.4627 < 2e-16 ***
> capital 0.308113 0.017180 17.9339 < 2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Total Sum of Squares: 2381400
> Residual Sum of Squares: 548900
> F-statistic: 328.837 on 2 and 197 DF, p-value: < 2.22e-16
> > phtest(femod, remod)
>
> Hausman Test
>
> data: fm
> chisq = 2.3304, df = 2, p-value = 0.3119
> alternative hypothesis: one model is inconsistent
>
> ###end Plm###
>
>
>
>
>
> On Mon, May 18, 2009 at 6:01 AM, Millo Giovanni <
> Giovanni_Millo at generali.com> wrote:
>
>> Dear Steve,
>>
>> I got your inquiry courtesy of Christian Kleiber, who brought it to our
>> attention: please next time you post anything re a given package,
>> include the maintainer's address. We cannot guarantee to parse all the
>> daily digests of the R system!
>>
>> Your problem: can you please provide a reproducible example? Else it is
>> difficult to help, not knowing your data, your results and even the
>> Stata version you're using.
>>
>> In the following I replicate what you might have done on a well-known
>> dataset.
>>
>> From Stata10, on the usual Grunfeld data taken from package "Ecdat":
>>
>> ## begin Stata10 output ##
>> . xtreg inv value capital
>>
>> Random-effects GLS regression Number of obs =
>> 200
>> Group variable: firm Number of groups =
>> 10
>>
>> R-sq: within = 0.7668 Obs per group: min =
>> 20
>> between = 0.8196 avg =
>> 20.0
>> overall = 0.8061 max =
>> 20
>>
>> Random effects u_i ~ Gaussian Wald chi2(2) =
>> 657.67
>> corr(u_i, X) = 0 (assumed) Prob > chi2 =
>> 0.0000
>>
>> ------------------------------------------------------------------------
>> ------
>> inv | Coef. Std. Err. z P>|z| [95% Conf.
>> Interval]
>> -------------+----------------------------------------------------------
>> ------
>> value | .1097811 .0104927 10.46 0.000 .0892159
>> .1303464
>> capital | .308113 .0171805 17.93 0.000 .2744399
>> .3417861
>> _cons | -57.83441 28.89893 -2.00 0.045 -114.4753
>> -1.193537
>> -------------+----------------------------------------------------------
>> ------
>> sigma_u | 84.20095
>> sigma_e | 52.767964
>> rho | .71800838 (fraction of variance due to u_i)
>> ------------------------------------------------------------------------
>> ------
>>
>> . estimates store remod
>>
>> . xtreg inv value capital, fe
>>
>> Fixed-effects (within) regression Number of obs =
>> 200
>> Group variable: firm Number of groups =
>> 10
>>
>> R-sq: within = 0.7668 Obs per group: min =
>> 20
>> between = 0.8194 avg =
>> 20.0
>> overall = 0.8060 max =
>> 20
>>
>> F(2,188) =
>> 309.01
>> corr(u_i, Xb) = -0.1517 Prob > F =
>> 0.0000
>>
>> ------------------------------------------------------------------------
>> ------
>> inv | Coef. Std. Err. t P>|t| [95% Conf.
>> Interval]
>> -------------+----------------------------------------------------------
>> ------
>> value | .1101238 .0118567 9.29 0.000 .0867345
>> .1335131
>> capital | .3100653 .0173545 17.87 0.000 .2758308
>> .3442999
>> _cons | -58.74393 12.45369 -4.72 0.000 -83.31086
>> -34.177
>> -------------+----------------------------------------------------------
>> ------
>> sigma_u | 85.732501
>> sigma_e | 52.767964
>> rho | .72525012 (fraction of variance due to u_i)
>> ------------------------------------------------------------------------
>> ------
>> F test that all u_i=0: F(9, 188) = 49.18 Prob > F =
>> 0.0000
>>
>> . estimates store femod
>>
>> . hausman femod remod
>>
>> ---- Coefficients ----
>> | (b) (B) (b-B)
>> sqrt(diag(V_b-V_B))
>> | femod remod Difference S.E.
>> -------------+----------------------------------------------------------
>> ------
>> value | .1101238 .1097811 .0003427 .0055213
>> capital | .3100653 .308113 .0019524 .0024516
>> ------------------------------------------------------------------------
>> ------
>> b = consistent under Ho and Ha; obtained from
>> xtreg
>> B = inconsistent under Ha, efficient under Ho; obtained from
>> xtreg
>>
>> Test: Ho: difference in coefficients not systematic
>>
>> chi2(2) = (b-B)'[(V_b-V_B)^(-1)](b-B)
>> = 2.33
>> Prob>chi2 = 0.3119
>>
>> .
>> ## end Stata10 output ##
>>
>> while from plm I get
>>
>> ## begin R putput ##
>> > data(Grunfeld, package="Ecdat")
>> > fm <- inv~value+capital
>> >
>> > femod <- plm(fm, Grunfeld)
>> > remod <- plm(fm, Grunfeld, model="random")
>> >
>> > phtest(femod, remod)
>>
>> Hausman Test
>>
>> data: fm
>> chisq = 2.3304, df = 2, p-value = 0.3119
>> alternative hypothesis: one model is inconsistent
>>
>> ## end R output ##
>>
>> which, besides testifying to the goodness and parsimony of an
>> object-oriented approach as far as screen output is concerned, looks
>> rather consistent to me.
>>
>> I cannot but guess that the problem might stem from different RE
>> estimates: previous versions of Stata used the Wallace-Hussein method by
>> default for computing the variance of random effects. Now Stata uses
>> Swamy-Arora, which has been the default of 'plm' since the beginning.
>> Yet as plm() allows to choose, you can experiment with different values
>> for the 'random.method' argument in order to see if you get the Stata
>> result. I suggest you start by comparing the coefficient estimates you
>> get from Stata and R: FE should be unambiguous, RE might vary as said
>> above, and for good reason.
>>
>> You also didn't tell us whether your by-hand calculation agrees with
>> phtest() output? (I guess it does not)
>>
>> Please let us know, possibly with a reproducible example and providing
>> all the above info
>> Giovanni
>>
>> PS please also make sure you're not using any VEEEEERY old version of
>> 'plm' (prior to, say, 0.3): these had a bug in the p-value calculation
>> which made it depend on the order of models compared (so that in the
>> wrong case you got p.value=1).
>>
>> Giovanni Millo
>> Research Dept.,
>> Assicurazioni Generali SpA
>> Via Machiavelli 4,
>> 34132 Trieste (Italy)
>> tel. +39 040 671184
>> fax +39 040 671160
>>
>> > ----------------------------------------------------------------------
>> > --
>> >
>> > Subject:
>> > [R-SIG-Finance] Chi-sq Hausman test---R vs Stata
>> > From:
>> > Steven Archambault <archstevej at gmail.com>
>> > Date:
>> > Sun, 17 May 2009 23:14:13 -0600
>> > To:
>> > r-sig-finance at stat.math.ethz.ch
>> >
>> > To:
>> > r-sig-finance at stat.math.ethz.ch
>> >
>> >
>> > Hi all,
>> >
>> > I am running a panel time series regression testing Fixed Effects and
>> > Random Effects. I decided to calculate the chi-sq value for the
>> > Hausman test in both R (Phtest) and Stata. I get different results.
>> > Even within Stata, calculating the Chi-sq value with the canned
>> > procedure or by hand (using
>> > matrices) gives different results. So, the question should come up
>> there as
>> > well.
>> >
>> > Does anybody have any insight on how to pick which results to use? I
>> > guess the one that gives the result I want? Having different programs
>> > give quite different values for the same tests is frustrating me. I'd
>>
>> > be interested in any feedback folks have!
>> >
>> > Thanks,
>> > Steve
>> >
>> > [[alternative HTML version deleted]]
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://stat.ethz.ch/pipermail/r-sig-finance/attachments/20090518/c6b65272/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fdi_data.csv
Type: application/vnd.ms-excel
Size: 174320 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-finance/attachments/20090518/c6b65272/attachment.xlb>
More information about the R-SIG-Finance
mailing list