[R] polr probit versus stata oprobit
Jean Eid
jeaneid at chass.utoronto.ca
Thu Nov 11 01:46:21 CET 2004
Dear All,
I have been struggling to understand why for the housing data in MASS
library R and stata give coef. estimates that are really different. I also
tried to come up with many many examples myself (see below, of course I
did not have the set.seed command included) and all of my
`random' examples seem to give verry similar output. For the housing data,
I have changed the data into numeric vectors instead of factors/ordered
factors. I did so to try and get the same results as in STATA and to have
the housing example as close as possible to the one I constructed.
I run a debian sid, kernel 2.4, R 2.0.0, and STATA version 8.2, MASS
version 7.2-8.
here's the example ( I assume that you have STATA installed and can run in
batch mode, if not the output is also given below)
library(MASS)
library(foreign)
set.seed(100)
X <- rnorm(1000)
X1 <- rnorm(1000)
X2 <- rnorm(1000)
X <- X +X1+X2
XX <- X<=quantile(X, .25)
XX[X>quantile(X, .25) & X<=quantile(X, .50)] <- 2
XX[X>quantile(X, .5) & X<=quantile(X, .75)] <- 3
XX[X>quantile(X, .75)] <- 4
temp <- data.frame(XX=XX, X1=X1, X2=X2, X=X)
summary(polr(factor(XX)~X1 +X2, data=temp, method="probit"))
write.dta(temp, "temp.dta")
####################################
#Stata stuff
####################################
cat("use temp.dta\n oprobit XX X1 X2\n", file="temp.ado")
system("stata -b do temp.ado&")
system("cat temp.log")
#
##### here's R's output
#############################
Re-fitting to get Hessian
Call:
polr(formula = factor(XX) ~ X1 + X2, data = temp, method = "probit")
Coefficients:
Value Std. Error t value
X1 0.9891735 0.04749225 20.82811
X2 0.9400804 0.04527653 20.76309
Intercepts:
Value Std. Error t value
1|2 -1.1411 0.0572 -19.9613
2|3 -0.0372 0.0486 -0.7656
3|4 1.1101 0.0579 19.1865
Residual Deviance: 1969.734
AIC: 1979.734
##############################################
#and here Stata's output
##############################################
Ordered probit estimates Number of obs =
1000
LR chi2(2) =
802.86
Prob > chi2 =
0.0000
Log likelihood = -984.86675 Pseudo R2 =
0.2896
------------------------------------------------------------------------------
XX | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
X1 | .9891651 .0474922 20.83 0.000 .8960822 1.082248
X2 | .94007 .0452765 20.76 0.000 .8513298 1.02881
-------------+----------------------------------------------------------------
_cut1 | -1.141119 .0571667 (Ancillary parameters)
_cut2 | -.0371779 .0485592
_cut3 | 1.110117 .0578593
More information about the R-help
mailing list