[R-SIG-Finance] Different results of vcovCL (sandwich) and of cluster() in Stata

Igor Sosa Mayor joseleopoldo1792 at gmail.com
Mon Aug 7 10:05:32 CEST 2017


Hi,

(first of all, sorry for the crossposting. I sent this message to the
general R-list, but I noticed, this list is maybe more appropiate for this
question).

I'm trying to reproduce with R the results of this study:
https://learn.gold.ac.uk/mod/resource/view.php?id=262406

More precisely I want to reproduce the results of the table 6 (pag.280),
which can also be seen here:
http://picpaste.de/pics/table-robin-llKCOeWV.1501745645.png

Let's take the 4th column: we have a coeff. of 0.042 and a SE of
0.036, which represents clustered robust standard errors. If I try to
reproduce in R the analysis, I get the same coefficient, but I'm not
able to get the same SE.

The author made the stata file available here:
https://drive.google.com/file/d/0B_QoCd-1jkVXTmNDWmViWkJFdmM/edit?usp=sharing
(see: http://www.jaredcrubin.com/research)

To make the regression, he uses (as far as I can understand the stata
code) the following command:

local conditions "city != "Mainz" & city != "Wittenberg" & city != "Zürich""
local citysize "logpop1500 i.indepcity"
local demandcity "i.univ1450 i.bishop i.laymag"
probit prot`y' i.press `citysize' `demandcity' if `conditions', robust
cluster(territory)
    estpost margins, dydx(*)

I'm trying to translate this into R-code doing the following:

library(foreign)
library(dplyr)
library(sandwich)
library(margins)

# the data are here:
#
https://drive.google.com/file/d/0B_QoCd-1jkVXRGdUMTlkYTNiNGc/edit?usp=sharing
cities <- read.dta("data/Printing_and_Protestants_Data-ReStat.dta")

# we filter the data
cities <- filter(cities, !is.na(pop1500))
cities <- filter(cities, city != "Zürich" & city != "Mainz" & city !=
"Wittenberg")

# the model
m1 <- glm(formula = factor(prot1530) ~ factor(press) + logpop1500 +
                factor(indepcity) + factor(univ1450) + factor(bishop) +
factor(laymag),
            family = binomial(link = "probit"), data = cities)

# calculate margins with clustered standard errors
m1.margins <- margins(m1, vcov = vcovCL(m1, cluster=cities$territory))

I tried different types (HC1, HC2, etc), but always the value for the SE
is not the same as in the table.

Any ideas?

Many thanks in advance.

	[[alternative HTML version deleted]]



More information about the R-SIG-Finance mailing list