[R-sig-ME] nonlinear behavior of effective sample size
Ross Boylan
ross at biostat.ucsf.edu
Fri Jul 5 20:07:44 CEST 2013
Increasing the Markov chain length from 100 to 1000 did not produce a nearly proportionate boost in effective sample size in some
work I'm doing. The chain comes from a hybrid Monte Carlo sampler.
Is this as weird as it seems? Any ideas what to look for?
BTW, is this the right place for Bayesian questions?
> print(effectiveSize(MCTRACE[1:1000, 1:11]), digits=1)
  intrcpt       age   rac_gay nPartners  ethnic.1  ethnic.2     HIV.1     HIV.2
       37        42        33        28        50        46       116        49
      tau    theta1    theta2
        3        59        10
> print(effectiveSize(MCTRACE[1:100, 1:11]), digits=1)
  intrcpt       age   rac_gay nPartners  ethnic.1  ethnic.2     HIV.1     HIV.2
        7         9         8        22        26        53        27         6
      tau    theta1    theta2
        7         9         8
Two (ethnic.2 and tau) are actually worse with the longer chain, and none is 10x bigger. Here are the ratios:
> print(effectiveSize(MCTRACE[1:1000,1:11])/effectiveSize(MCTRACE[1:100, 1:11]), digits=1)
  intrcpt       age   rac_gay nPartners  ethnic.1  ethnic.2     HIV.1     HIV.2
      5.3       4.5       3.9       1.3       1.9       0.9       4.4       8.5
      tau    theta1    theta2
      0.4       6.8       1.2
On reflection it's not surprising that the relation isn't perfectly linear, since there's autocorrelation between the first 100 and the rest,
and the autocorrelations will vary somewhat since the measured autocorrelations are just statistics, but still....
The adjustment formula's denominator (details below) apparently either is or is equivalent to summing the autocorrelations at all
lags; I suppose allowing more lags in principle means the sum will increase.
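For a sense of how the sum over lags behaves, here is a minimal sketch under a strong assumption: a stationary AR(1) chain with lag-1 autocorrelation phi, for which rho[k] = phi^k. The value phi = 0.9 and the helper names are hypothetical, chosen only for illustration. In that idealized case the sum does grow as lags are added, but it converges, and the implied ESS is exactly proportional to the chain length. Under that assumption, departures from proportionality would presumably come from estimation noise or from a chain that has not yet reached stationarity.

```r
# Idealized AR(1) chain: rho[k] = phi^k (an assumption, not the real sampler)
phi <- 0.9                       # hypothetical lag-1 autocorrelation
rho <- function(k) phi^k

# Denominator of the ESS formula using lags 1..K only
tau_partial <- function(K) 1 + 2 * sum(rho(seq_len(K)))

# The partial sums increase with K but level off
sapply(c(10, 50, 99, 999), tau_partial)

# Closed-form limit of the denominator: (1 + phi) / (1 - phi)
tau_limit <- (1 + phi) / (1 - phi)

# Theoretical ESS for a chain of length S
ess_theory <- function(S) S / tau_limit

# Ratio of ESS at 1000 draws to ESS at 100 draws
ess_ratio <- ess_theory(1000) / ess_theory(100)
ess_ratio
```

With a finite chain the autocorrelations have to be estimated, so the computed ratio can stray from the theoretical one, particularly when the shorter chain has only 100 draws.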
Definitions/documentation
Definitions
ESS is usually defined as
ESS(theta) = S / (1 + 2 sum[k] rho[k] (theta)),
where S is the number of posterior samples, rho[k] is the autocorrelation at lag k, and theta is the vector of marginal posterior
samples. The infinite sum is often truncated at lag k when rho[k](theta) < 0.05. Just as with the effectiveSize function in the coda
package, the AIC argument in the ar function is used to estimate the order.
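A literal reading of the formula above can be sketched in base R as follows. This is only an illustration of the truncated autocorrelation sum, not the code used by LaplacesDemon or coda; the function name ess_acf, the simulated AR(1) chain, and its coefficient 0.8 are all assumptions made for the example.

```r
# Literal reading of ESS = S / (1 + 2 * sum_k rho[k]), truncating the sum
# at the first lag whose sample autocorrelation falls below `cutoff`.
ess_acf <- function(x, cutoff = 0.05) {
  x <- as.numeric(x)
  S <- length(x)
  # Sample autocorrelations at lags 1 .. S-1 (element 1 of $acf is lag 0)
  rho <- as.numeric(acf(x, lag.max = S - 1, plot = FALSE)$acf)[-1]
  below <- which(rho < cutoff)
  # Number of lags kept: everything before the first lag below the cutoff
  K <- if (length(below) > 0) below[1] - 1 else length(rho)
  S / (1 + 2 * sum(rho[seq_len(K)]))
}

set.seed(42)
x <- arima.sim(list(ar = 0.8), n = 2000)  # simulated stand-in for one chain
ess_full <- ess_acf(x)                    # all 2000 draws
ess_head <- ess_acf(x[1:200])             # first 200 draws only
c(full = ess_full, head = ess_head)
```

Because the truncation lag depends on noisy sample autocorrelations, the estimate from a short prefix can differ noticeably from a simple rescaling of the full-chain estimate.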
Elsewhere I find that
Estimation of the effective sample size requires estimating the
spectral density at frequency zero. This is done by the function
‘spectrum0.ar’, which in turn fits an autoregressive model.
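The approach described in that passage can be sketched in base R without loading coda. The sketch assumes the usual result that an AR model with coefficients a[1..p] and innovation variance v has long-run variance v / (1 - sum(a))^2, which plays the role of the spectral density at frequency zero (up to the 2*pi convention). The function name ess_spec0 and the simulated chain are hypothetical; this is not coda's source code.

```r
# ESS as S * var(x) / (spectral density at frequency zero),
# with the latter taken from an AR model whose order is chosen by AIC.
ess_spec0 <- function(x) {
  x <- as.numeric(x)
  fit <- ar(x, aic = TRUE)                      # Yule-Walker fit, AIC order
  spec0 <- fit$var.pred / (1 - sum(fit$ar))^2   # long-run variance of the AR fit
  length(x) * var(x) / spec0
}

set.seed(2013)
# Simulated stand-in for a single column of a trace matrix such as MCTRACE
chain <- as.numeric(arima.sim(list(ar = 0.9), n = 1000))

# Compare the estimate from the first 100 draws with the full 1000
ess_pair <- c(first100 = ess_spec0(chain[1:100]), all1000 = ess_spec0(chain))
ess_pair
```

The AR order and coefficients are re-estimated for each prefix, so the two estimates need not differ by the factor of 10 that the chain lengths suggest.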
ESS from LaplacesDemon and effectiveSize from coda give the same results.
Ross Boylan