[R-sig-ME] MCMCglmm for binomial models?

Whit Armstrong armstrong.whit at gmail.com
Tue Sep 21 21:41:06 CEST 2010


Bob,

Just wanted to post the results from my project.  It is still very
much alpha quality software, but the speed is good.  Your sample model
runs in 22secs on my workstation.  I ran the model in JAGS earlier but
it took ages (20min or so).

These are my results which are closer to the MCMCglmm run than the
WinBUGS run.  Eventually, I'll try to make it easy to run these models
directly from R.

The full code for this model is posted in-line below.  However, the
guts are just two functions which just separate the winbugs model into
separate "update" and "logp" functions:

  void update() {
    phi.value = b0.value + b_period2.value*period2 +
b_period3.value*period3 + b_period4.value*period4 +
sum(permutation_matrix*b_herd.value,1) + overdisp.value;
    phi.value = 1/(1+exp(-phi.value));
    sigma_overdisp.value = 1/sqrt(tau_overdisp.value);
    sigma_b_herd.value = 1/sqrt(tau_b_herd.value);
  }
  double logp() const {
    return b0.logp() + b_period2.logp() + b_period3.logp() +
b_period4.logp() + tau_overdisp.logp() + tau_b_herd.logp() +
b_herd.logp(0, tau_b_herd.value) +
      overdisp.logp(0,tau_overdisp.value) + likelihood.logp(size,phi.value);
  }

The speed of the model could probably be improved by consolidating all
the b's into a vector.  I'll try that later on.  Code for this model,
and some other examples will go up on github sometime in the near
future.

I've also cc'd Chris Fonnesbeck, since he may want to chime in on the
pymc front.  I haven't had time to code up an equivalent pymc model,
but it should be very easy to do.

Feedback welcome.

-Whit


results:
warmstrong at krypton:~/dvl/c++/CppBugs/test$ time ./a.out
samples: 17999
b0: -1.52515
b_period2: -1.21971
b_period3: -1.32685
b_period4: -1.88475
tau_overdisp: 1.59118
tau_b_herd: 25.9476
sigma_overdisp: 0.886191
sigma_b_herd: 0.258111
b_herd:
   0.1623
  -0.0524
   0.0852
   0.0080
  -0.0506
  -0.0953
   0.1859
   0.0835
  -0.0690
  -0.0897
   0.0096
  -0.0183
  -0.1496
   0.1133
  -0.1061


real	0m21.930s
user	0m21.920s
sys	0m0.000s
warmstrong at krypton:~/dvl/c++/CppBugs/test$


model code:
class HerdModel: public MCModel {
  const ivec incidence;
  const ivec size;
  const ivec herd;
  const vec period2;
  const vec period3;
  const vec period4;
  int N, N_herd;
  mat permutation_matrix;

public:
  NormalStatic<double> b0;
  NormalStatic<double> b_period2;
  NormalStatic<double> b_period3;
  NormalStatic<double> b_period4;
  UniformStatic<double> tau_overdisp;
  UniformStatic<double> tau_b_herd;
  Deterministic<double> sigma_overdisp;
  Deterministic<double> sigma_b_herd;
  Normal<vec> b_herd;
  Normal<vec> overdisp;
  Deterministic<vec> phi;
  Binomial<ivec> likelihood;


  HerdModel(const ivec& incidence_,const ivec& size_,const ivec& herd_,
            const vec& period2_,const vec& period3_,const vec&
period4_, int N_, int N_herd_):
    incidence(incidence_),size(size_),herd(herd_),
    period2(period2_),period3(period3_),period4(period4_),
    N(N_),N_herd(N_herd_),permutation_matrix(N,N_herd),
    b0(0,0,0.001),b_period2(0,0,0.001),b_period3(0,0,0.001),b_period4(0,0,0.001),
    tau_overdisp(1,0,1000),tau_b_herd(1,0,100),
    sigma_overdisp(1),sigma_b_herd(1),
    b_herd(randn<vec>(N_herd_)),overdisp(randn<vec>(N)),
    phi(randu<vec>(N)),
    likelihood(incidence_,true)
  {
    permutation_matrix.fill(0.0);
    for(uint i = 0; i < herd.n_elem; i++) {
      permutation_matrix(i,herd[i]) = 1.0;
    }
    add(b0);
    add(b_period2);
    add(b_period3);
    add(b_period4);
    add(tau_overdisp);
    add(tau_b_herd);
    add(sigma_overdisp);
    add(sigma_b_herd);
    add(b_herd);
    add(overdisp);
    add(phi);
    add(likelihood);
  }

  void update() {
    phi.value = b0.value + b_period2.value*period2 +
b_period3.value*period3 + b_period4.value*period4 +
sum(permutation_matrix*b_herd.value,1) + overdisp.value;
    phi.value = 1/(1+exp(-phi.value));
    sigma_overdisp.value = 1/sqrt(tau_overdisp.value);
    sigma_b_herd.value = 1/sqrt(tau_b_herd.value);
  }
  double logp() const {
    return b0.logp() + b_period2.logp() + b_period3.logp() +
b_period4.logp() + tau_overdisp.logp() + tau_b_herd.logp() +
b_herd.logp(0, tau_b_herd.value) +
      overdisp.logp(0,tau_overdisp.value) + likelihood.logp(size,phi.value);
  }
};



On Mon, Sep 20, 2010 at 5:47 PM, Bob Farmer <farmerb at gmail.com> wrote:
> Thanks for the further detail.  I wasn't familiar with these two
> methods of overdispersion; I think I only understood them in the
> WinBUGS sense, which, I believe, is additive (e.g. overdisp[i] as a
> parameter).  I'll have to explore the family() helpfiles a bit more.
>
> Below is a summary of the model outputs from the three methods
> described earlier.  Note that here, I bumped the WinBUGS iterations to
> 750E3 to reduce the Rhats a bit further.
> I also use an adaptation of the bugsParallel function to make my
> WinBUGS runs 3x faster (time for this run:  5.8 minutes); I can
> provide details if anybody else would like to use this function (and
> some other bugs summary functions I've written).
> Hope this is useful or interesting -- I appreciate this discussion!
>
> --Bob
>
> **glmer**
>> summary(gm3)@coefs
>              Estimate Std. Error   t value
> (Intercept) -1.3985351 0.01698321 -82.34810
> period2     -0.9923347 0.02275838 -43.60304
> period3     -1.1286754 0.02429833 -46.45075
> period4     -1.5803739 0.03195596 -49.45474
>
> **MCMCglmm**
>> summary(mc3)
>  Iterations = 749351
>  Thinning interval  = 100001
>  Sample size  = 1000
>  DIC: 539.7889
>  G-structure:  ~herd
>     post.mean  l-95% CI u-95% CI eff.samp
> herd    0.2894 1.309e-06    1.035     1000
>  R-structure:  ~units
>      post.mean l-95% CI u-95% CI eff.samp
> units      0.93 0.003825     1.98     1000
>  Location effects: cbind(incidence, size - incidence) ~ period
>            post.mean l-95% CI u-95% CI eff.samp  pMCMC
> (Intercept)   -1.5263  -2.2221  -0.9602   1000.0 <0.001 ***
> period2       -1.2407  -2.2072  -0.2449    947.2  0.012 *
> period3       -1.3574  -2.4534  -0.3472   1000.0  0.012 *
> period4       -1.9126  -3.2662  -0.8298   1136.0  0.002 **
>
> **WinBUGS**
>> bug3$summary[which(rownames(bug3$summary) %in% params), c("mean", "sd", "2.5%", "97.5%", "Rhat", "n.eff")]
>                     mean        sd        2.5%      97.5%     Rhat n.eff
> B.0            -1.6070301 0.3556674 -2.33980000 -0.9947425 1.039548    69
> B.period2      -1.2761798 0.4895902 -2.14292500 -0.2391200 1.018612   210
> B.period3      -1.5100405 0.6337780 -2.63680000 -0.2266425 1.123427    22
> B.period4      -2.0678331 0.6539004 -3.31892500 -0.8586700 1.018622   270
> sigma.overdisp  1.1975969 0.3147531  0.69358404  1.8740000 1.045062    55
> sigma.b.herd    0.5120043 0.3825848  0.01658429  1.3659249 1.542432     7
> (note that I can provide a link to parameter histograms, if you want
> more detail)
>
> --Bob
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>




More information about the R-sig-mixed-models mailing list