[R-sig-ME] MCMCglmm for binomial models?
Whit Armstrong
armstrong.whit at gmail.com
Tue Sep 21 22:26:57 CEST 2010
so much for intuition. consolidating the b's doesn't make any
material speed difference, but it does simplify the model code a bit.
btw, this run and the previous post were using 1e6 interations, 1e5
burn-in, and 50 thin: m.sample(1e6,1e5,50).
-Whit
warmstrong at krypton:~/dvl/c++/CppBugs/test$ g++ -I.. -Wall -O2
herd.fast.cpp -llapack
warmstrong at krypton:~/dvl/c++/CppBugs/test$ time ./a.out
samples: 17999
b:
-1.5077
-1.2344
-1.3518
-1.8819
tau_overdisp: 1.54806
tau_b_herd: 22.059
sigma_overdisp: 0.878115
sigma_b_herd: 0.245386
b_herd:
0.1464
-0.0475
0.0739
0.0070
-0.0353
-0.0866
0.1661
0.0658
-0.0585
-0.0904
0.0052
-0.0138
-0.1324
0.0947
-0.0902
real 0m21.905s
user 0m21.880s
sys 0m0.020s
warmstrong at krypton:~/dvl/c++/CppBugs/test$
updated code:
class HerdModel: public MCModel {
const ivec incidence;
const ivec size;
const ivec herd;
const mat fixed;
int N, N_herd;
mat permutation_matrix;
public:
NormalStatic<vec> b;
UniformStatic<double> tau_overdisp;
UniformStatic<double> tau_b_herd;
Deterministic<double> sigma_overdisp;
Deterministic<double> sigma_b_herd;
Normal<vec> b_herd;
Normal<vec> overdisp;
Deterministic<vec> phi;
Binomial<ivec> likelihood;
HerdModel(const ivec& incidence_,const ivec& size_,const ivec&
herd_,const mat& fixed_,int N_, int N_herd_):
incidence(incidence_),size(size_),herd(herd_),
fixed(fixed_),N(N_),N_herd(N_herd_),permutation_matrix(N,N_herd),
b(randn<vec>(4),0,0.001),tau_overdisp(1,0,1000),tau_b_herd(1,0,100),
sigma_overdisp(1),sigma_b_herd(1),
b_herd(randn<vec>(N_herd_)),overdisp(randn<vec>(N)),
phi(randu<vec>(N)),
likelihood(incidence_,true)
{
permutation_matrix.fill(0.0);
for(uint i = 0; i < herd.n_elem; i++) {
permutation_matrix(i,herd[i]) = 1.0;
}
add(b);
add(tau_overdisp);
add(tau_b_herd);
add(sigma_overdisp);
add(sigma_b_herd);
add(b_herd);
add(overdisp);
add(phi);
add(likelihood);
}
void update() {
phi.value = fixed*b.value + sum(permutation_matrix*b_herd.value,1)
+ overdisp.value;
phi.value = 1/(1+exp(-phi.value));
sigma_overdisp.value = 1/sqrt(tau_overdisp.value);
sigma_b_herd.value = 1/sqrt(tau_b_herd.value);
}
double logp() const {
return b.logp() + tau_overdisp.logp() + tau_b_herd.logp() +
b_herd.logp(0, tau_b_herd.value) +
overdisp.logp(0,tau_overdisp.value) + likelihood.logp(size,phi.value);
}
};
On Tue, Sep 21, 2010 at 3:41 PM, Whit Armstrong
<armstrong.whit at gmail.com> wrote:
> Bob,
>
> Just wanted to post the results from my project. It is still very
> much alpha quality software, but the speed is good. Your sample model
> runs in 22secs on my workstation. I ran the model in JAGS earlier but
> it took ages (20min or so).
>
> These are my results which are closer to the MCMCglmm run than the
> WinBUGS run. Eventually, I'll try to make it easy to run these models
> directly from R.
>
> The full code for this model is posted in-line below. However, the
> guts are just two functions which just separate the winbugs model into
> separate "update" and "logp" functions:
>
> void update() {
> phi.value = b0.value + b_period2.value*period2 +
> b_period3.value*period3 + b_period4.value*period4 +
> sum(permutation_matrix*b_herd.value,1) + overdisp.value;
> phi.value = 1/(1+exp(-phi.value));
> sigma_overdisp.value = 1/sqrt(tau_overdisp.value);
> sigma_b_herd.value = 1/sqrt(tau_b_herd.value);
> }
> double logp() const {
> return b0.logp() + b_period2.logp() + b_period3.logp() +
> b_period4.logp() + tau_overdisp.logp() + tau_b_herd.logp() +
> b_herd.logp(0, tau_b_herd.value) +
> overdisp.logp(0,tau_overdisp.value) + likelihood.logp(size,phi.value);
> }
>
> The speed of the model could probably be improved by consolidating all
> the b's into a vector. I'll try that later on. Code for this model,
> and some other examples will go up on github sometime in the near
> future.
>
> I've also cc'd Chris Fonnesbeck, since he may want to chime in on the
> pymc front. I haven't had time to code up an equivalent pymc model,
> but it should be very easy to do.
>
> Feedback welcome.
>
> -Whit
>
>
> results:
> warmstrong at krypton:~/dvl/c++/CppBugs/test$ time ./a.out
> samples: 17999
> b0: -1.52515
> b_period2: -1.21971
> b_period3: -1.32685
> b_period4: -1.88475
> tau_overdisp: 1.59118
> tau_b_herd: 25.9476
> sigma_overdisp: 0.886191
> sigma_b_herd: 0.258111
> b_herd:
> 0.1623
> -0.0524
> 0.0852
> 0.0080
> -0.0506
> -0.0953
> 0.1859
> 0.0835
> -0.0690
> -0.0897
> 0.0096
> -0.0183
> -0.1496
> 0.1133
> -0.1061
>
>
> real 0m21.930s
> user 0m21.920s
> sys 0m0.000s
> warmstrong at krypton:~/dvl/c++/CppBugs/test$
>
>
> model code:
> class HerdModel: public MCModel {
> const ivec incidence;
> const ivec size;
> const ivec herd;
> const vec period2;
> const vec period3;
> const vec period4;
> int N, N_herd;
> mat permutation_matrix;
>
> public:
> NormalStatic<double> b0;
> NormalStatic<double> b_period2;
> NormalStatic<double> b_period3;
> NormalStatic<double> b_period4;
> UniformStatic<double> tau_overdisp;
> UniformStatic<double> tau_b_herd;
> Deterministic<double> sigma_overdisp;
> Deterministic<double> sigma_b_herd;
> Normal<vec> b_herd;
> Normal<vec> overdisp;
> Deterministic<vec> phi;
> Binomial<ivec> likelihood;
>
>
> HerdModel(const ivec& incidence_,const ivec& size_,const ivec& herd_,
> const vec& period2_,const vec& period3_,const vec&
> period4_, int N_, int N_herd_):
> incidence(incidence_),size(size_),herd(herd_),
> period2(period2_),period3(period3_),period4(period4_),
> N(N_),N_herd(N_herd_),permutation_matrix(N,N_herd),
> b0(0,0,0.001),b_period2(0,0,0.001),b_period3(0,0,0.001),b_period4(0,0,0.001),
> tau_overdisp(1,0,1000),tau_b_herd(1,0,100),
> sigma_overdisp(1),sigma_b_herd(1),
> b_herd(randn<vec>(N_herd_)),overdisp(randn<vec>(N)),
> phi(randu<vec>(N)),
> likelihood(incidence_,true)
> {
> permutation_matrix.fill(0.0);
> for(uint i = 0; i < herd.n_elem; i++) {
> permutation_matrix(i,herd[i]) = 1.0;
> }
> add(b0);
> add(b_period2);
> add(b_period3);
> add(b_period4);
> add(tau_overdisp);
> add(tau_b_herd);
> add(sigma_overdisp);
> add(sigma_b_herd);
> add(b_herd);
> add(overdisp);
> add(phi);
> add(likelihood);
> }
>
> void update() {
> phi.value = b0.value + b_period2.value*period2 +
> b_period3.value*period3 + b_period4.value*period4 +
> sum(permutation_matrix*b_herd.value,1) + overdisp.value;
> phi.value = 1/(1+exp(-phi.value));
> sigma_overdisp.value = 1/sqrt(tau_overdisp.value);
> sigma_b_herd.value = 1/sqrt(tau_b_herd.value);
> }
> double logp() const {
> return b0.logp() + b_period2.logp() + b_period3.logp() +
> b_period4.logp() + tau_overdisp.logp() + tau_b_herd.logp() +
> b_herd.logp(0, tau_b_herd.value) +
> overdisp.logp(0,tau_overdisp.value) + likelihood.logp(size,phi.value);
> }
> };
>
>
>
> On Mon, Sep 20, 2010 at 5:47 PM, Bob Farmer <farmerb at gmail.com> wrote:
>> Thanks for the further detail. I wasn't familiar with these two
>> methods of overdispersion; I think I only understood them in the
>> WinBUGS sense, which, I believe, is additive (e.g. overdisp[i] as a
>> parameter). I'll have to explore the family() helpfiles a bit more.
>>
>> Below is a summary of the model outputs from the three methods
>> described earlier. Note that here, I bumped the WinBUGS iterations to
>> 750E3 to reduce the Rhats a bit further.
>> I also use an adaptation of the bugsParallel function to make my
>> WinBUGS runs 3x faster (time for this run: 5.8 minutes); I can
>> provide details if anybody else would like to use this function (and
>> some other bugs summary functions I've written).
>> Hope this is useful or interesting -- I appreciate this discussion!
>>
>> --Bob
>>
>> **glmer**
>>> summary(gm3)@coefs
>> Estimate Std. Error t value
>> (Intercept) -1.3985351 0.01698321 -82.34810
>> period2 -0.9923347 0.02275838 -43.60304
>> period3 -1.1286754 0.02429833 -46.45075
>> period4 -1.5803739 0.03195596 -49.45474
>>
>> **MCMCglmm**
>>> summary(mc3)
>> Iterations = 749351
>> Thinning interval = 100001
>> Sample size = 1000
>> DIC: 539.7889
>> G-structure: ~herd
>> post.mean l-95% CI u-95% CI eff.samp
>> herd 0.2894 1.309e-06 1.035 1000
>> R-structure: ~units
>> post.mean l-95% CI u-95% CI eff.samp
>> units 0.93 0.003825 1.98 1000
>> Location effects: cbind(incidence, size - incidence) ~ period
>> post.mean l-95% CI u-95% CI eff.samp pMCMC
>> (Intercept) -1.5263 -2.2221 -0.9602 1000.0 <0.001 ***
>> period2 -1.2407 -2.2072 -0.2449 947.2 0.012 *
>> period3 -1.3574 -2.4534 -0.3472 1000.0 0.012 *
>> period4 -1.9126 -3.2662 -0.8298 1136.0 0.002 **
>>
>> **WinBUGS**
>>> bug3$summary[which(rownames(bug3$summary) %in% params), c("mean", "sd", "2.5%", "97.5%", "Rhat", "n.eff")]
>> mean sd 2.5% 97.5% Rhat n.eff
>> B.0 -1.6070301 0.3556674 -2.33980000 -0.9947425 1.039548 69
>> B.period2 -1.2761798 0.4895902 -2.14292500 -0.2391200 1.018612 210
>> B.period3 -1.5100405 0.6337780 -2.63680000 -0.2266425 1.123427 22
>> B.period4 -2.0678331 0.6539004 -3.31892500 -0.8586700 1.018622 270
>> sigma.overdisp 1.1975969 0.3147531 0.69358404 1.8740000 1.045062 55
>> sigma.b.herd 0.5120043 0.3825848 0.01658429 1.3659249 1.542432 7
>> (note that I can provide a link to parameter histograms, if you want
>> more detail)
>>
>> --Bob
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
More information about the R-sig-mixed-models
mailing list