[R] problem with mode of marginal distriubtion of rdirichlet{gtools}
Rob Carnell
carnellr at battelle.org
Sun Oct 22 19:04:28 CEST 2006
hong qin <alongway <at> gmail.com> writes:
>
> Hi all,
>
> I have a problem using rdirichlet{gtools}.
> For Dir( a1, a2, ..., a_n), its mode can be found at $( a_i -1)/ (
> \sum_{i}a_i - n)$;
> The means are $a_i / (\sum_{i} a_i ) $;
>
> I tried to study the above properties using rdirichlet from gtools. The code
> are:
>
> ##############
> library(gtools)
> alpha = c(1,3,9) #totoal=13
I agree with your assertions up to here. The proofs that I have seen for the
mode of the dirichlet assert that a_i > 1. This is important. So I propose:
alpha = c(2, 4, 7) # total = 13
And I will continue with the example, commenting out your example for
comparison...
# mean.expect = c(1/13, 3/13, 9/13)
# mode.expect = c(0, 2/10, 8/10) # this is for the overall mode.
mean.expect = c(2/13, 4/13, 7/13)
mode.expect = c(1/10, 3/10, 6/10)
You were also correct that the expected mode is for the joint distribution.
For the marginal distributions, there is a slightly different expected mode.
The marginal distributions of a dirichlet are beta distributions with an alpha
parameter equal to a_i and a beta parameter equal to (\sum{j, j!=i} a_j). The
mode of the beta distribution is (alpha - 1) / (alpha + beta - 2). The first
marginal therefore has a mode of (2 - 1) / (2 + 11 -2) = 1/11
mode.expect.beta = c(1/11, 3/11, 6/11)
I also propose that we look at the smoothed density estimate to find the mean
instead of relying on histograms which have error in their mode estimate due
to the bin size.
mode1 = numeric(3);
mode2 = numeric(3);
theta = data.frame( rdirichlet( 100000, alpha) )
m1 = mean(theta)
for( i in 1:3) {
h2 <- density(theta[,i], n=1024)
mode1[i] <- h2$x[h2$y==max(h2$y)]
}
theta = data.frame( rdirichlet( 100000, alpha) )
m2 = mean(theta)
for( i in 1:3) {
h2 <- density(theta[,i], n=1024)
mode2[i] <- h2$x[h2$y==max(h2$y)]
}
The output then shows that the density follows the mode.expect.beta.
> rbind(m1,m2,mean.expect)
X1 X2 X3
m1 0.1539895 0.3078219 0.5381886
m2 0.1537157 0.3072688 0.5390156
mean.expect 0.1538462 0.3076923 0.5384615
> rbind(mode1, mode2, mode.expect, mode.expect.beta);
[,1] [,2] [,3]
mode1 0.09231188 0.2687344 0.5426022
mode2 0.09039729 0.2757893 0.5608050
mode.expect 0.10000000 0.3000000 0.6000000
mode.expect.beta 0.09090909 0.2727273 0.5454545
>
Finally, we can show that the joint distribution mode is greater than the
joint density at the values of the modes of the marginal densities.
> ddirichlet(c(1/10, 3/10, 6/10), c(2,4,7)) # joint mode
[1] 13.96769
> ddirichlet(c(1/11, 3/11, 6/11), c(2,4,7)) # marginal modes
[1] 5.385148
>
I hope this helps!
Rob Carnell
Battelle - Statistics and Information Analysis
More information about the R-help
mailing list