[R] Fitting Beta Distribution

Thu Dec 21 12:12:57 CET 2017

I answer my own question: I had overlooked the fact that the normalization
factor is also a function of the parameters I want to optimise, hence I
should write

dbeta2 <- function(x, shape){

    res <- x^(shape-1)*(1-x)^(shape-1)/beta(shape, shape)

    return(res)

}

after which the results are consistent.

---------- Forwarded message ----------
From: Lorenzo Isella <lorenzo.isella at yopmail.com>
Date: 21 December 2017 at 11:29
Subject: Fitting Beta Distribution
To: "r-help at r-project.org" <r-help at r-project.org>

Dear All,
I need to fit a custom probability density (based on the symmetric beta
distribution B(shape, shape), where the two parameters shape1 and shape2
are identical) to my data.
The trouble is that I experience some problems also when dealing with the
plain vanilla symmetric beta distribution.
Please consider the code at the end of the email.
In the code, dbeta1 is the density of the beta distribution for
shape1=shape2=shape.
In the code, dbeta2 is the same quantity written explicitly, without the
normalization factor (which should not matter at all if we talk about
maximizing a quantity).

I then generate some random numbers according to Beta(0.2, 0.2) and I try
to estimate the shape parameter using

1) fitdistr from MASS
2) mle from stats4

Results: generally speaking I have non-sense estimates of the shape
parameter when I use dbeta2 instead of dbeta1 and I do not understand why.
On top of that, mle crashes with dbeta2 and often I have numerical problems
depending on how I seed the x sequence of random numbers.

I must be misunderstanding something, so any suggestion is appreciated.
Cheers

Lorenzo

#########################################################################

library(MASS)
library(stats4)

dbeta1 <- function(x, shape, ...)
       dbeta(x, shape, shape, ...)

dbeta2 <- function(x, shape){

    res <- x^(shape-1)*(1-x)^(shape-1)

    return(res)

}

LL1 <- function(shape){

    R <- dbeta1(x, shape)

    res <- -sum(log(R))

    return(res)

}

LL2 <- function(shape){

    R <- dbeta2(x, shape)

    res <- -sum(log(R))

    return(res)

}

set.seed(124)

x <-rbeta(1000, 0.2, 0.2)

fit_dbeta1 <- fitdistr( x , dbeta1, start=list(shape=0.5) , method="Brent",
lower=c(0), upper=c(1))

print("estimate of shape from fit_dbeta1 is")
print(fit_dbeta1$estimate)

fit_dbeta2 <- fitdistr( x , dbeta2, start=list(shape=0.5) , method="Brent",
lower=c(0), upper=c(1))

print("estimate of shape from fit_dbeta2 is")
print(fit_dbeta2$estimate)

fit_LL1 <- mle(LL1, start=list(shape=0.5))

print("estimate of from fit_LL1")
print(summary(fit_LL1))

## this does not work

fit_LL2 <- mle(LL2, start=list(shape=0.5))

	[[alternative HTML version deleted]]