[R] Problem with fitdistr for beta

Spencer Graves spencer.graves at pdf.com
Sun Jul 6 18:07:41 CEST 2003


The following adjusts only observations that are numerically exactly 0 
or 1.  It will doubtless introduce some bias.  However, the bias 
introduced will be substantially less than the bias introduced by my 
earlier suggestion.

 > a <- rbeta(100,0.1,0.1)
 > fitdistr(x=a, "beta", start=list(shape1=0.1,shape2=0.1))
Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) :
	Function cannot be evaluated at initial parameters
 >
 > range(a) - (0:1)
[1] 9.62532e-20 0.00000e+00
 > sum(a==1)
[1] 1
 >
 > a1 <- a
 > a1[a==0] <- .Machine$double.eps
 > a1[a==1] <- (1-.Machine$double.eps)
 > fitdistr(x=a1, "beta", start=list(shape1=0.1,shape2=0.1))
      shape1        shape2
   0.087937990   0.081524037
  (0.010950667) (0.009899447)

You could also program your own "densfun" argument to "fitdistr" for a 
mixture of a discrete distribution with point masses at 0 and 1 with the 
rest following a standard beta distribution.  However, I would not 
recommend that for what I understand of your application.

hope this helps.  spencer graves

alobo at ija.csic.es wrote:
> Thanks Prof. B. Ripley and
> Spencer Graves for your help.
> 
> According to the last message from 
> BR, perhaps I should use another
> distribution or combination
> of distributions to model my data,
> which look like a beta(0.1,0.1)
> but can have both 0 and 1. Any
> suggestion?
> 
> (regarding the extra "1)" in my
> original message, it was just a pasting
> problem.)
> 
> Agus
> 
> Mensaje citado por Prof Brian Ripley <ripley at stats.ox.ac.uk>:
> 
> 
>>In this example shrinking by (1 - 2e-16) leads to a significant change in 
>>the distribution: see my probability calculation.  And you can't shrink by 
>>much less.  A beta(0.1, 0.1) is barely a continuous distribution.
>>
>>On Fri, 4 Jul 2003, Spencer Graves wrote:
>>
>>
>>>My standard work-around for the kind of problem you identified is to 
>>>shrink the numbers just a little towards 0.5.  For example:
>>>
>>> > library(MASS)
>>> > a <- rbeta(100,0.1,0.1)
>>> > fitdistr(x=a, "beta", start=list(shape1=0.1,shape2=0.1))
>>>Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) :
>>>	Function cannot be evaluated at initial parameters
>>>
>>> > r.a <- range(a)
>>> > c0 <- 0
>>> > c1 <- 1
>>> > if(r.a[1]==0)c0 <- min(a[a>0])
>>> > if(r.a[2]==1)c1 <- max(a[a<1])
>>> > c. <- c(c0, 1-c1)
>>> > if(any(c.>0))c. <- min(c.[c.>0])
>>> > c.
>>>[1] 2.509104e-14
>>> > fitdistr(x=0.5*c.[1] + (1-c.[1])*a, "beta", 
>>>start=list(shape1=0.1,shape2=0.1))
>>>      shape1       shape2
>>>   0.08728921   0.10403875
>>>  (0.01044863) (0.01320188)
>>> >
>>>hope this helps.  spencer graves
>>>
>>>Prof Brian Ripley wrote:
>>>
>>>>rbeta(100,0.1,0.1) is generating samples which contain 1, an impossible 
>>>>value for a beta and hence the sample has an infinite log-likelihood.
>>>>It is clearly documented on the help page that the range is 0 < x < 1.
>>>>However, that is not so surprising as P(X > 1-1e-16) is about 1% and
>>>
>>hence 
>>
>>>>values will get rounded to one.
>>>>
>>>>The same would happen for a value of 0.
>>>>
>>>>Your code is syntactically incorrect, at least as received here.
>>>>
>>>>On Fri, 4 Jul 2003, Agustin Lobo wrote:
>>>>
>>>>
>>>>
>>>>>I have the following problem:
>>>>>
>>>>>I have a vector x of data (0<x<=1 ) with
>>>>>a U-shaped histogram and try to fit a beta
>>>>>distribution using  fitdistr. In fact,
>>>>>hist(rbeta(100,0.1,0.1)) looks a lot like
>>>>>my data.
>>>>>
>>>>>The equivalent to
>>>>>the example in the manual
>>>>>sometimes work:
>>>>>
>>>>>
>>>>>
>>>>>>a <- rbeta(100,0.1,0.1)
>>>>>>fitdistr(x=a, "beta", start=list(shape1=0.1,shape2=0.1))1)
>>>>>>    shape1       shape2
>>>>>
>>>>> 0.09444627   0.12048753
>>>>>(0.01120670) (0.01550129)
>>>>>
>>>>>but sometimes does not:
>>>>>
>>>>>
>>>>>>a <- rbeta(100,0.1,0.1)
>>>>>>fitdistr(x=a, "beta", start=list(shape1=0.1,shape2=0.1))1)
>>>>>>Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) :
>>>>>
>>>>>       Function cannot be evaluated at initial parameters
>>>>>
>>>>>Unfortunately, my data fall in the second case
>>>>>
>>>>>I've searched for any weird value that be present in the
>>>>>cases in which fitdistr exits with the error message, but
>>>>>could not find any.
>>>>>
>>>>>Any help?
>>>>>(please if anyone answers be sure to answer to my address as well,
>>>>>I cannot subscribe to the list)
>>>>>
>>>>>Thanks
>>>>>
>>>>>Agus
>>>>>
>>>>>Dr. Agustin Lobo
>>>>>Instituto de Ciencias de la Tierra (CSIC)
>>>>>Lluis Sole Sabaris s/n
>>>>>08028 Barcelona SPAIN
>>>>>tel 34 93409 5410
>>>>>fax 34 93411 0012
>>>>>alobo at ija.csic.es
>>>>>
>>>>>______________________________________________
>>>>>R-help at stat.math.ethz.ch mailing list
>>>>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>>>>
>>>>
>>>>
>>>______________________________________________
>>>R-help at stat.math.ethz.ch mailing list
>>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>>
>>
>>-- 
>>Brian D. Ripley,                  ripley at stats.ox.ac.uk
>>Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>>University of Oxford,             Tel:  +44 1865 272861 (self)
>>1 South Parks Road,                     +44 1865 272866 (PA)
>>Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>
>>
> 
> 
> 
> 
> 
> -------------------------------------------------
> This mail sent through IMP: http://horde.org/imp/
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help




More information about the R-help mailing list