[R] fitting distributions using fitdistr (MASS)

David Winsemius dwinsemius at comcast.net
Wed May 4 07:30:07 CEST 2011


On May 3, 2011, at 10:03 PM, Usha wrote:

> Thanks for the help.
> I would like to explain my problem.
> I have a sample of test scores which vary from 0 to 35.
> Now, I want to find the best-fit distribution for this data. I need to
> order the distributions based on how well they fit.
> For this I am using the function fitdistr(). [One of the references used:
> FITTING DISTRIBUTIONS WITH R by Vito Ricci.]
>
> Example:
>> library(MASS)  # provides fitdistr()
>> scores<-sample(0:35,500,replace=TRUE)
>> normalfit<-fitdistr(scores,"normal")
>> normalfit
>    mean          sd
> 16.8460000   10.1361869
> ( 0.4533041) ( 0.3205344)
>> normalfit$loglik
> [1] -1867.525
>> kstestnormal<-ks.test(scores,"pnorm",16.8460000, 10.1361869) # for the measure of goodness
>
>
> 1) Am I doing the right thing?

No. The most important thing you are not doing is describing  
your goals. Clearly you do _not_ want the best-fitting distribution,  
since that would be a multinomial distribution over the observed  
scores, with whatever probabilities exactly fit the sample (that is,  
the observed proportions).
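
To make the point about the multinomial concrete, here is a minimal sketch in base R. The seed, the variable names, and the choice of a binomial(35, p) model as the parametric competitor are illustrative assumptions, not part of the original question. It compares the log-likelihood of the "saturated" fit, where each probability is simply the observed proportion, with that of a one-parameter model on the same support.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
scores <- sample(0:35, 500, replace = TRUE)

# Saturated (categorical) fit: each probability is the observed proportion.
counts <- as.numeric(table(factor(scores, levels = 0:35)))
p_hat  <- counts / length(scores)
seen   <- counts > 0  # skip unobserved values to avoid 0 * log(0)
ll_saturated <- sum(counts[seen] * log(p_hat[seen]))

# Illustrative competitor: binomial(35, p), with p estimated by maximum likelihood.
p_binom  <- mean(scores) / 35
ll_binom <- sum(dbinom(scores, size = 35, prob = p_binom, log = TRUE))

c(saturated = ll_saturated, binomial = ll_binom)
```

The saturated log-likelihood can never be lower than that of any other distribution on 0:35, so ranking candidates by raw fit alone always favours it. That is why the goal of the analysis has to be stated before "best fit" means anything.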

> 2) If yes, can't I follow the same procedure for all the distributions
> supported by fitdistr? With the start values wherever necessary?

You can do anything you want. But have you considered the power of  
this method and the error rates? Is there no science behind this to  
guide what is so far an aimless search strategy?
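
One way to look at the error rates is by simulation. The sketch below is an illustration under assumed settings (sample size 100, 2000 replicates, a 0.05 threshold), not a description of what the original poster ran. It draws data that really are normal, estimates the mean and sd from the same sample as in the example above, and records how often ks.test() rejects.

```r
set.seed(42)   # arbitrary seed for reproducibility
n_sim <- 2000  # number of simulated data sets (assumed value)
n     <- 100   # observations per data set (assumed value)

p_values <- replicate(n_sim, {
  x <- rnorm(n)  # the null hypothesis is true by construction
  # Parameters are estimated from the same sample that is being tested.
  ks.test(x, "pnorm", mean(x), sd(x))$p.value
})

# Empirical type I error rate at a nominal 5% level.
rejection_rate <- mean(p_values < 0.05)
rejection_rate
```

Because the parameters are fitted to the data being tested, the rejection rate typically comes out well below the nominal 5%, so the p-values from this procedure do not mean what they appear to mean. Repeating the procedure over many candidate distributions compounds the problem.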

> 3) Do I have to consider/worry about the warnings that I get?

We cannot force you to heed the warnings.
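
The original post does not say which warnings appeared, so the following is only a guess at one likely source. Integer scores contain many tied values, and the Kolmogorov-Smirnov test assumes a continuous distribution. A minimal sketch (the handler and the name warnings_seen are illustrative) that collects whatever warnings ks.test() raises so they can be read rather than scrolled past:

```r
set.seed(1)  # arbitrary seed for reproducibility
scores <- sample(0:35, 500, replace = TRUE)

warnings_seen <- character(0)
result <- withCallingHandlers(
  ks.test(scores, "pnorm", mean(scores), sd(scores)),
  warning = function(w) {
    # Record the message, then continue without printing it.
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

warnings_seen                           # messages raised by ks.test(), if any
has_ties <- length(unique(scores)) < length(scores)
has_ties                                # TRUE: 500 draws from only 36 values
```

If a warning about ties shows up here, it is telling you that the test's assumptions do not hold for discrete scores, which is a reason to reconsider the method rather than to suppress the message.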


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT


