[R-sig-eco] testing for distribution

Peter Solymos solymos at ualberta.ca
Wed May 13 20:37:36 CEST 2009


Dear Jacob,

Erika was right, you just have to perform a goodness of fit test. But
it is easier to inspect your residual deviance. If the fit is good, it
approximately follows a Chi-squared distribution, so its value should
be close to the residual degrees of freedom. To get a P value for an
object of class "negbin" (inheriting from glm and lm), use (note,
H0: the fit is good):

library(MASS)
mod <- glm.nb(...your model...)
pchisq(mod$deviance, mod$df.residual, lower.tail = FALSE)

If you are using other functions (e.g. in package pscl), the structure
of the returned object might differ; in that case simply type the
numbers (deviance and residual degrees of freedom) in by hand.
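
A minimal sketch of the check above, assuming simulated counts in place
of the real mussel samples. The names y, p_obj and p_num and the values
size = 0.8, mu = 3 are made up for illustration, and the model is
intercept-only because there are counts on a single variable only:

```r
library(MASS)

set.seed(1)
# Simulated overdispersed counts standing in for the real samples (assumption)
y <- rnbinom(200, size = 0.8, mu = 3)

# Intercept-only negative binomial model
mod <- glm.nb(y ~ 1)

# Residual deviance against residual degrees of freedom; H0: the fit is good
p_obj <- pchisq(mod$deviance, mod$df.residual, lower.tail = FALSE)

# The same calculation with the two numbers pulled out first, as you would
# do by hand for a fitted object that stores them somewhere else
dev <- deviance(mod)
df  <- df.residual(mod)
p_num <- pchisq(dev, df, lower.tail = FALSE)

print(c(deviance = dev, df = df, p = p_obj))
```

A small P value here would suggest lack of fit; a large one only means
the data give no evidence against the model.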

Cheers,

Péter

Péter Sólymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
email <- paste("solymos", "ualberta.ca", sep = "@")



On Wed, May 13, 2009 at 12:17 PM, Erika Mudrak <mudrak at wisc.edu> wrote:
> Jacob - You can use a Chi-squared goodness of fit test, chisq.test(), for discrete distributions like the negative binomial, and a Kolmogorov-Smirnov test, ks.test(), for continuous distributions. Both produce a p-value that tests the null hypothesis that your data come from the given distribution with the stated parameters. Use the parameter estimates from your fitdistr() results. So if p>0.05 (or 0.1 or whatever), you cannot reject the hypothesis that your data come from that distribution.
>
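> For discrete distributions, try something like:
> fit=fitdistr(.....)
> vals=0:max(ActualData)
> obs=table(factor(ActualData, levels=vals))  # observed count of each value
> p=dnbinom(vals, size=fit$estimate["size"], mu=fit$estimate["mu"])
> chisq.test(x=obs, p=p, rescale.p=TRUE)  # rescale.p because the tail beyond max is cut off
> #I think this is right, I haven't actually tried it...
> # This is akin to quantitatively comparing your histograms...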
>
>
> For continuous distributions (such as beta), the code would be this:
> fit=fitdistr(...)
> ks.test(ActualData, "pbeta", shape1=fit$estimate[1],shape2=fit$estimate[2])
> # I've done this successfully
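
A self-contained sketch of the continuous case, assuming simulated beta
data. The name x, the start values and the lower bounds are assumptions
chosen for illustration (fitdistr() needs start values for the beta
distribution). Note that the p-value is only approximate when the
parameters are estimated from the same data:

```r
library(MASS)

set.seed(1)
# Simulated proportions standing in for real continuous data (assumption)
x <- rbeta(100, shape1 = 2, shape2 = 5)

# Fit a beta distribution; the lower bounds keep both shapes positive
fit <- fitdistr(x, "beta",
                start = list(shape1 = 1, shape2 = 1),
                lower = c(0.01, 0.01))

# Kolmogorov-Smirnov test against the fitted beta distribution
ks <- ks.test(x, "pbeta",
              shape1 = fit$estimate["shape1"],
              shape2 = fit$estimate["shape2"])
print(ks)
```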
>
> You can use AIC to compare whether another distribution fits your data better than the negative binomial does. I think it's possible for your data to "pass" the Chi-squared/Kolmogorov-Smirnov test for two different distributions, but one will fit better than the other.
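
As a rough sketch of the AIC comparison, assuming simulated counts and
the Poisson as the competing distribution (both are assumptions for
illustration, as are the names y, fit_nb and fit_pois):

```r
library(MASS)

set.seed(1)
# Simulated overdispersed counts standing in for the real samples (assumption)
y <- rnbinom(200, size = 0.8, mu = 3)

# Fit both candidate distributions by maximum likelihood
fit_nb   <- fitdistr(y, "negative binomial")
fit_pois <- fitdistr(y, "Poisson")

# Lower AIC indicates the better trade-off between fit and complexity
aic <- c(negbin = AIC(fit_nb), poisson = AIC(fit_pois))
print(aic)
```

AIC only ranks the candidates against each other; it does not say
whether the best one fits well in absolute terms.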
>
> Erika Mudrak
>
>
> -------------------------------------------
> Erika Mudrak
> Graduate Student
> Department of Botany
> University of Wisconsin-Madison
> 430 Lincoln Dr
> Madison WI, 53706
> 608-265-2191
> mudrak at wisc.edu
>
> ----- Original Message -----
> From: "Capelle, Jacob" <Jacob.Capelle at wur.nl>
> Date: Tuesday, May 12, 2009 11:00 am
> Subject: [R-sig-eco]  testing for distribution
> To: r-sig-ecology at r-project.org
>
>
>> Dear all,
>>
>> I have a somewhat theoretical question that I hope will interest
>> you, and with which you can hopefully help me a bit.
>>
>> In order to obtain ecological (survey) data, I try to make a
>> prediction about the accuracy of a sampling tool to estimate mussel
>> density. For this reason I took a lot of samples at a certain fixed
>> location and counted the number of mussels in each sample. Because
>> mussels are aggregated on the sediment, I had a lot of zero values. To
>> estimate the sample size I used a negative binomial distribution and
>> obtained the k value and the mu from the
>> fitdistr(x,"negative binomial") (MASS).
>>
>> The question I have is: how can I test if this distribution accurately
>> described my (zero inflated count) data?
>>
>> I am a bit familiar with the AIC but since I only have counts on one
>> variable I cannot perform a GLS.
>> Creating a vector with rnbinom() using the k and mu from the
>> fitdistr() I plotted a histogram and compared it with my data, this
>> showed that it was roughly comparable, but I want to quantify this.
>>
>> I have a biological background, not a statistical one, so I realize I
>> may be asking silly questions.
>> But I hope someone can give me some hints.
>>
>> Kind regards,
>>
>> Jacob Capelle
>>
>> PhD student
>> Wageningen Imares
>> The Netherlands
>> jacob.capelle at wur.nl
>>
>> _______________________________________________
>> R-sig-ecology mailing list
>> R-sig-ecology at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>