[R] How to do poisson distribution test like this?

(Ted Harding) Ted.Harding at manchester.ac.uk
Tue Jul 28 12:42:11 CEST 2009


On 28-Jul-09 10:03:41, Mao Jianfeng wrote:
> Dear R-listers,
> I want to reproduce a Poisson distribution test that was presented
> in a recently published biological research paper (Plant Physiology
> 2008, vol 148, pp. 1189-1200). That test is about the occurrence
> number of a kind of gene in separate chromosomes.
> 
> For instance:
> 
> The observed gene number in chromosome A is 36.
> The expected gene number in chromosome A is 30.
> 
> Then, the authors got a probability 0.137 by distribution test on this
> trial. In this test, a Poisson distribution was used to determine the
> significance of the gene distribution.
> 
> Questions:
> How can I reperform this test in R?
> 
> Thank you in advance.
> Mao Jian-Feng
> Institute of Botany,
> CAS, China

Since it is not clear what test procedure they used, I have done
a couple of numerical experiments in R:

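1. Compare with the upper-tail probability of the Poisson distribution
   with mean mu = 30. Note that ppois(36,30) is P(X <= 36), so this
   gives the probability that more than 36 are observed:

   1-ppois(36,30)
   # [1] 0.1196266

   Not quite the 0.137 that they got. (For "at least 36", i.e.
   P(X >= 36), use 1-ppois(35,30) instead; that is not 0.137 either.)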

2. A similar comparison, using a Normal approximation to the Poisson
   (mean mu = 30, SD = sqrt(mu)):

   1 - pnorm(6/sqrt(30))
   # [1] 0.1366608

   which, after rounding, is exactly the 0.137 that they got.

So it seems they have used an upper-tail test based on the Normal
approximation to the Poisson distribution.
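
The two experiments above can be put side by side in a short sketch.
The variable names (mu, obs, p_gt, p_ge, p_norm) are only illustrative,
and the use of lower.tail = FALSE instead of "1 - ..." is my choice
here, not something taken from the paper:

```r
mu  <- 30   # expected count (from the question)
obs <- 36   # observed count (from the question)

# Exact Poisson upper tails. ppois(q, mu) is P(X <= q), so the
# upper tail starting at q needs care with the boundary value.
p_gt <- ppois(obs,     mu, lower.tail = FALSE)  # P(X >  36), same as 1 - ppois(36, 30)
p_ge <- ppois(obs - 1, mu, lower.tail = FALSE)  # P(X >= 36), i.e. "at least 36"

# Normal approximation with mean mu and SD sqrt(mu),
# without a continuity correction (as in Method 2).
p_norm <- pnorm((obs - mu) / sqrt(mu), lower.tail = FALSE)

print(round(c(p_gt = p_gt, p_ge = p_ge, p_norm = p_norm), 4))
```

Only p_norm rounds to the published 0.137; the two exact tails differ
from each other by dpois(36, 30), the probability of exactly 36.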

Method 1 (using the exact Poisson distribution) is preferable, since
it involves no approximation (given the assumption of a Poisson
distribution). So that would, in principle, be the best way to do it
in R (as illustrated).
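
If a packaged test is wanted rather than a direct call to ppois(), the
exact one-sample test is also available as poisson.test() in the stats
package. A minimal sketch, assuming a one-sided alternative and taking
the expected count 30 as the hypothesised rate with T = 1 (the names
res and p_exact are only illustrative); note that its "greater"
p-value counts the observed value itself, i.e. it is P(X >= 36):

```r
# Exact one-sample Poisson test: 36 events observed, rate 30 expected
# under the null hypothesis, over a single unit of exposure (T = 1).
res <- poisson.test(x = 36, T = 1, r = 30, alternative = "greater")

# One-sided p-value; equals ppois(35, 30, lower.tail = FALSE)
p_exact <- res$p.value
print(p_exact)
```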

Possibly their adoption of Method 2 is based on a naive acceptance
of the "rule-of-thumb" from some textbook; or maybe their available
software does not offer ready access to the exact Poisson distribution
(which wouldn't happen if they used R -- see Method 1). As stated,
it is inaccurate compared with Method 1, so is not to be preferred.

However, if you need to reproduce their method (regardless of merit),
then use Method 2.

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 28-Jul-09                                       Time: 11:42:08
------------------------------ XFMail ------------------------------



