[R] strange behavior of lchoose in combinatorics problem

Stefan Evert stefanML at collocations.de
Sun Jun 26 14:35:15 CEST 2016


Why do you want to do this? Why not simply use Fisher's exact test?

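# 2 x 2 contingency table of sampling occasions:
# rows = animal 1 seen / not seen, columns = animal 2 seen / not seen
N <- 2178   # sampling occasions
N1 <- 165   # occasions with animal 1
N2 <- 331   # occasions with animal 2
J <- 97     # occasions with both animals
ct <- rbind(c(J, N1-J), c(N2-J, N-N1-N2+J))
fisher.test(ct)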
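
If the question is specifically whether the two animals are attracted to each other, a one-sided test may be the better fit. The sketch below assumes that "attraction" corresponds to an odds ratio greater than 1 and uses the alternative argument of fisher.test; the name ft is just a placeholder.

```r
N <- 2178; N1 <- 165; N2 <- 331; J <- 97
ct <- rbind(c(J, N1 - J), c(N2 - J, N - N1 - N2 + J))

# Sanity checks: the table margins reproduce the observed counts.
stopifnot(sum(ct) == N, sum(ct[1, ]) == N1, sum(ct[, 1]) == N2)

# One-sided test: is the odds ratio greater than 1 (positive association)?
ft <- fisher.test(ct, alternative = "greater")
ft$p.value    # probability of J or more joint sightings under independence
ft$estimate   # conditional maximum-likelihood estimate of the odds ratio
```

A small p-value here would speak against independent movement, in the direction of attraction.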

Background explanation:

 - Your formula computes the log hypergeometric probability of a contingency table like ct above, but with k in place of J.

 - It does so in an unnecessarily complicated way: three terms would be enough (cf. the equation at http://www.collocations.de/AM/section3.html; with C1=N1, C2=N-C1, R1=N2).

 - If you want to test for a positive association between the two animals, you should be adding up the probabilities for k >= J to obtain a p-value, rather than k <= J (what would this sum of probabilities tell you?).

 - lpvec doesn't contain probabilities, but log probabilities. What sense would there be in adding those up? The sum of the logs is the log of the product of the probabilities, not of their sum. In any case, you should obtain a negative value because all the individual logs are negative.
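
To make the points above concrete, here is a minimal sketch of the three-term log probability and of the upper-tail p-value. The function name lhyper and the variables p_upper, p_phyper and p_fisher are my own placeholders, not taken from your code; dhyper and phyper are the standard hypergeometric functions in R.

```r
N <- 2178; N1 <- 165; N2 <- 331; J <- 97

# Log hypergeometric probability of exactly k joint sightings (three terms).
lhyper <- function(k) lchoose(N1, k) + lchoose(N - N1, N2 - k) - lchoose(N, N2)

# Cross-check against R's built-in hypergeometric density.
k <- 0:min(N1, N2)
stopifnot(all.equal(lhyper(k), dhyper(k, N1, N - N1, N2, log = TRUE)))

# One-sided p-value: sum the probabilities (not the logs) for k >= J.
upper <- J:min(N1, N2)
p_upper <- sum(exp(lhyper(upper)))

# The same tail probability from phyper() and from fisher.test().
p_phyper <- phyper(J - 1, N1, N - N1, N2, lower.tail = FALSE)
ct <- rbind(c(J, N1 - J), c(N2 - J, N - N1 - N2 + J))
p_fisher <- fisher.test(ct, alternative = "greater")$p.value
```

The three p-values should agree up to floating-point error. Summing lhyper(k) directly, without exp(), gives a large negative number that is not a probability on any scale.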

Best,
Stefan



> On 25 Jun 2016, at 16:13, Gonçalo Ferraz <gferraz29 at gmail.com> wrote:
> 
> I am working on interactions between animals, studying whether animal 1 is attracted to animal 2 (or vice-versa). I looked for the two animals in N=2178 sampling occasions, finding animal 1 a total of N1=165 times, and animal 2 a total of N2=331 times. In J=97 occasions, I saw both animals at the same time. 
> 
> The more frequently I see the two animals in the same sampling occasion, the more I will believe that they are attracted to each other. Therefore, I want to calculate the probability of finding J<=97 when the two animals are moving around independently of each other. The higher this probability, the stronger the attraction.
> 
> Following Veech (Journal of Biogeography 2014, 41: 1029-1035) I compute the log probability of obtaining a number n of encounters between animals as ‘lpn’ in the function lveech:
> 


