[R] p-values < 2.2e-16 not reported
Will Eagle
will.eagle at gmx.net
Thu May 20 00:31:26 CEST 2010
Dear all,
thanks for your feedback so far. With the help of a colleague I think I
found the solution to my problem:
> pt(10,100,lower=FALSE)
[1] 4.950844e-17
IS *NOT* EQUAL TO
> 1-pt(10,100,lower=TRUE)
[1] 0
This means that R is capable of providing p-values < 2.2e-16. However,
if the p-value is obtained by subtracting the lower-tail probability
from 1, the result is limited by the machine epsilon
.Machine$double.eps = 2.220446e-16: a lower-tail probability this close
to 1 is rounded to exactly 1 in double precision, so p-values far below
this threshold come out as zero. This problem also applies to other
distribution functions such as pnorm(). For your information I would
also like to quote the relevant part of the R manual on
.Machine$double.eps:
"the smallest positive floating-point number x such that 1 + x != 1. It
equals base^ulp.digits if either base is 2 or rounding is 0; otherwise,
it is (base^ulp.digits)/ 2. Normally 2.220446e-16."
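To make the rounding effect concrete, here is a minimal sketch that
compares the two ways of computing an upper-tail probability. The
variable names are only illustrative, and lower.tail is the full name of
the argument abbreviated as "lower" above.

```r
eps <- .Machine$double.eps

# Adding eps to 1 is detectable; adding a quarter of it is not.
(1 + eps) != 1        # TRUE
(1 + eps / 4) != 1    # FALSE

# Upper tail computed directly: the small value is kept.
p_direct <- pt(10, df = 100, lower.tail = FALSE)

# Lower tail first: the value is so close to 1 that it is stored as 1,
# so the subtraction returns exactly 0.
cdf <- pt(10, df = 100)
p_subtracted <- 1 - cdf

# The same pattern with the normal distribution.
p_norm_direct <- pnorm(10, lower.tail = FALSE)
p_norm_subtracted <- 1 - pnorm(10)

print(c(p_direct, p_subtracted, p_norm_direct, p_norm_subtracted))
```

The information is lost when the lower-tail probability is stored, not
in the subtraction itself, so it cannot be recovered afterwards.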
Although different opinions were expressed on whether it makes sense to
differentiate p-values below the machine epsilon, in my opinion
different effect sizes should correspond to different p-values when
reporting statistical results. Additionally, in certain scientific
fields, e.g. genetics, where many tests are usually performed and simple
methods such as the Bonferroni correction are used to adjust for
multiple testing, it is important to know the exact size of the p-value.
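As a rough illustration of the multiple-testing point, the sketch below
applies a hand-written Bonferroni correction to two t statistics. The
number of tests (one million) and the statistics themselves are made-up
values chosen only for the example.

```r
m <- 1e6                 # hypothetical number of tests
t_stats <- c(10, 12)     # hypothetical t statistics, df = 100

# Upper-tail p-values computed directly and by subtraction.
p_upper <- pt(t_stats, df = 100, lower.tail = FALSE)
p_sub   <- 1 - pt(t_stats, df = 100)

# Bonferroni adjustment: multiply by the number of tests, cap at 1.
adj_upper <- pmin(1, p_upper * m)
adj_sub   <- pmin(1, p_sub * m)

print(adj_upper)   # two distinct, non-zero adjusted p-values
print(adj_sub)     # both zero, so the two results cannot be ranked
```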
Therefore, I would like to suggest that the 2nd variant
(i.e. 1-pt(10,100,lower=TRUE)) should no longer be used to calculate
p-values and that the 1st variant (i.e. pt(10,100,lower=FALSE)) should
be used instead. Since I have seen the 2nd variant used frequently
(also by very experienced R users) and I assume that it is hidden in
many statistical test functions, e.g. cor.test(), this issue seems
quite important to me.
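A minimal sketch of the recommended pattern for a two-sided t-test
p-value follows. The helper two_sided_p() is a hypothetical name used
only for illustration; it is not part of R and not how any particular
test function is implemented. The log.p argument of the distribution
functions is shown as one further option for extremely small tails.

```r
# Hypothetical helper: two-sided p-value from a t statistic, computed
# from the upper tail directly instead of via 1 - pt(...).
two_sided_p <- function(t_stat, df) {
  2 * pt(abs(t_stat), df = df, lower.tail = FALSE)
}

p_pos <- two_sided_p(10, 100)
p_neg <- two_sided_p(-10, 100)   # same value by symmetry

# For even smaller tails the probability can be requested on the log
# scale, which stays finite where the plain value would underflow.
log_p <- pnorm(40, lower.tail = FALSE, log.p = TRUE)

print(c(p_pos, p_neg, log_p))
```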
Best regards,
Will