[Rd] Infinite degrees of freedom for F-distribution

Sat Apr 23 02:31:45 CEST 2005

On Fri, April 22, 2005 7:36 pm, Peter Dalgaard said:
> Gordon Smyth <smyth at wehi.edu.au> writes:
>
>> This is just a suggestion/wish that it would be nice for the
>> F-distribution functions to recognize limiting cases for infinite
>> degrees of freedom, as the t-distribution functions already do.
>>
>> The t-distribution functions recognize that df=Inf is equivalent to
>> the standard normal distribution:
>>
>>  > pt(1,df=Inf)
>> [1] 0.8413447
>>  > pnorm(1)
>> [1] 0.8413447
>>
>> On the other hand, pf() will accept Inf for df1, but returns the wrong result:
>>
>>  > pf(1,df1=Inf,df2=1)
>> [1] 1
>>
>> whereas the correct limiting value is
>>
>>  > pchisq(1,df=1,lower.tail=FALSE)
>> [1] 0.3173105
>>
>> pf() returns NaN when df2=Inf:
>>
>>  > pf(1,df1=1,df2=Inf)
>> [1] NaN
>> Warning message:
>> NaNs produced in: pf(q, df1, df2, lower.tail, log.p)
>>
>> although the correct value is available as
>>
>>  > pchisq(1,df=1)
>> [1] 0.6826895
>>
>>
>> Gordon
>>
>>  > version
>>           _
>> platform i386-pc-mingw32
>> arch     i386
>> os       mingw32
>> system   i386, mingw32
>> status
>> major    2
>> minor    1.0
>> year     2005
>> month    04
>> day      18
>> language R
>
> This is actually a regression. It worked as you suggest in 2.0.1, at
> least on Linux. Also, somewhat disturbing,
>
>> pf(1,df1=1,df2=Inf)
> [1] NaN
> Warning message:
> NaNs produced in: pf(q, df1, df2, lower.tail, log.p)
>> pf(1,df1=1,df2=99999999)
> [1] 0.6826895
>> pf(1,df1=1,df2=999999999999)
> [1] 0.6826841
>> pf(1,df1=1,df2=99999999999999999999)
> [1] 0
>
> (notice that the middle case has actually begun to diverge from the
> limiting value)
>
> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
>

You're right, it worked correctly in R 2.0.1 on Windows as well.  I guess the new behaviour is
associated with the new algorithm for pbeta() in R 2.1.0, which will affect a number of functions
besides pf().  The new algorithm is faster for large shape parameters but seems to lose accuracy as
they grow.  By symmetry, pbeta(0.5,a,a) should be exactly 0.5 for every finite a, yet:

> a <- 100000000; pbeta(0.5,a,a)
[1] 0.5
> a <- 1000000000; pbeta(0.5,a,a)
[1] 0.4999999
> a <- 10000000000; pbeta(0.5,a,a)
[1] 0
> pbeta(0.5,Inf,Inf)
[1] 0
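
In the meantime, the limiting cases can be handled by hand.  What follows is only a sketch of a
workaround, not a proposed patch: pf_lim() is a hypothetical name of my own, it assumes scalar
arguments, and for df1=df2=Inf it assumes the convention of a point mass at 1.  It relies on
df1*F tending to chi-squared on df1 degrees of freedom as df2 goes to Inf, and on df2/F tending
to chi-squared on df2 degrees of freedom as df1 goes to Inf.

```r
# Hypothetical wrapper (not part of R): pf() with explicit limits for
# infinite degrees of freedom. Scalar arguments are assumed.
pf_lim <- function(q, df1, df2, lower.tail = TRUE) {
  if (df1 == Inf && df2 == Inf) {
    # Assumed convention: the distribution degenerates to a point mass at 1.
    p <- as.numeric(q >= 1)
    return(if (lower.tail) p else 1 - p)
  }
  if (df2 == Inf) {
    # df1 * F converges to chi-squared with df1 degrees of freedom.
    return(pchisq(q * df1, df = df1, lower.tail = lower.tail))
  }
  if (df1 == Inf) {
    # df2 / F converges to chi-squared with df2 degrees of freedom;
    # taking the reciprocal swaps the tails.
    return(pchisq(df2 / q, df = df2, lower.tail = !lower.tail))
  }
  # Finite degrees of freedom: defer to the built-in function.
  pf(q, df1, df2, lower.tail = lower.tail)
}
```

With this, pf_lim(1,df1=Inf,df2=1) reduces to pchisq(1,df=1,lower.tail=FALSE) and
pf_lim(1,df1=1,df2=Inf) reduces to pchisq(1,df=1), which are the two limiting values quoted above.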

Gordon