[Rd] Infinite degrees of freedom for F-distribution
Gordon K Smyth
smyth at wehi.EDU.AU
Sat Apr 23 02:31:45 CEST 2005
On Fri, April 22, 2005 7:36 pm, Peter Dalgaard said:
> Gordon Smyth <smyth at wehi.edu.au> writes:
>
>> This is just a suggestion/wish that it would be nice for the
>> F-distribution functions to recognize limiting cases for infinite
>> degrees of freedom, as the t-distribution functions already do.
>>
>> The t-distribution functions recognize that df=Inf is equivalent to
>> the standard normal distribution:
>>
>> > pt(1,df=Inf)
>> [1] 0.8413447
>> > pnorm(1)
>> [1] 0.8413447
>>
>> On the other hand, pf() will accept Inf for df1, but returns the wrong result:
>>
>> > pf(1,df1=Inf,df2=1)
>> [1] 1
>>
>> whereas the correct limiting value is
>>
>> > pchisq(1,df=1,lower.tail=FALSE)
>> [1] 0.3173105
>>
>> pf() returns NaN when df2=Inf:
>>
>> > pf(1,df1=1,df2=Inf)
>> [1] NaN
>> Warning message:
>> NaNs produced in: pf(q, df1, df2, lower.tail, log.p)
>>
>> although the correct value is available as
>>
>> > pchisq(1,df=1)
>> [1] 0.6826895
>>
>>
>> Gordon
>>
>> > version
>>          _
>> platform i386-pc-mingw32
>> arch     i386
>> os       mingw32
>> system   i386, mingw32
>> status
>> major    2
>> minor    1.0
>> year     2005
>> month    04
>> day      18
>> language R
>
> This is actually a regression. It worked as you suggest in 2.0.1, at
> least on Linux. Also, somewhat disturbing,
>
> > pf(1,df1=1,df2=Inf)
> [1] NaN
> Warning message:
> NaNs produced in: pf(q, df1, df2, lower.tail, log.p)
> > pf(1,df1=1,df2=99999999)
> [1] 0.6826895
> > pf(1,df1=1,df2=999999999999)
> [1] 0.6826841
> > pf(1,df1=1,df2=99999999999999999999)
> [1] 0
>
> (notice that the middle case has actually begun to diverge from the
> limiting value)
>
> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
>
You're right, it worked correctly in R 2.0.1 on Windows as well. I guess the new behaviour is associated
with the new algorithm for pbeta() in R 2.1.0, which will affect a number of functions besides
pf(). The new algorithm is faster for large shape parameters but seems to lose accuracy as they grow.
Since the Beta(a,a) distribution is symmetric about 0.5, each of the following should return 0.5:
> a <- 100000000; pbeta(0.5,a,a)
[1] 0.5
> a <- 1000000000; pbeta(0.5,a,a)
[1] 0.4999999
> a <- 10000000000; pbeta(0.5,a,a)
[1] 0
> pbeta(0.5,Inf,Inf)
[1] 0
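In the meantime, the limiting cases for pf() discussed above could be handled by a small wrapper. What follows is only a sketch, not a proposed patch: pf_limit is a hypothetical name, it assumes scalar df1 and df2, and it simply delegates to pchisq() using the facts that F(df1, Inf) is chisq(df1)/df1 and F(Inf, df2) is df2/chisq(df2). When both degrees of freedom are infinite the distribution is taken to be a point mass at 1.

```r
# Hypothetical wrapper (not part of R): pf() with explicit handling of
# infinite degrees of freedom. Assumes df1 and df2 are scalars.
pf_limit <- function(q, df1, df2, lower.tail = TRUE) {
  q <- pmax(q, 0)  # the F distribution has no mass below zero
  if (is.infinite(df1) && is.infinite(df2)) {
    # Both df infinite: F degenerates to a point mass at 1
    p <- as.numeric(q >= 1)
    return(if (lower.tail) p else 1 - p)
  }
  if (is.infinite(df2)) {
    # df1 * F ~ chisq(df1), so P(F <= q) = P(chisq(df1) <= q * df1)
    return(pchisq(q * df1, df = df1, lower.tail = lower.tail))
  }
  if (is.infinite(df1)) {
    # df2 / F ~ chisq(df2), so P(F <= q) = P(chisq(df2) >= df2 / q)
    return(pchisq(df2 / q, df = df2, lower.tail = !lower.tail))
  }
  # Finite df: fall through to the built-in function
  pf(q, df1, df2, lower.tail = lower.tail)
}
```

With this, pf_limit(1, df1=Inf, df2=1) agrees with pchisq(1, df=1, lower.tail=FALSE), and pf_limit(1, df1=1, df2=Inf) agrees with pchisq(1, df=1), which are the two limiting values quoted in the original message.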
Gordon