[R] efficient R code

Knut M. Wittkowski kmw at mail.rockefeller.edu
Wed Feb 9 16:29:41 CET 2005


Last Friday, Gregory Chaitin (http://www.umcs.maine.edu/~chaitin/lm.html) 
mentioned that there can be no proof that a given code is the shortest for 
a problem, even within a language. Still, the script below, a replacement 
of the "TDT", one of the most frequently used tests in genetics 
(http://mustat.rockefeller.edu under "downloads") may get close. It 
contains a few additional bytes for clarity, as in (2^1) for 2, but, 
otherwise, I don't think one could make this much shorter, especially the 
part that does the exact distribution.

I'm looking forward to comments on improving the programming efficiency for 
this problem. (The "return(...)" seems to be necessary in R only.)

Knut

#--------------------------------------------------------------------------
# asymp.SMN.pvalue(pP,qP,pX,qX,pQ,qQ)
# exact.SMN.pvalue(pP,qP,pX,qX,pQ,qQ)
#--------------------------------------------------------------------------
# pP,qP = number of PP,PQ children of PP~PQ parents
# pX,qX = number of PP,QQ children of PQ~PQ parents
# pQ,qQ = number of PQ,QQ children of PQ~QQ parents
#--------------------------------------------------------------------------

O3 <- function(X1,X2,X3,Op) matrix(outer(outer(X1,X2,Op),X3,Op))
b0 <- function(n) if (n==0) 1 else dbinom(0:n, n, .5)

Dst <- function(nP,   nX,   nQ   ) O3(b0(nP),(2^0)*b0(nX),b0(nQ),"*")
Est <- function(pP,qP,pX,qX,pQ,qQ) O3(pP-qP, (2^1)*(pX-qX),pQ-qQ,"+") # (1)
Var <- function(pP,qP,pX,qX,pQ,qQ) O3(pP+qP, (2^2)*(pX+qX),pQ+qQ,"+") # (2)

asymp.SMN.pvalue <- function(...)  1-pchisq( Est(...)^2/Var(...) ,1)  # (3)

exact.SMN.pvalue <- function(pP,qP,pX,qX,pQ,qQ)
{
         tb <- cbind(
                 Dst(nP<-pP+qP, nX<-pX+qX, nQ<-pQ+qQ),
                 Est(0:nP,nP:0, 0:nX,nX:0, 0:nQ,nQ:0)^2)

         return(1-sum(tb[tb[,2]<c(Est(pP,qP,pX,qX,pQ,qQ)^2),1]))
}
#--------------------------------------------------------------------------
# Note: The numbers in parentheses in inline comments refer to the equation
#       numbers in
#       Wittkowski KM, Liu X
#       A Statistically Valid Alternative to the TDT
#       Hum Hered 2002, 54:157-164; 2004, 58:60-61 (discussion)
#--------------------------------------------------------------------------


Knut M. Wittkowski, PhD,DSc
------------------------------------------
The Rockefeller University, GCRC
Experimental Design and Biostatistics
1230 York Ave #121B, Box 322, NY,NY 10021
+1(212)327-7175, +1(212)327-8450 (Fax)
kmw at rockefeller.edu
http://www.rucares.org/clinicalresearch/dept/biometry/




More information about the R-help mailing list