[BioC] understanding ACME

Sean Davis sdavis2 at mail.nih.gov
Mon Feb 4 20:44:33 CET 2008


On Feb 4, 2008 1:50 PM, Ramon Diaz-Uriarte <rdiaz at cnio.es> wrote:
> hits=-2.6 testsºYES_00
> X-USF-Spam-Flag: NO
>
> Dear All,
>
> I am trying to understand how the ACME package (by Sean Davis) works, but I
> think there is something I am missing about the way the p-values are
> computed. It seems when I try to do the chi-square test myself, I always
> overestimate the p-value.
>
>
> The following is a complete example:
>
> silly.dat <- c(2, 1, 5, 3, 6, 4)
> dummy.data <- new("aGFF", data = matrix(silly.dat, ncol = 1),
>                   annotation = data.frame(Chromosome = 1,
>                                Location = c(1, 10, 20, 1000, 1200, 1300)),
>                   samples = data.frame(SampleID = 1))
>
> ### So we have
> cbind(silly.dat, Position = c(1, 10, 20, 1000, 1200, 1300))
>      silly.dat Position
> [1,]         2        1
> [2,]         1       10
> [3,]         5       20
> [4,]         3     1000
> [5,]         6     1200
> [6,]         4     1300
>
>
> do.aGFF.calc(dummy.data, window = 110, thresh = 0.90)
> ### Cutpoints:
> ### [1] 5.5
> ### Thus, all values except 6 (the fifth value) are below the threshold
>
>
> ## Within the window for fifth value we have:
> ## only value of 6
> do.aGFF.calc(dummy.data, window = 10, thresh = 0.90)@vals[5] ## 0.0877
> chisq.test(x = c(0, 1), p = as.vector(table(silly.dat > 5.5)/6)) ## 0.02535
>
> ## value of 6 and 4
> do.aGFF.calc(dummy.data, window = 202, thresh = 0.90)@vals[5] ## 0.3458
> chisq.test(x = c(1, 1), p = as.vector(table(silly.dat > 5.5)/6)) ## 0.2059
>
> ## values 6, 4, 3
> do.aGFF.calc(dummy.data, window = 402, thresh = 0.90)@vals[5] ## 0.571
> chisq.test(x = c(2, 1), p = as.vector(table(silly.dat > 5.5)/6)) ## 0.4386
>
>
> Where am I computing the chisq in the wrong way?

Hi, Ramon.  ACME is simply calculating the chi-square on a 2x2 table
where the cells have the values defined like so:

a=number of probes on the array above the threshold
b=number of probes total on the array NOT above the threshold
c=number of probes in the window above the threshold
d=number of probes in the window NOT above the threshold

So, building on your example above in which a=1,b=5,c=1, and d varies as below:

## d=0
chisq.test(x=matrix(c(1,5,1,0),nc=2),correct=FALSE) ## 0.08767

## d=1
chisq.test(x=matrix(c(1,5,1,1),nc=2),correct=FALSE) ## 0.3458

## d=2
chisq.test(x=matrix(c(1,5,1,2),nc=2),correct=FALSE) ## 0.5708

Does this explanation help?

Sean


More information about the Bioconductor mailing list