[BioC] multtest MTP raw P values = 0
Timur Shtatland
tshtatland at MGH.HARVARD.EDU
Fri Feb 29 19:37:28 CET 2008
Dear all,
I used MTP from the multtest package to compute bootstrap based raw
and adjusted t-test P values (group 1 = 7 samples, group 2 = 3
samples). Why are some raw P values in the output exactly zero?
I could not change this by using scientific notation, with varying
number of significant digits. In the example below, 10 digits are
used; increasing this does not make a difference. I expected all P
values > 0, with the minimum P value determined by the number of
bootstrap iterations, here 1000. Hence the raw P value minimum should
be 1e-03, as indeed the lowest *positive* P values are.
Another question is: why some raw P values that are *equal* correspond
to adjusted P values that are *different*?
For example, raw P = 1e-03 corresponds to adjusted P = 8.4e-02,
8.6e-02, etc.
I could not find the answer in the MTP help page, its vignettes, FAQ,
etc. A similar, but not identical, problem was reported on this
mailing list before, without a solution:
https://stat.ethz.ch/pipermail/bioconductor/2007-December/020492.html
Thank you for your help.
Best regards,
Timur
--
Timur Shtatland, PhD
Center for Molecular Imaging Research
Massachusetts General Hospital
149 13th Street, Room 5408
Charlestown, MA 02129
tshtatland at mgh dot harvard dot edu
############################################################
> B <- 1000
> seed <- 99
> TTBoot <- MTP(X=esetGcrmaExprsFiltered, Y=TT, test =
"t.twosamp.unequalvar", alternative = "two.sided", typeone="fdr",
method="ss.maxT", fdr.method="conservative", B=B, seed=seed)
running bootstrap...
iteration = 100 200 300 400 500 600 700 800 900 1000
> print(TTBoot)
Multiple Testing Procedure
Object of class: MTP
sample size = 10
number of hypotheses = 5413
test statistics = t.twosamp.unequalvar
type I error rate = fdr
nominal level alpha = 0.05
multiple testing procedure = ss.maxT
Call: MTP(X = esetGcrmaExprsFiltered, Y = TT, test =
"t.twosamp.unequalvar",
alternative = "two.sided", typeone = "fdr", fdr.method =
"conservative",
B = B, method = "ss.maxT", seed = seed)
Slots:
Class Mode Length Dimension
statistic numeric numeric 5413
estimate numeric numeric 5413
sampsize integer numeric 1
rawp numeric numeric 5413
adjp numeric numeric 5413
conf.reg array logical 0 0,0,0
cutoff matrix logical 0 0,0
reject matrix logical 5413 5413,1
nulldist matrix numeric 5413000 5413,1000
call call call 10
seed integer numeric 1
...
> MTPRes <- data.frame(PRaw=TTBoot at rawp, PAdj=TTBoot at adjp)
> print(format(subset(MTPRes[order(MTPRes$PRaw, MTPRes$PAdj), ],
PAdj < 0.1), digits=10, scientific=TRUE))
PRaw PAdj
202508_s_at 0e+00 0.000000000e+00
203130_s_at 0e+00 0.000000000e+00
206780_at 0e+00 0.000000000e+00
218820_at 0e+00 0.000000000e+00
205825_at 0e+00 8.000000000e-03
203000_at 0e+00 1.400000000e-02
204465_s_at 0e+00 1.400000000e-02
206282_at 0e+00 1.400000000e-02
206502_s_at 0e+00 1.400000000e-02
208399_s_at 0e+00 1.400000000e-02
209598_at 0e+00 1.400000000e-02
209755_at 0e+00 1.400000000e-02
212805_at 0e+00 1.400000000e-02
203001_s_at 0e+00 2.400000000e-02
204811_s_at 0e+00 2.400000000e-02
205174_s_at 0e+00 2.400000000e-02
205646_s_at 0e+00 2.400000000e-02
206051_at 0e+00 2.400000000e-02
212624_s_at 0e+00 2.800000000e-02
204035_at 0e+00 3.000000000e-02
221261_x_at 0e+00 3.200000000e-02
206915_at 0e+00 3.400000000e-02
206104_at 0e+00 4.000000000e-02
210036_s_at 0e+00 5.000000000e-02
218380_at 0e+00 5.000000000e-02
213135_at 0e+00 6.200000000e-02
204870_s_at 0e+00 6.400000000e-02
218952_at 0e+00 6.400000000e-02
205120_s_at 0e+00 6.600000000e-02
211597_s_at 0e+00 6.666666667e-02
205348_s_at 0e+00 6.800000000e-02
212190_at 0e+00 7.000000000e-02
213186_at 0e+00 7.000000000e-02
203485_at 0e+00 7.200000000e-02
203572_s_at 0e+00 8.400000000e-02
209583_s_at 0e+00 8.400000000e-02
212895_s_at 0e+00 8.400000000e-02
206001_at 0e+00 8.600000000e-02
206135_at 0e+00 8.600000000e-02
212843_at 0e+00 8.600000000e-02
204059_s_at 0e+00 8.800000000e-02
210826_x_at 0e+00 9.000000000e-02
213438_at 0e+00 9.000000000e-02
218675_at 0e+00 9.000000000e-02
201116_s_at 0e+00 9.200000000e-02
203272_s_at 0e+00 9.600000000e-02
204793_at 0e+00 9.800000000e-02
204720_s_at 1e-03 8.400000000e-02
221207_s_at 1e-03 8.400000000e-02
204540_at 1e-03 8.600000000e-02
209992_at 1e-03 8.888888889e-02
210246_s_at 1e-03 9.000000000e-02
> print(dim(esetGcrmaFiltered))
Features Samples
5413 10
> sessionInfo()
R version 2.6.0 (2007-10-03)
powerpc-apple-darwin8.10.1
locale:
en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] splines tools stats graphics grDevices utils
datasets methods base
other attached packages:
[1] multtest_1.18.0 affydata_1.11.3 affyQCReport_1.16.0
geneplotter_1.16.0 lattice_0.16-5
[6] annotate_1.16.1 AnnotationDbi_1.0.6 RSQLite_0.6-4
DBI_0.2-4 RColorBrewer_1.0-2
[11] affyPLM_1.14.0 xtable_1.5-2 simpleaffy_2.14.05
gcrma_2.10.0 matchprobes_1.10.0
[16] genefilter_1.16.0 survival_2.32 affy_1.16.0
preprocessCore_1.0.0 affyio_1.6.1
[21] Biobase_1.16.1
loaded via a namespace (and not attached):
[1] KernSmooth_2.22-21 grid_2.6.0
> version
_
platform powerpc-apple-darwin8.10.1
arch powerpc
os darwin8.10.1
system powerpc, darwin8.10.1
status
major 2
minor 6.0
year 2007
month 10
day 03
svn rev 43063
language R
version.string R version 2.6.0 (2007-10-03)
# platform: PowerPC G4, Mac OS 10.4.11.
The information transmitted in this electronic communica...{{dropped:10}}
More information about the Bioconductor
mailing list