[BioC] Siggenes and interpreting SAM output
Ettinger, Nicholas
nicholas-ettinger at uiowa.edu
Wed Jan 18 19:04:38 CET 2006
Hello all!
I am having trouble interpreting whether my SAM outputs are valid enough
that I should take them seriously, or whether to ignore them and try
another method.
Here is a sample of some of my output (samples are two-class paired
samples; infected cells vs. non-infected cells with 3 different human
donors for the cells; normalized with GCRMA):
Gene set #1
          D-value     Q-value     R-fold
Gene 1    -179.854    0.410743    0.598795
Gene 2    -84.0229    0.417071    0.775385
Gene 3    -82.5916    0.417071    0.858212

Gene set #2
          D-value     Q-value     R-fold
Gene 4    86.7573     0.152039    1.00977
Gene 5    86.3523     0.152039    1.09908
Gene 6    -83.4252    0.152039    0.547529
How do I think about these results?
I have several questions:
(1) I am not clear why the D-values are so large in magnitude while the
R-fold values stay so close to 1. I realize that the D-values are
generated from the observed d(i) vs. the expected d(i), but I thought
they were at least loosely related to the fold change?
(2) For gene set #1, would the correct interpretation be that these
genes are changing by "large" amounts but that since the q-values are so
high (the FDR was around 0.4), they are not reliable?
(3) Similarly for gene set #2, would the correct interpretation be that
since the q-values are much lower (the FDR was about 0.2), one could be
more confident that these are real changes that are being picked up?
And if you were comfortable with a 20% chance that the genes you chose
were false positives, then you could move on to verify them?
(4) What is being accepted for publication in terms of FDRs and q-values
and such? It seems to me that that is really the defining answer. In
people's experience, how low do the FDRs and q-values have to be before
editors of journals will take the results seriously? Suggestions and/or
advice here would be most welcome.
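
To make questions (1) to (3) concrete, here is a rough sketch in Python (not siggenes itself). It assumes the usual SAM paired statistic, d = mean(differences) / (standard error + s0), and that R-fold corresponds to 2^(mean log2 difference); both are assumptions about what produced the numbers above. The function names and the toy log2 differences are made up for illustration and are not the data from this experiment.

```python
import math
import statistics


def paired_d(log2_diffs, s0=0.0):
    """SAM-style paired statistic: mean difference / (standard error + s0).

    s0 is the "fudge factor" added to the denominator; its value here is
    an assumption, not something taken from the output above.
    """
    n = len(log2_diffs)
    mean_diff = statistics.mean(log2_diffs)
    std_err = statistics.stdev(log2_diffs) / math.sqrt(n)
    return mean_diff / (std_err + s0)


def fold_change(log2_diffs):
    """Fold change implied by the mean log2 difference (assumed definition)."""
    return 2 ** statistics.mean(log2_diffs)


def expected_false_positives(n_called, fdr):
    """Expected number of false calls in a list of n_called genes at a given FDR."""
    return n_called * fdr


# Hypothetical infected-minus-uninfected log2 differences for 3 donors.
tight = [-0.74, -0.75, -0.73]   # donors agree almost exactly
noisy = [-0.20, -1.50, -0.52]   # same mean, donors disagree

for name, diffs in (("tight", tight), ("noisy", noisy)):
    print(name, "fold:", round(fold_change(diffs), 3),
          "d (s0=0):", round(paired_d(diffs), 2),
          "d (s0=0.1):", round(paired_d(diffs, s0=0.1), 2))

# With a hypothetical list of 50 called genes:
print(expected_false_positives(50, 0.4), "vs", expected_false_positives(50, 0.2))
```

Both toy genes have the same fold change, but the one where the three donors agree closely gets a d-value in the hundreds when s0 is near zero, because the denominator is tiny. So with only 3 pairs, a very large |d| can reflect a very small variance estimate rather than a large change. The last line just restates an FDR as an expected count of false calls in a list of a given size.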
Thank you!!
---Nick
University of Iowa