[BioC] siggenes SAM FDR, # false discrepancies

Mon Jun 6 05:01:27 CEST 2005

I know there have been several messages comparing Excel SAM vs  
siggenes SAM, and several others asking about how the FDR is  
calculated in siggenes SAM, but none of these have answered a  
question I have about what to believe in the SAM output table.  My  
table from siggenes:

Delta	p0	False	Called	FDR
0.25	0.147	2023.5	4220	0.07
0.5	0.147	1211.5	2686	0.066
0.75	0.147	789	1617	0.072
1	0.147	390	866	0.066
1.25	0.147	197	496	0.058
1.5	0.147	120.5	337	0.053
1.75	0.147	69.5	228	0.045
2	0.147	38.5	150	0.038
2.25	0.147	22	94	0.034
2.5	0.147	13	65	0.029
2.75	0.147	7	30	0.034
3	0.147	4	16	0.037
3.25	0.147	1	7	0.021
3.5	0.147	0.5	4	0.018
3.75	0.147	0	0	0

The FDR here is Pi hat * (false/called), where Pi hat is the p0  
column.  I'm not sure how that is supposed to be read.  Which number  
of false positives am I supposed to believe?  The number obtained by  
multiplying the FDR by the # called?  (This makes sense to me; for  
example, at Delta = 0.25 the unrounded FDR is about 0.0705, and  
0.0705*4220 is about 297.5 false.)  Or the # false as reported in the  
false column?  (This doesn't make sense to me: what's the point of  
the FDR as calculated if it doesn't agree with the # false and the #  
called?)
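
To make the arithmetic concrete, here is a minimal Python sketch that recomputes the FDR column from a few rows of the table, assuming the formula really is Pi hat * (false/called).  The function name fdr_from_row is mine and not part of siggenes; the numbers are copied from the table above.

```python
# Rows copied from the table: (Delta, p0, False, Called, reported FDR).
rows = [
    (0.25, 0.147, 2023.5, 4220, 0.07),
    (1.0, 0.147, 390.0, 866, 0.066),
    (2.0, 0.147, 38.5, 150, 0.038),
    (3.5, 0.147, 0.5, 4, 0.018),
]


def fdr_from_row(p0, n_false, n_called):
    """Assumed formula: FDR = p0 * False / Called (0 when nothing is called)."""
    return p0 * n_false / n_called if n_called else 0.0


for delta, p0, n_false, n_called, reported in rows:
    fdr = fdr_from_row(p0, n_false, n_called)
    # FDR * Called gives back p0 * False, not the raw False column.
    implied_false = fdr * n_called
    print(
        f"Delta={delta}: FDR={fdr:.3f} (reported {reported}), "
        f"False column={n_false}, FDR*Called={implied_false:.2f}"
    )
```

If the assumption holds, the recomputed FDR should match the reported column to rounding, while FDR * Called should differ from the false column by exactly the factor p0.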

Excel SAM seems to circumvent this problem by multiplying the number  
false by Pi hat (reporting only that product, not the number false  
before multiplication) and then calculating the FDR as false/called.  
That FDR implicitly has Pi hat in it (or so it seems to me).  This  
way, # false, # called, and the reported FDR are all consistent with  
one another, unlike in siggenes SAM.
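
As a sketch of the two reporting conventions as I understand them (based only on the output I see, not on either program's source code), the Python below uses the Delta = 0.25 row.  The function names siggenes_style and excel_style are hypothetical labels for illustration.

```python
P0 = 0.147        # Pi hat from the table
N_FALSE = 2023.5  # false column at Delta = 0.25
N_CALLED = 4220   # called column at Delta = 0.25


def siggenes_style(p0, n_false, n_called):
    """Report the raw false count; apply p0 only when forming the FDR."""
    return n_false, p0 * n_false / n_called


def excel_style(p0, n_false, n_called):
    """Report the false count already scaled by p0; FDR is then false/called."""
    scaled_false = p0 * n_false
    return scaled_false, scaled_false / n_called


raw_false, fdr_a = siggenes_style(P0, N_FALSE, N_CALLED)
scaled_false, fdr_b = excel_style(P0, N_FALSE, N_CALLED)

print(f"raw false reported:    {raw_false}, FDR = {fdr_a:.4f}")
print(f"scaled false reported: {scaled_false:.2f}, FDR = {fdr_b:.4f}")
```

Under these assumptions the two FDR values are identical, and the only difference is whether the displayed false count has already been multiplied by Pi hat.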

Thanks in advance for any help.

--Jake
