[Rd] p.adjust(<NA>s), was 'Re: [BioC] limma and p-values'

Mon Jan 17 23:28:26 CET 2005

On Tue, January 18, 2005 8:02 am, Martin Maechler said:
>>>>>> "GS" == Gordon Smyth <smyth at wehi.edu.au>
>>>>>>     on Sun, 16 Jan 2005 19:55:35 +1100 writes:
>     GS> 3. Upper case values for method "BH" or "YH" are also
>     GS> accepted.
>
> I don't see why we'd want this.  The S language is
> case-sensitive and we don't want to lead people to believe
> that case wouldn't matter.

Well, people like to capitalize people's names, especially initials like BH and YH.  I'm happy
with whatever you think is appropriate.
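
For concreteness, here is a minimal sketch of what case-sensitive matching means for the method argument. It assumes a version of R in which p.adjust() accepts method = "BH" and resolves the argument by exact or partial, case-sensitive matching; the p-values are made up for illustration.

```r
# Made-up p-values, for illustration only.
p <- c(0.01, 0.02, 0.03)

# Upper-case "BH" is the spelling assumed to be accepted.
upper <- p.adjust(p, method = "BH")

# Lower-case "bh" is not a prefix of any method name, so argument
# matching is expected to fail; capture the message instead of stopping.
lower <- tryCatch(p.adjust(p, method = "bh"),
                  error = function(e) conditionMessage(e))

print(upper)
print(lower)
```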

>     GS> 5. p.adjust() now works columnwise on numeric
>     GS> data.frames (as does cumsum and friends).
>
> well, "cumsum and friends" are either generic or groupgeneric
> (for the "Math" group) -- there's a Math.data.frame group
> method.
> This is quite different for p.adjust which is not generic and
> I'm not (yet?) convinced it should become so.
>
> People can easily use sapply(d.frame, p.adjust, method) if needed;
>
> In any case it's not in the spirit of R's OO programming to
> special case "data.frame" inside a function such as p.adjust

I'm happy with whatever you think is most in the spirit of R.  My reasoning was that p.adjust()
and cumsum() are both operators on R^n (Euclidean space of n-tuples of real numbers) to R^n, and
all such operators should behave in the same way as far as possible.  If you want to argue for a
consistent OO programming style, shouldn't every function be generic?
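
The two behaviours under discussion can be compared side by side. This is only a sketch with a made-up data frame d.frame: cumsum() works columnwise through the "Math" group method for data frames, while p.adjust() is applied columnwise with sapply(), as suggested above.

```r
# Made-up numeric data frame of p-values, for illustration only.
d.frame <- data.frame(a = c(0.01, 0.04, 0.03),
                      b = c(0.20, 0.001, 0.05))

# cumsum() dispatches on the data frame and returns a data frame.
cs <- cumsum(d.frame)

# p.adjust() is not generic, so apply it to each column explicitly;
# sapply() returns a numeric matrix with one column per variable.
adj <- sapply(d.frame, p.adjust, method = "holm")

print(cs)
print(adj)
```

Note that the sapply() result is a matrix rather than a data frame; wrapping it in as.data.frame() would restore the original class if that matters.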

> I'm not sure yet whether it would be worth allowing other NA
> treatments, like the "treat as if 1" {which my code proposition
> was basically doing} or a rather more sophisticated procedure like
> "integrating" over all P ~ U[0,1] marginals for each missing
> value, approximating the integral possibly by "Monte-Carlo" or
> even quasi-random numbers.

Don't forget that "strong control" of FWER implies control over all combinations of TRUE/FALSE for
the null hypotheses.  So you can't assume that all the null hypotheses for the NAs are TRUE and
hence that the corresponding p-values should be uniformly distributed.  One might possibly use it
as a conservative assumption.
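
As a rough illustration of two of the NA treatments mentioned, the sketch below compares dropping the NAs before adjustment with the "treat as if 1" approach. The p-values are made up, and both treatments are written out by hand rather than relying on how p.adjust() itself handles NAs. With Holm's method, counting the NAs as tests with p = 1 can only increase the adjusted values of the observed p-values.

```r
# Made-up p-values with two missing entries, for illustration only.
p <- c(0.01, 0.02, NA, 0.04, NA)
obs <- !is.na(p)

# Treatment 1: drop the NAs, adjust only the observed p-values.
adj_drop <- rep(NA_real_, length(p))
adj_drop[obs] <- p.adjust(p[obs], method = "holm")

# Treatment 2: treat each NA as if p = 1, adjust all of them,
# then blank out the positions that were originally missing.
p_one <- ifelse(obs, p, 1)
adj_one <- p.adjust(p_one, method = "holm")
adj_one[!obs] <- NA

print(rbind(drop = adj_drop, as_one = adj_one))
```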

Gordon