[BioC] Odd contrast; does it make statistical sense?

Gordon K Smyth smyth at wehi.EDU.AU
Sun Jan 26 02:06:04 CET 2014


On Fri, 24 Jan 2014, Aaron Mackey wrote:

> As always, thanks Gordon for keeping the conversation focused. I guess I 
> was responding to Ryan's statement that the only hypothesis to be tested 
> was a difference between the two main groups,

It is quite clear, from your and Ryan's comments, that this difference is 
not the only scientific question to be answered, and so it cannot be the 
only hypothesis to be tested.

> so the additional modeling of subgroups seems only to reduce the overall 
> residual variance, possibly leading to inflated significance between 
> groups.

I strongly disagree, as I have already told you.  Modeling of subgroups 
that have a strong effect on the outcome is always good science.

> re: "carefully interpreted", would you suggest in such situations that we
> F-test the full model (four subgroups) vs. the nested model (two groups),
> and only perform group-wise comparisons for cases where the nested model
> could not be rejected (i.e. no evidence of interaction effects?)  That
> would seem to fit the ANOVA interpretation, and the usual concerns about
> testing for main effects when interactions may be present.

Would I be willing to give a blunderbuss recommendation that you should 
apply in all situations, regardless of the nature of the groups or the 
scientific questions at issue?  No I wouldn't.

I have been trying to prompt you to clarify what your scientific questions 
actually are.  Once you do so, the appropriate statistical procedure will 
be readily apparent.

I seem to have answered the same question three times now, without getting 
the message across.  I will make one more attempt, but I will reply to 
Ryan's original post, not to this email, because much of Ryan's original 
question and my response has been deleted from the thread below.

Gordon

> Thanks again,
> -Aaron
>
>
> On Fri, Jan 24, 2014 at 1:33 AM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>
>> On Fri, 24 Jan 2014, Aaron Mackey wrote:
>>
>>> On Thu, Jan 23, 2014 at 6:48 PM, Gordon K Smyth <smyth at wehi.edu.au>
>>> wrote:
>>>
>>>>> My worry is that with this contrast, I'm effectively just testing 
>>>>> two groups against each other, and by having 4 groups in the design 
>>>>> I will be estimating dispersions that are not appropriate for the 
>>>>> test that I'm doing, and hence I will overstate my confidence.
>>>>>
>>>>>
>>>> The dispersions remain unchanged regardless of the contrast you test. The
>>>> dispersions have been estimated after removing all differences between
>>>> the four groups, i.e., without bias.
>>>>
>>>>
>>> But had he more simply coded the samples as belonging to two groups, AB
>>> and CD, then the dispersions could be larger, and the AB vs. CD mean
>>> differences could be less significant than in his four-subgroup design. I
>>> think that was the intent of Ryan's question.
>>>
>>
>> Coding that doesn't reflect the true experimental design is likely to
>> perform badly, and give less significance.  That doesn't make it more
>> correct.
>>
>>
>>> Is it fair to stratify along a priori expected subgroupings to minimize
>>> variance and then ask group-level questions?
>>>
>>
>> You are not asking a well posed question.  For one thing, "fair" and
>> "unfair" are unhelpful concepts.  The only consideration is whether the
>> statistical test that is done answers the scientific question being
>> answered.  You haven't explained what scientific question you want to
>> answer, so there is no basis for choosing a statistical test.
>>
>> Fitting a model that matches the experimental conditions and then making
>> comparisons between groups has been the anova method since anova was first
>> invented.  It answers what it answers, as I explained in my response to
>> Ryan.
>>
>> Gordon
>>
>>> As you later say, if there is "lots of DE" between
>>> A vs. B and/or C vs. D (read: larger group-wise variance) then the test
>>> may need to be "carefully interpreted".
>>>
>>> Thanks for your insight,
>>> -Aaron
