[BioC] GSEAbase and limma

Tue Nov 24 23:10:27 CET 2009

Hi Javier,

It's ok, as long as you repeat the whole eBayes procedure for each 
permutation.  The smoothed standard errors are statistically independent 
of the moderated t-statistics, hence independent of your category 
inference.
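
As a rough illustration of "repeat the whole procedure for each permutation", here is a minimal Python sketch. It is not limma: the variance shrinkage below uses a fixed prior degrees of freedom (d0) and takes the prior variance as the mean gene-wise variance, both of which are simplifying assumptions, and the function names are hypothetical. The point it shows is that the prior is re-estimated from all genes inside every permutation, so the permuted statistics are computed exactly as the observed ones were.

```python
import numpy as np

def moderated_t(expr, groups, d0=4.0):
    """Two-group moderated t-statistics with a simplified variance shrinkage.

    expr   : genes x samples array
    groups : boolean vector, True marks the second group
    d0     : prior degrees of freedom (fixed here; an assumption,
             limma estimates this from the data)
    """
    groups = np.asarray(groups, dtype=bool)
    a, b = expr[:, ~groups], expr[:, groups]
    n1, n2 = a.shape[1], b.shape[1]
    d = n1 + n2 - 2
    # Pooled gene-wise residual variance
    s_sq = ((n1 - 1) * a.var(axis=1, ddof=1)
            + (n2 - 1) * b.var(axis=1, ddof=1)) / d
    # Stand-in prior variance, re-estimated from ALL genes on every call
    s0_sq = s_sq.mean()
    # Shrink each gene's variance towards the prior
    s_tilde_sq = (d0 * s0_sq + d * s_sq) / (d0 + d)
    return (b.mean(axis=1) - a.mean(axis=1)) / np.sqrt(s_tilde_sq * (1 / n1 + 1 / n2))

def gene_set_perm_pvalue(expr, groups, set_idx, n_perm=999, seed=0):
    """Permutation p-value for the mean |moderated t| over one gene set."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups, dtype=bool)
    observed = np.abs(moderated_t(expr, groups)[set_idx]).mean()
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(groups)  # shuffle the sample labels
        # The whole moderated-t computation is redone for the permuted labels
        stat = np.abs(moderated_t(expr, perm)[set_idx]).mean()
        hits += int(stat >= observed)
    return (hits + 1) / (n_perm + 1)
```

With limma itself, the analogous loop would refit the linear model and rerun eBayes on each permuted design rather than reuse the variance estimates from the original fit.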

You might also consider the roast() and romer() functions which use the 
empirical Bayes statistics explicitly.

Best wishes
Gordon

> Date: Tue, 24 Nov 2009 10:16:46 +0100
> From: Javier Pérez Florido <jpflorido at gmail.com>
> Subject: Re: [BioC] GSEAbase and limma
> To: Sunny Srivastava <research.baba at gmail.com>
> Cc: bioconductor at stat.math.ethz.ch
> Message-ID: <4B0BA47E.8010103 at gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Dear Sunny,
> Thanks for your reply regarding the use of parametric/nonparametric
> statistical tests.
> What I meant to ask about is the use of a "global" parametric test such
> as limma in the context of Gene Set Enrichment, which is useful for
> finding biological themes in gene sets. My question is whether limma is
> suitable when building groups of genes, since the eBayes function employs
> information from ALL genes rather than from individual genes.... :-)
>
> Javier
>
>
> Sunny Srivastava wrote:
>> Dear Javier,
>> I am pretty sure more experienced members will have a lot more, and
>> deeper, things to say about your question.
>>
>> Here are my 25 cents:
>> Model-based statistics (such as the moderated t-statistic) and
>> permutation tests are two different flavors of testing the null
>> hypothesis. Comparing these two flavors, in my view, is equivalent to
>> comparing apples and oranges.
>>
>> Each of these methods has its own advantages. If the model suits the
>> data well, the moderated/unmoderated t-statistic should be preferred. If
>> you have no idea what the model is, and/or you are not sure whether the
>> model assumptions hold for the data, then a permutation test would be a
>> "wiser" (but not necessarily better) choice.
>>
>> A lot more can be said about the above, but in short: a permutation test
>> is always available, yet it might not give better results than your
>> model-based test statistic would (the t-test is quite robust to
>> violations of its assumptions).
>>
>> This should apply to your example as well. You are allowed to use the
>> moderated t-statistic.
>>
>> Please correct me if I am wrong. I am also learning my statistics :-)
>>
>> Thanks and Best Regards,
>> S.
>>
>> 2009/11/23 Javier Pérez Florido <jpflorido at gmail.com>
>>
>>     Dear list,
>>     I'm new to using GSEAbase and I've seen some examples given in the
>>     "Bioconductor case studies" book. A data example is given according to
>>     the following steps:
>>
>>        * Nonspecific filtering on expression data object.
>>        * Building the GeneSetCollection using KEGG (for example).
>>        * Computing the per-gene test statistics using a t-test.
>>        * Use of a permutation test to assess which genes have an unusually
>>          large absolute value of the distribution.
>>
>>     My question is: can we use any kind of statistic? For example, the
>>     moderated t-statistic from limma? I know that limma uses the eBayes
>>     function, which employs information from all genes to arrive at more
>>     stable estimates of each individual gene's variance, and I don't know
>>     whether, in the GSEA context, it is correct to use this moderated
>>     statistic, which takes into account all the genes (it is not like the
>>     "standard" per-gene t-statistic).
>>
>>     Thanks,
>>     Javier