[BioC] Normalized microarray data and meta-analysis

Thu Dec 18 18:20:13 CET 2008

Dear Tom,

That is an interesting point (and an interesting paper), but in my view the main aim of a meta-analysis is not to find the concordance between the individual studies but to summarize them in such a way that you gain higher power/sensitivity than any individual study has on its own. For example, you could get 100% concordance between studies without using any statistics at all, simply by listing the genes in alphabetical order. If you take, say, the top 100 of that list for each study, you will get the same genes each time, but unfortunately most of them will be false positives.
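
To make the alphabetical-order argument concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the gene names, the total of 1000 genes, and the assumption that 50 randomly chosen genes are truly differentially expressed.

```python
import random

def top_k_overlap(list_a, list_b, k):
    """Fraction of genes shared by the top-k entries of two ranked lists."""
    return len(set(list_a[:k]) & set(list_b[:k])) / k

random.seed(1)
genes = [f"gene{i:04d}" for i in range(1000)]   # hypothetical gene names
truly_changed = set(random.sample(genes, 50))   # hypothetical ground truth

# Both "studies" rank the genes alphabetically and ignore the data entirely.
study1 = sorted(genes)
study2 = sorted(genes)

k = 100
concordance = top_k_overlap(study1, study2, k)
true_hits = len(set(study1[:k]) & truly_changed)
false_positive_share = 1 - true_hits / k

print("concordance of the two top-100 lists:", concordance)
print("share of false positives in the top 100:", false_positive_share)
```

The concordance is 1.0 by construction. Because only 50 genes are truly changed in this toy setup, at least half of any top-100 list must be false positives, and a ranking that ignores the data will typically do much worse than that.
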
Also, that BMC Bioinformatics paper doesn't suggest abandoning p-values completely, but rather using them as an additional filter on a gene list ranked by fold change. So to apply that advice in a meta-analysis, one would have to find some way of coming up with both an overall fold change and an overall p-value for each gene. And the original question remains: does one need the raw data for that, or is it good enough to have the normalized data, or even just summary statistics such as the average fold change and p-value for each gene in each study? (My very short answer would be: you do not necessarily need the raw data, but they might help!)
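
As a rough sketch of what working from summary statistics alone could look like, the Python snippet below combines per-study values for a single gene. The choices here are my assumptions and are not taken from the paper: Fisher's method for the overall p-value, a sample-size-weighted mean of the log2 fold changes for the overall fold change, and entirely hypothetical numbers. Note that Fisher's method applied to two-sided p-values ignores the direction of the effect, so in practice one would also want to check that the fold changes agree in sign.

```python
import math

def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k df under the null.

    Assumes independent studies and every p-value strictly greater than 0.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # test statistic / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

def weighted_mean_log_fc(log_fcs, weights):
    """Weighted average of per-study log fold changes."""
    return sum(w * x for w, x in zip(weights, log_fcs)) / sum(weights)

# Hypothetical summary statistics for one gene in three studies.
studies = [
    {"log2_fc": 1.2, "p": 0.04, "n": 10},
    {"log2_fc": 0.8, "p": 0.01, "n": 20},
    {"log2_fc": 1.0, "p": 0.20, "n": 10},
]

overall_fc = weighted_mean_log_fc(
    [s["log2_fc"] for s in studies],
    [s["n"] for s in studies],   # weighting by sample size is an assumption
)
overall_p = fisher_combined_p([s["p"] for s in studies])

print("overall log2 fold change:", overall_fc)
print("overall p-value (Fisher):", overall_p)
```

If per-gene standard errors were available as well, inverse-variance weights would be a more usual choice than sample sizes; with the raw data one could go further and fit a single model across all studies.
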

Best Wishes

Claus

P.S.: Apologies for misusing the list for a discussion that is not strictly about Bioconductor software, but I guess that meta-analysis is an issue that many here might be interested in.

> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-
> bounces at stat.math.ethz.ch] On Behalf Of Thomas Hampton
> Sent: 18 December 2008 02:57
> To: Paul Leo
> Cc: bioconductor at stat.math.ethz.ch; Mcmahon, Kevin
> Subject: Re: [BioC] Normalized microarray data and meta-analysis
>
> I feel that p-values, corrected or otherwise, may be unsatisfactory for
> detecting concordance between experiments. For example, an experiment
> with
> higher N will show lower p-values for the same gene, even under
> conditions that are otherwise precisely the same. So we can't compare
> p values head to head across multiple experiments directly. Simple
> simulations show
> that straight fold change can be more predictive of future behavior
> (say, in
> somebody else's study) than statistics which place a high premium on
> within-group consistency.
>
> Check this out:
>
> BMC Bioinformatics. 2008; 9(Suppl 9): S10.
> Published online 2008 August 12. doi: 10.1186/1471-2105-9-S9-S10.
> PMCID: PMC2537561
> Copyright (c) 2008 Shi et al; licensee BioMed Central Ltd.
>
> The balance of reproducibility, sensitivity, and specificity of lists
> of differentially expressed genes in microarray studies
>
>
> Cheers
>
> Tom
>
>
