[BioC] Normalized microarray data and meta-analysis
Mayer, Claus-Dieter
c.mayer at abdn.ac.uk
Thu Dec 18 00:42:35 CET 2008
Dear Kevin,
that is a difficult question indeed. I am not sure what type of microarrays we are talking about here, but if they are Affy arrays, then normalisation methods like RMA or GCRMA perform an "across array" normalisation step, i.e. the normalised data from the same study will be more similar to each other than to those from different studies. So for better comparability across studies it seems better to normalise the raw arrays from all studies together.
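To illustrate the "across array" step, here is a minimal sketch of quantile normalisation, which is the across-array step used by RMA (RMA itself also does background correction and probe-set summarisation, which are left out here). The function name and the toy intensities are made up for illustration, and ties are handled naively.

```python
def quantile_normalise(arrays):
    """Force every array onto the same empirical distribution.

    `arrays` is a list of equal-length lists of intensities, one list
    per array (hypothetical toy input, not a real Affy data structure).
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Reference distribution: mean of the r-th smallest value across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]

    # Replace each value by the reference value of the same rank.
    normalised = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalised.append(out)
    return normalised


# Toy data: two arrays per study, three probes each (invented numbers).
study_a = [[2.0, 4.0, 6.0], [3.0, 5.0, 7.0]]
study_b = [[10.0, 30.0, 20.0], [12.0, 28.0, 22.0]]

# Normalised per study: each study ends up on its own distribution.
separate = quantile_normalise(study_a) + quantile_normalise(study_b)

# Normalised together: all four arrays share one distribution.
joint = quantile_normalise(study_a + study_b)
```

In `separate`, the arrays of study A and study B still sit on different scales, whereas in `joint` every array has exactly the same set of values. That is only a statement about the marginal distributions; it does not by itself remove study effects.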
Having said that, even if you are able to do this you will typically find that the data from the different studies cluster together, i.e. the normalisation is not able to remove all the differences between studies. So any proper meta-analysis must somehow take this study effect into account (and there is a growing amount of literature on how to do that).
The importance of having the raw data depends on which approach you take: if you use a p-value combination approach like Stouffer's method, for example, it shouldn't matter much, but if you try to put all the data into one big analysis it might very well matter.
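As a rough sketch of what a p-value combination looks like, Stouffer's method turns each study's p-value for a gene into a z-score, sums them, and rescales by the square root of the number of studies. The code below assumes one-sided p-values tested in the same direction in every study and independent studies; the function name and the optional weights are illustrative choices (the square root of each study's sample size is a common weight).

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def stouffer_combine(p_values, weights=None):
    """Combine one-sided p-values for a single gene across studies.

    Returns (combined z-score, combined one-sided p-value).
    Each p-value must lie strictly between 0 and 1.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(p_values)  # unweighted Stouffer
    if len(weights) != len(p_values):
        raise ValueError("need one weight per p-value")

    # Small p-values map to large positive z-scores.
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]

    # The weighted sum of z-scores is rescaled to unit variance under the null.
    z = sum(w * zi for w, zi in zip(weights, z_scores))
    z /= sqrt(sum(w * w for w in weights))

    return z, 1.0 - _STD_NORMAL.cdf(z)


# One gene, three studies, each only borderline on its own.
z, p = stouffer_combine([0.05, 0.05, 0.05])
```

Because only the per-study p-values enter the calculation, each study can be analysed on its own normalised data, which is why the raw data matter less for this kind of approach.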
Best Wishes
Claus
Hello Bioconductor-inos,
I have more of a statistical/philosophical question regarding using raw
vs. normalized data in a microarray meta-analysis. I've looked through
the bioconductor archives and have found some addressing of this issue,
but not exactly what I'm concerned with. I don't mean to waste anyone's
time, but I was hoping I could get some help here.
I've performed a meta-analysis using the downloaded data from 3
different GEO data sets (GDS). It is my understanding that these are
normalized data from the various microarray experiments. It seems to me
that if the data from those normalized results are normally distributed,
those three experiments are perfectly comparable (if you think the
authors' respective normalization approaches were reasonable). All you
need to do is calculate some sort of effect size/determine a
p-value/etc. for all genes in the experimental conditions of interest
and then combine these statistics across the different experiments.
However, I consistently read things like "raw data are required for a
microarray meta-analysis." Does this mean that normalized data are not
directly comparable with each other? If so, then why does GEO even host
such data?
Any help would be wonderful!
Wyatt
K. Wyatt McMahon, Ph.D.
Texas Tech University Health Sciences Center
Department of Internal Medicine
3601 4th St.
Lubbock, TX - 79430
806-743-4072
"It's been a good year in the lab when three things work. . . and one of
those is the lights." - Tom Maniatis