[BioC] Best way to normalize GEO gene expression datasets from different labs/sources?

Matthew McCall mccallm at gmail.com
Tue Feb 14 23:17:48 CET 2012


Ying,

You might consider fRMA:
McCall MN, Bolstad BM, and Irizarry RA (2010). Frozen Robust
Multi-Array Analysis (fRMA). Biostatistics, 11(2):242-253.
http://bioconductor.org/packages/release/bioc/html/frma.html

This preprocessing algorithm was designed to handle such multi-batch analyses.
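
To illustrate why a "frozen" approach suits data from many labs, here is a minimal Python sketch of one ingredient only: normalizing each array against a fixed, precomputed reference distribution instead of against the other arrays in the same batch. It is not the frma implementation (which is an R/Bioconductor package and also uses frozen probe-level estimates during summarization). The function name, the reference values, and the toy intensities are all hypothetical and chosen purely for illustration.

```python
def normalize_to_frozen_reference(intensities, reference):
    """Map one array's intensities onto a fixed reference distribution by rank.

    Hypothetical helper for illustration; `reference` stands in for a
    distribution estimated once from a large training set and then frozen.
    """
    if len(intensities) != len(reference):
        raise ValueError("array and reference must have the same number of probes")
    ref_sorted = sorted(reference)
    # Order probe indices from lowest to highest intensity
    # (ties are broken by probe position because sorted() is stable).
    order = sorted(range(len(intensities)), key=lambda i: intensities[i])
    normalized = [0.0] * len(intensities)
    for rank, probe_index in enumerate(order):
        # The probe with the k-th smallest intensity gets the k-th smallest
        # reference value.
        normalized[probe_index] = ref_sorted[rank]
    return normalized


# Assumed frozen reference (log2-like scale); the values are made up.
FROZEN_REFERENCE = [5.0, 6.0, 7.5, 9.0, 11.0]

# Two toy arrays from different "labs": same probe ordering, very different scale.
lab_a = [120.0, 80.0, 300.0, 95.0, 150.0]
lab_b = [1300.0, 700.0, 2900.0, 1000.0, 1600.0]

# Each array is normalized on its own; neither needs to see the other.
norm_a = normalize_to_frozen_reference(lab_a, FROZEN_REFERENCE)
norm_b = normalize_to_frozen_reference(lab_b, FROZEN_REFERENCE)

print(norm_a)
print(norm_b)
```

Because the reference is fixed in advance, arrays can be processed one at a time or dataset by dataset and then combined, and adding a new dataset later does not change the values already computed for the earlier ones.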

Best,
Matt

On Tue, Feb 14, 2012 at 4:49 PM, ying chen <ying_chen at live.com> wrote:
>
>
> Hi, I collected dozens of breast cancer GEO datasets (same platform, Affy U133Plus2) and wonder whether there is a way to normalize them so that I can compare gene expression levels across all the datasets even though they come from different labs. I am thinking of either running RMA on all the datasets together and then applying SVA to correct for batch effects, or running RMA dataset by dataset followed by mean-scaling. Does either of these make sense? Or what is the best approach? Any suggestions? Thanks a lot for the help! Ying





