[BioC] simultaneous use of robust and weighting methods in limma
Gordon K Smyth
smyth at wehi.EDU.AU
Fri Dec 13 05:08:22 CET 2013
Hi Richard,

In principle, the three methods can be used in any combination, but how
well the combinations work awaits careful testing. I would personally be
reluctant to use lmFit(method="robust") with the other methods, just
because I don't trust the variance estimators from the MM regression that
much.

lmFit(method="robust") is designed to deal with individual expression
values as outliers. arrayWeights() is designed to deal with outlier
arrays. eBayes(robust=TRUE) is designed to deal with outlier
(hypervariable) genes. So the first is observation based, the second is
array based, and the third is gene based. Rather than trying all
combinations, I would be guided by the scientific context and by which
type of aberration seems most likely. Outlier arrays typically arise when
RNA samples vary markedly in quality, which is common in human clinical
studies where RNA is hard to get. Outlier genes typically arise when a
minority of genes are affected by a hidden covariate or batch effect.
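
A minimal sketch of the three calls, one per type of outlier, may make the
distinction concrete. The simulated matrix `y`, the two-group `design`, and
the object names are hypothetical and not from the original thread; only
the limma functions themselves are real.

```r
library(limma)

# Hypothetical data: 200 genes x 6 arrays of log-expression values,
# two groups of three arrays each (pure noise, for illustration only).
set.seed(1)
y <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# Observation based: robust regression downweights individual
# outlying expression values within each gene-wise fit.
fit.obs <- lmFit(y, design, method = "robust")

# Array based: estimate one relative quality weight per array,
# then pass the weights to an ordinary least squares fit.
aw <- arrayWeights(y, design)
fit.arr <- lmFit(y, design, weights = aw)

# Gene based: robust empirical Bayes limits the influence of
# hypervariable genes on the estimated prior variance.
fit.gene <- eBayes(lmFit(y, design), robust = TRUE)
```

With so few arrays per gene, the robust regression step may emit
convergence warnings; these are warnings rather than errors.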

lmFit(method="robust") has been in limma since the earliest days, but it
hasn't been used much in practice. This may be because microarrays
have a limited dynamic range and so don't tend to show dramatic
single-observation outliers. (RNA-seq may prove to be different.) Or it
might be because the least squares approach on the log-scale is pretty
robust anyway.

Most people are probably familiar with robust methods as a way to add
protection against outliers, but array or gene outliers tend to produce
conservative results in the limma pipeline anyway. The major purpose of
arrayWeights() and eBayes(robust=TRUE) is to recover statistical power in
the presence of poor data, without having to make ad hoc judgements about
which poorer quality arrays or probes to remove from the analysis.
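
As an illustration of keeping a poor array in the analysis rather than
discarding it, the following sketch combines arrayWeights() with
eBayes(robust=TRUE). Everything about the data is an assumption made for
the example: the simulated matrix, the 25 differentially expressed genes,
and the choice of array 8 as the noisy one.

```r
library(limma)

# Hypothetical experiment: 500 genes, 4 control and 4 treated arrays.
set.seed(2)
ngenes <- 500
group <- factor(rep(c("ctrl", "trt"), each = 4))
design <- model.matrix(~ group)

y <- matrix(rnorm(ngenes * 8, sd = 0.3), nrow = ngenes, ncol = 8)
y[1:25, 5:8] <- y[1:25, 5:8] + 1          # first 25 genes are truly changed
y[, 8] <- y[, 8] + rnorm(ngenes, sd = 1)  # array 8 is of poor quality

# Estimate array quality weights; the noisy array should get a low weight
# and so contribute less to the fit, without being removed by hand.
aw <- arrayWeights(y, design)

# Weighted linear model fit followed by robust empirical Bayes moderation.
fit <- lmFit(y, design, weights = aw)
fit <- eBayes(fit, robust = TRUE)

# Top-ranked genes for the treatment effect (second coefficient).
top <- topTable(fit, coef = 2, number = 10)
```

Printing `aw` shows the relative weight given to each array, which is a
quick way to see which arrays the analysis has effectively discounted.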

Best wishes
Gordon

> Date: Wed, 11 Dec 2013 13:14:24 -0500
> From: Richard Friedman <friedman at c2b2.columbia.edu>
> To: "bioconductor at r-project.org list" <bioconductor at r-project.org>
> Subject: [BioC] simultaneous use of robust and weighting methods in
> limma.
>
> Dear List,
>
> Should arrayWeights, eBayes(robust=TRUE), and lmFit(...,method="robust")
> be used simultaneously in limma? If not, should any combination be used
> together?
>
> Thanks and best wishes,
> Rich
> Richard A. Friedman, PhD
> Associate Research Scientist,
> Biomedical Informatics Shared Resource
> Herbert Irving Comprehensive Cancer Center (HICCC)
> Lecturer,
> Department of Biomedical Informatics (DBMI)
> Educational Coordinator,
> Center for Computational Biology and Bioinformatics (C2B2)/
> National Center for Multiscale Analysis of Genomic Networks (MAGNet)/
> Columbia Department of Systems Biology
> Room 824
> Irving Cancer Research Center
> Columbia University
> 1130 St. Nicholas Ave
> New York, NY 10032
> (212)851-4765 (voice)
> friedman at c2b2.columbia.edu
> http://friedman.c2b2.columbia.edu/
>
> In memoriam, Frederik Pohl