[BioC] objective criterion for identification of outlying arrays by pca
Kevin R. Coombes
kevin.r.coombes at gmail.com
Wed Nov 2 16:12:45 CET 2011
The squared Mahalanobis distance (also known as Hotelling's T^2
statistic) from the center of a D-dimensional principal component space
(under some sensible null hypothesis) should follow a chi-squared
distribution with
D degrees of freedom. You can thus conduct a test for outliers based on
the p-value associated with the chi-squared statistic. (We used this
idea for QC in a serum proteomics study a long time ago: Coombes et al.,
Clin Chem 2003; 49:1615-23.)
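
A minimal sketch of such a test, in Python with only the standard library.
It assumes you already have each array's scores on the first D principal
components, from a PCA run on those same arrays. Because the components are
then uncorrelated, the squared distance reduces to a sum of squared scores,
each divided by that component's variance. The names chi2_sf and
pc_outlier_pvalues, the toy scores, and the 0.01 cutoff are all hypothetical,
chosen for illustration. With few arrays, and variances estimated from the
same data, the chi-squared reference is only approximate.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-squared with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum over half-integer gamma values
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def pc_outlier_pvalues(scores):
    """Return (T2, p-value) per array.

    scores: one row per array, holding that array's first D PC scores.
    Assumes the columns are uncorrelated, as PC scores are.
    """
    n, d = len(scores), len(scores[0])
    t2 = [0.0] * n
    for j in range(d):
        col = [row[j] for row in scores]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)  # sample variance
        for i, v in enumerate(col):
            t2[i] += (v - mean) ** 2 / var
    return [(t, chi2_sf(t, d)) for t in t2]


# Toy data (not real arrays): 19 points on a unit circle plus one distant point.
scores = [(math.cos(2 * math.pi * i / 19), math.sin(2 * math.pi * i / 19))
          for i in range(19)]
scores.append((5.0, 0.0))

results = pc_outlier_pvalues(scores)
flagged = [i for i, (t, p) in enumerate(results) if p < 0.01]
print(flagged)
```

The cutoff is a judgment call. If you test many arrays, you may want to
adjust it for multiple comparisons.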
Kevin
On 11/2/2011 9:11 AM, James W. MacDonald wrote:
> Hi Rich,
>
> On 11/2/2011 10:04 AM, Richard Friedman wrote:
>> Dear Bioconductor List,
>>
>> Does anyone know of an objective criterion for the identification
>> of outlying arrays
>> by pca?
>
> I don't know an objective criterion for this. However, unless the
> 'outlier' is ridiculously bad, you might be better off using array
> weights to down-weight the offending array(s). In limma, the
> arrayWeights() and arrayWeightsSimple() functions allow you to
> generate weights that you can then feed into lmFit().
>
> Best,
>
> Jim
>
>
>>
>> I usually do this subjectively. However the experimental
>> investigator whom I am helping
>> has a different subjective sense than I do, so that I wonder if there
>> is a hard-and-fast criterion.
>>
>> Thanks and best wishes,
>> Rich
>> ------------------------------------------------------------
>> Richard A. Friedman, PhD
>> Associate Research Scientist,
>> Biomedical Informatics Shared Resource
>> Herbert Irving Comprehensive Cancer Center (HICCC)
>> Lecturer,
>> Department of Biomedical Informatics (DBMI)
>> Educational Coordinator,
>> Center for Computational Biology and Bioinformatics (C2B2)/
>> National Center for Multiscale Analysis of Genomic Networks (MAGNet)
>> Room 824
>> Irving Cancer Research Center
>> Columbia University
>> 1130 St. Nicholas Ave
>> New York, NY 10032
>> (212)851-4765 (voice)
>> friedman at cancercenter.columbia.edu
>> http://cancercenter.columbia.edu/~friedman/
>>
>> I am a Bayesian. When I see a multiple-choice question on a test and
>> I don't
>> know the answer I say "eeney-meaney-miney-moe".
>>
>> Rose Friedman, Age 14
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>