[BioC] microarray outlier detection
Gordon K Smyth
smyth at wehi.EDU.AU
Sun Sep 1 03:21:32 CEST 2013
Dear jial2,
As others have pointed out, a sample should only be removed as an outlier
if it is quite clear cut, preferably with some identifiable cause.
It is however possible to allow for some samples being less reliable than
others, and it is usually preferable to do this in a graduated way rather
than complete removal. In my lab, we regularly use the arrayWeights()
function in the limma package to identify and downweight lower quality
microarray samples. The estimated weights are then passed directly to the
usual differential expression analysis, so no sample has to be discarded. See
http://www.biomedcentral.com/1471-2105/7/261/
and Chapter 14 of the limma User's Guide.
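
As a rough illustration of the workflow described above, here is a minimal
sketch on simulated data. The expression matrix, the group labels A to D, the
deliberately noisy fifth array and the contrast name BvsA are all made up for
the example; only arrayWeights(), lmFit(), makeContrasts(), contrasts.fit(),
eBayes() and topTable() are actual limma functions. With real data you would
use your own normalized log2 expression values in place of y.

```r
# Sketch only: simulated data and hypothetical object names.
library(limma)

set.seed(1)

# 4 groups x 3 replicates, mirroring the design in the original question
group  <- factor(rep(c("A", "B", "C", "D"), each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Simulated log2 expression matrix: 1000 probes x 12 arrays (assumption)
y <- matrix(rnorm(1000 * 12, mean = 8, sd = 0.3), nrow = 1000, ncol = 12)
rownames(y) <- paste0("probe", 1:1000)
colnames(y) <- paste0(group, 1:3)

# Make array 5 noisier than the rest to mimic a lower quality sample
y[, 5] <- y[, 5] + rnorm(1000, sd = 1)

# Estimate one relative quality weight per array
aw <- arrayWeights(y, design)

# Pass the array weights into the usual linear model fit
fit <- lmFit(y, design, weights = aw)

# Example comparison between two of the groups
cont <- makeContrasts(BvsA = B - A, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))

top <- topTable(fit2, coef = "BvsA")
```

Printing aw shows the estimated weight for each array; arrays with weights
well below the others are the ones being downweighted, and they stay in the
analysis rather than being removed.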
Of course, if your samples don't cluster at all, it may simply be that
your groups are just not systematically different, and no refinement to
the analysis will make them so.
Best wishes
Gordon
> Date: Fri, 30 Aug 2013 13:32:33 -0700 (PDT)
> From: "guest [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, jial2 at mail.nih.gov
> Subject: [BioC] microarray outlier detection
>
>
> Dear users,
>
> I have Human Gene 2.0 ST array data: 12 samples in 4 groups, with 3
> replicates per group. The lab person would like to remove one sample
> from each group as an outlier, but in the PCA plot the samples do not
> cluster, so it is hard to single out any sample as an outlier. Is
> there a package or function for outlier detection on microarray
> data?
>
> Thanks,
>
>
> -- output of sessionInfo():
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] pd.hugene.2.0.st_3.8.0 oligo_1.24.1 oligoClasses_1.22.0 hugene20sttranscriptcluster.db_2.12.1
> [5] org.Hs.eg.db_2.9.0 RSQLite_0.11.4 DBI_0.2-7 AnnotationDbi_1.22.6
> [9] Biobase_2.20.1 BiocGenerics_0.6.0 limma_3.16.6
>
> loaded via a namespace (and not attached):
> [1] affxparser_1.32.3 affyio_1.28.0 BiocInstaller_1.10.3 Biostrings_2.28.0 bit_1.1-10 codetools_0.2-8
> [7] ff_2.2-11 foreach_1.4.1 GenomicRanges_1.12.4 IRanges_1.18.2 iterators_1.0.6 preprocessCore_1.22.0
> [13] splines_3.0.1 stats4_3.0.1 zlibbioc_1.6.0