[BioC] outlier detection of fit.li.wong
James W. MacDonald
jmacdon at med.umich.edu
Fri Aug 10 17:36:35 CEST 2007
Hi Yi,
Yi Xing wrote:
> Hi,
>
> I am a little puzzled by the behavior of fit.li.wong function (affy
> package) in conducting outlier detection. I created a matrix
> x <- sweep(matrix(2^rnorm(600),30,20),1,seq(1,2,len=30),FUN="+")
>
> then set x[30,20] as the outlier:
> x[30,20]=9999
>
> When I ran fit.li.wong(x,outlier.detection=TRUE), x[30,20] was
> recognized as an outlier, but apparently it was NOT removed from the
> computation of theta. theta[30] is obviously affected by the single
> outlier.
I don't see an argument for outlier.detection in fit.li.wong. If you
mean remove.outliers, then I believe it is working as advertised.
Using your example:
> all.equal(fit.li.wong(x)$theta, fit.li.wong(x, remove.outliers=FALSE)$theta)
[1] "Mean relative difference: 2.632672"
> cbind(fit.li.wong(x)$theta, fit.li.wong(x, remove.outliers=FALSE)$theta)
            [,1]         [,2]
 [1,]   2.477412    0.7790772
 [2,]   2.440048    0.9618730
 [3,]   2.283680    0.3710469
 [4,]   2.004736    0.3537434
 [5,]   2.302030    0.2815720
 [6,]   2.368680    0.3209734
 [7,]   2.508436    0.6738310
 [8,]   2.426458    0.7175141
 [9,]   2.397586    0.6339105
[10,]   2.662556    0.6344126
[11,]   2.476010    0.5114757
[12,]   2.495807    0.4771915
[13,]   2.801699    0.5022871
[14,]   2.641723    0.5031644
[15,]   3.178295    0.5674871
[16,]   3.065739    0.3789646
[17,]   2.741703    0.6520351
[18,]   2.799087    0.4864749
[19,]   2.889033    0.6559938
[20,]   2.841164    0.6615991
[21,]   2.825730    0.5604023
[22,]   3.030698    0.7263582
[23,]   2.839171    0.5869337
[24,]   2.751788    1.2154618
[25,]   3.026560    0.8351068
[26,]   3.215382    0.5823551
[27,]   3.051072    0.6278876
[28,]   3.350610    0.5773556
[29,]   3.841350    2.9516553
[30,] 574.199766 2235.8471717
I think there is a difference between what you expect remove.outliers to
do and what it actually does (i.e., removing an outlier from the
computation of theta vs. removing an outlier from the dataset and
pretending it never existed).
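
If you want to look at this yourself, something along these lines might
help. It is only a rough sketch, not a statement of how fit.li.wong works
internally: the seed is arbitrary, the single.outliers component is what I
recall from the help page (check ?fit.li.wong), and dropping column 20
entirely is just my stand-in for "pretend the outlier never existed".

```r
# Reproducible version of the example; the seed is an arbitrary choice.
set.seed(1)
x <- sweep(matrix(2^rnorm(600), 30, 20), 1,
           seq(1, 2, length.out = 30), FUN = "+")
x[30, 20] <- 9999   # plant a single extreme value

# Only run the fits if the affy package is installed.
if (requireNamespace("affy", quietly = TRUE)) {
  fit_rm   <- affy::fit.li.wong(x)                           # default: remove.outliers = TRUE
  fit_keep <- affy::fit.li.wong(x, remove.outliers = FALSE)  # no outlier handling
  fit_drop <- affy::fit.li.wong(x[, -20])                    # assumption: column 20 left out entirely

  # Was the planted cell flagged? (component name as I recall it from the docs)
  print(fit_rm$single.outliers[30, 20])

  # theta for row 30 under the three treatments
  print(c(removed = fit_rm$theta[30],
          kept    = fit_keep$theta[30],
          dropped = fit_drop$theta[30]))
}
```

Comparing the three values for row 30 should show whether the default
behaves more like the "kept" fit or the "dropped" fit on your data.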
Best,
Jim
>
> I would like to know how to fix this. Any suggestion is welcome.
>
> Yi Xing
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623