[BioC] question about outlier removal
Weiwei Shi
helprhelp at gmail.com
Thu Sep 28 20:46:06 CEST 2006
On 9/27/06, James Anderson <janderson_net at yahoo.com> wrote:
> Hi,
> This may be a generic question, not necessarily related to the usage of R and Bioconductor. The question is: for a microarray experiment, suppose I have 50 normal and 50 cancer samples. I want to find sample outliers, which may arise from different sources:
> 1. Mislabelling, i.e., labelling a cancer sample as normal or a normal sample as cancer
I think this is not an outlier problem but a class noise problem.
Search for "class noise correction" or "class noise removal" instead;
there is some published work on this topic.
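
As a rough illustration of the idea (a sketch only, not a method taken from
any particular paper or package): one simple class noise filter flags a
sample when a classifier trained on all the other samples disagrees with its
given label. The Python below uses leave-one-out nearest-centroid
classification. The function names and the toy data are made up for
illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(rows):
    """Per-gene mean of a list of expression profiles."""
    return [sum(col) / len(col) for col in zip(*rows)]


def flag_suspect_labels(profiles, labels):
    """Leave-one-out nearest-centroid check (hypothetical helper).

    Returns the indices of samples whose given label differs from the label
    of the nearest class centroid computed without that sample.
    """
    suspects = []
    for i, (x, y) in enumerate(zip(profiles, labels)):
        # Build class centroids from every sample except sample i.
        by_class = {}
        for j, (row, lab) in enumerate(zip(profiles, labels)):
            if j != i:
                by_class.setdefault(lab, []).append(row)
        centroids = {lab: centroid(rows) for lab, rows in by_class.items()}
        # Predict the class whose centroid is closest to sample i.
        predicted = min(centroids, key=lambda lab: dist(x, centroids[lab]))
        if predicted != y:
            suspects.append(i)
    return suspects


# Toy data (made up): 3 genes, 4 "normal" and 4 "cancer" samples.
# Sample 3 is labelled "normal" but its profile looks like the cancer group.
profiles = [
    [1.0, 1.1, 0.9], [0.9, 1.0, 1.1], [1.1, 0.9, 1.0], [5.0, 5.1, 4.9],
    [5.1, 4.9, 5.0], [4.9, 5.0, 5.2], [5.2, 5.1, 4.8], [5.0, 4.8, 5.1],
]
labels = ["normal"] * 4 + ["cancer"] * 4

suspects = flag_suspect_labels(profiles, labels)
print(suspects)
```

On this toy data only sample 3 is flagged. A flagged sample is a candidate
for re-checking its annotation, not proof that the label is wrong.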
> 2. Misbehavior, i.e., some normal samples are actually sick or have had a heart attack, although they don't have cancer.
If your problem is a cancer vs. non-cancer one, then again, those
samples should not be removed either, IMHO.
>
> Should I do gene selection before outlier removal or not? Sometimes some samples are identified as outliers using 200 genes, while other samples are identified as outliers if I use 50 or 20 genes. Normally, in a microarray experiment, what percentage of genes is affected by treatment or abnormal conditions?
>
> Thanks,
> James
--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.
"Did you always know?"
"No, I did not. But I believed..."
---Matrix III