[BioC] Re: missing value zero imputation

S Peri biocperi at yahoo.com
Tue Aug 17 16:43:52 CEST 2004


Thank you for your suggestions, Adai. I am sorry for
not changing the subject when replying to another thread.
I will take care of that from now on.

regards, 
PS
--- Adaikalavan Ramasamy <ramasamy at cancer.org.uk> wrote:

> Please use an appropriate subject and do not simply press reply to another
> thread. See the posting guide in the footer.
> 
> Please give more information about where the missing values come from
> (flagging, failed spot criterion, computation) and what type of arrays
> you are using (affy, cDNA).
> 
> You do not necessarily need to zero impute, for the following reasons:
> 
> 
> 1) There are other, better ways of imputing. In the following paper, the
> authors showed that k-nn imputation is better than row mean and SVD
> imputation.
> 
>  Missing value estimation methods for DNA microarrays.
>  Troyanskaya O., Cantor M., Sherlock G., Brown P., Hastie T.,
>  Tibshirani R., Botstein D., Altman R.B.
>  Bioinformatics 2001; 17(6):520-5
>  PMID:11395428
> 
> There are more recent papers on missing value imputation for
> microarrays. Try a Google or PubMed search and you will find many more.
> 
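A minimal sketch of the k-nn idea from point 1, in base R. The function name knn_impute, the choice of k, the distance measure, and the toy matrix are all assumptions made for illustration; this is not the implementation from the Troyanskaya et al. paper or from any package. Each missing value is filled with the mean of that array's values in the k most similar genes.

```r
# Hypothetical helper (not from any package): k-nearest-neighbour imputation
# for a genes x arrays matrix. Similarity between two genes is the root mean
# squared difference over the arrays where both genes are observed.
knn_impute <- function(x, k = 2) {
  out <- x
  for (i in which(rowSums(is.na(x)) > 0)) {
    miss   <- is.na(x[i, ])
    others <- setdiff(seq_len(nrow(x)), i)
    # Distance from gene i to every other gene
    d <- sapply(others, function(j) {
      shared <- !miss & !is.na(x[j, ])
      if (!any(shared)) return(NA_real_)
      sqrt(mean((x[i, shared] - x[j, shared])^2))
    })
    for (col in which(miss)) {
      # Only genes with a usable distance and an observed value in this array
      ok <- !is.na(d) & !is.na(x[others, col])
      nn <- head(others[ok][order(d[ok])], k)
      if (length(nn) > 0) out[i, col] <- mean(x[nn, col])
    }
  }
  out
}

# Toy data, made up for illustration: g1 is missing its value on array 4
expr <- rbind(g1 = c(1.0, 2.0, 3.0, NA),
              g2 = c(1.1, 2.1, 3.1, 4.1),
              g3 = c(0.9, 1.9, 2.9, 3.9),
              g4 = c(9.0, 8.0, 7.0, 6.0))

imputed <- knn_impute(expr, k = 2)
imputed["g1", 4]   # about 4, the mean of the two nearest genes (g2 and g3)
```

For real data, the Bioconductor impute package provides impute.knn, which is likely a better choice than a hand-written loop like this one.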
> 
> 2) Many functions in R can deal with missing values if you set the
> argument na.rm = TRUE. Some drop missing values by default (see below),
> or you can easily write a wrapper.
> 
>  t.test( c(500,502,501, NA, NA), c(400,380,410, NA) )$p.value
>   [1] 0.006880871
> 
>  t.test( c(500,502,501, 0, 0), c(400,380,410, 0) )$p.value
>   [1] 0.9848868
> 
> This also shows a case where imputation (here, with zeros) can be
> inappropriate.
> 
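The effect of na.rm from point 2, and of replacing NA with zero as asked in the original question, can be sketched as follows. The numbers reuse the first sample from the t.test example above; the character vector with "N/A" strings is an assumed input format, and the variable names are only illustrative.

```r
# Assumed raw input: expression values read as text, with "N/A" for missing
raw <- c("500", "502", "501", "N/A", "N/A")

# Turn the "N/A" strings into real NA values before converting to numeric
x <- as.numeric(replace(raw, raw == "N/A", NA))

m_default <- mean(x)                # NA: mean() does not drop NA by default
m_narm    <- mean(x, na.rm = TRUE)  # 501: uses only the 3 observed values

# Zero imputation: replace every NA with 0
x0 <- x
x0[is.na(x0)] <- 0
m_zero <- mean(x0)                  # 300.6: dragged down by the two zeros
```

When reading from a file, passing na.strings = "N/A" to read.table should give NA values directly, so the replace() step is not needed.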
> 
> 3) Missing values can be informative, but this depends on how they were
> generated. I commonly filter out genes with more than 70% missing values
> (across arrays) to avoid spurious results. Arrays with more than, say,
> 50% missing values can be indicative of array problems.
> 
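The filtering in point 3 could look roughly like this. The 70% and 50% cutoffs are the ones mentioned above; the toy matrix, its NA pattern, and the variable names are made up for illustration.

```r
# Toy genes x arrays matrix; values and NA pattern are invented
expr <- matrix(c( 1,  2,  3,  4,
                 NA, NA, NA,  5,
                  2, NA,  3,  4,
                  1, NA, NA,  3),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("gene", 1:4), paste0("array", 1:4)))

gene_missing  <- rowMeans(is.na(expr))   # fraction missing per gene
array_missing <- colMeans(is.na(expr))   # fraction missing per array

# Drop genes with more than 70% missing values across arrays
filtered <- expr[gene_missing <= 0.70, , drop = FALSE]

# Flag arrays with more than 50% missing values for a closer look
suspect_arrays <- names(array_missing)[array_missing > 0.50]
```

Here gene2 (75% missing) is dropped and array2 (75% missing) is flagged; whether to discard a flagged array is a judgment call that depends on why the values are missing.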
> 
> In short, this depends on how many missing values you have, whether they
> are informative, and what you want to use them for. I tend to impute
> the data only if I plan on using a classification method that cannot
> handle missing values.
> 
> 
> Regards, Adai.
> 
> 
> On Tue, 2004-08-17 at 13:48, S Peri wrote:
> > Dear Group,
> >   In the expression values, if there is an N/A, do we have
> > to convert it to '0' before processing?
> > If so, how can I convert N/A to '0'?
> > 
> > Thank you. 
> > PS
> > 
> > 
> > 
> > --- Michael Hoffman <hoffman at ebi.ac.uk> wrote:
> > 
> > > On Mon, 16 Aug 2004, James MacDonald wrote:
> > > 
> > > > You can download all the packages you are interested in at
> > > > www.bioconductor.org, and then install using R CMD INSTALL.
> > > 
> > > So I have to install them one by one this way? There's no
> > > distribution of them all?
> > > 
> > > Thank you,
> > > -- 
> > > Michael Hoffman <hoffman at ebi.ac.uk>
> > > European Bioinformatics Institute
> > > 
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > >
> > 
> > 
> 
>


