[BioC] loged data or not loged previous to use normalize.quantile

Rhonda DeCook rdecook at iastate.edu
Tue Apr 5 17:51:09 CEST 2005


With respect to permutation tests...

I'm under the impression that you only need independence, not the assumption of 
constant variance.  

The permutation test provides us with a distribution of the test statistic 
under the null hypothesis (equal means in the 2-sample scenario, i.e. all data 
were generated from one distribution, even though it may be an ugly-looking 
single distribution).  As long as all 'groupings' of the data into 2 groups are 
equally likely (which is provided by the independence assumption), this 
permutation distribution of the test statistic (e.g. a t-statistic here) gives 
us an idea of the test statistic's distribution under the null without the 
assumption of normality or constant variance.  Computing a permutation p-value 
from this null distribution provides a p-value that has the usual behavior 
under the null, i.e. Uniform(0,1), though in a discrete manner.  When the 
alternative is true, the distribution of the p-value will have more mass near 
zero than the Uniform(0,1).  
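
As a rough sketch of the argument above (this is not anything from limma or 
Bioconductor; the function names t_stat and exact_permutation_pvalue are made 
up for illustration, the unpooled t-statistic is just one possible choice, and 
Python is used only to keep it short), the permutation p-value can be computed 
by enumerating every way of splitting the pooled data into the two groups and 
treating each split as equally likely:

```python
from itertools import combinations
from math import comb, sqrt
from statistics import mean, variance


def t_stat(x, y):
    """Two-sample t-statistic with unpooled variances (an illustrative choice).

    Assumes each group has at least 2 values and that the two groups are not
    both constant, so the standard error is non-zero.
    """
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def exact_permutation_pvalue(x, y, stat=t_stat):
    """Two-sided p-value from all equally likely splits of the pooled data."""
    pooled = list(x) + list(y)
    n, n_x = len(pooled), len(x)
    observed = abs(stat(x, y))
    hits = 0
    # Each choice of n_x indices is one 'grouping' of the data into 2 groups.
    for idx in combinations(range(n), n_x):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(n) if i not in chosen]
        # Small tolerance so that ties are not lost to floating-point rounding.
        if abs(stat(group_x, group_y)) >= observed - 1e-12:
            hits += 1
    return hits / comb(n, n_x)


# Made-up data: 3 replicates per group, completely separated.
p = exact_permutation_pvalue([1.1, 2.3, 1.9], [5.2, 6.8, 6.1])
print(p)
```

With 3 replicates per group there are only 20 possible groupings, so even for 
completely separated data like the made-up values here the two-sided p-value 
cannot go below 2/20 = 0.1; with 2 per group the floor is 2/6.  That is the 
"discrete manner" mentioned above.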

If this logic doesn't apply to the microarray setting, please let me know.

Rhonda





> I just want to remind people that permutation tests, rank tests, etc still 
> require i.i.d. errors.  So the variance needs to be stabilized even for 
> nonparametric tests.
> 
> --Naomi
> 
> At 01:32 PM 4/4/2005, Fangxin Hong wrote:
> >Hi Marcelo;
> >As Wolfgang mentioned, a non-parametric permutation test is an option
> >when the t-distribution assumption is not valid.  But if you have few
> >replications (2-3), most permutation tests don't have power either. I
> >would suggest you try the RankProd package, which would be powerful enough to
> >detect differentially expressed genes with 2 replications.
> >
> >Bests;
> >Fangxin
> >
> >
> >
> > > Hi Marcelo,
> > >
> > > the difference is that the power of the test you are doing can be
> > > different when you consider the data on the "raw" or on the
> > > log-transformed scale.
> > >
> > > Also, the p-value calculated by limma is based on the assumption that
> > > the null-distribution of the test statistic is given by a
> > > t-distribution; this assumption might be more or less true in both cases.
> > >
> > > You are really doing two different tests: test A, say, consists of
> > > applying the t-statistic to the untransformed intensities, test B, say,
> > > applying the t-statistic to the transformed intensities.
> > >
> > > Then, if you want to use the t-distribution for getting p-values, you
> > > need to make sure that the null distribution of your test statistic
> > > is indeed (to good enough approximation) t-distributed. You can do this
> > > e.g. by permutations. For that you need either a large number of
> > > replicates, or to pool variance estimators across genes.
> > >
> > > If you don't want to make a parametric assumption for getting p-values,
> > > you need a larger number of replicates; if you have these, you can for
> > > example calculate a permutation p-value.
> > >
> > > So, there is really no "right" or "wrong" about transforming, or which
> > > transformation -- as long as you don't violate the assumptions of the
> > > subsequent tests. If the assumptions are met, then the procedure with
> > > the highest power is preferable. And that depends very much on your data
> > > (about which you have not told us much.)
> > >
> > > Hope that helps.
> > >
> > > And here is another shameless plug: have a look at this paper:
> > > Differential Expression with the Bioconductor Project
> > > http://www.bepress.com/bioconductor/paper7
> > >
> > >    Best wishes
> > >     Wolfgang
> > >
> > > Marcelo Luiz de Laia wrote:
> > >> Dear Bioconductors Friends,
> > >>
> > >> I have a question that I haven't found an answer to. If you know of a
> > >> paper/article that explains it, please tell me.
> > >>
> > >> I normalize our data using the normalize.quantile function.
> > >>
> > >> If I first transform our intensities (single channel) to log2, I don't
> > >> get differentially expressed genes in limma.
> > >>
> > >> But if I don't transform our data, I get some genes with p.value around
> > >> 0.0001, that's great!
> > >>
> > >> Of course, when I transform the intensity data to log2, I get some NA.
> > >>
> > >> Why is there this difference? Am I wrong to do an analysis with data
> > >> that are not logged?
> > >>
> > >> Thanks a lot
> > >>
> > >> Marcelo
> > >>
> > >> _______________________________________________
> > >> Bioconductor mailing list
> > >> Bioconductor at stat.math.ethz.ch
> > >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> > >
> > >
> > > --
> > > Best regards
> > >    Wolfgang
> > >
> > > -------------------------------------
> > > Wolfgang Huber
> > > European Bioinformatics Institute
> > > European Molecular Biology Laboratory
> > > Cambridge CB10 1SD
> > > England
> > > Phone: +44 1223 494642
> > > Fax:   +44 1223 494486
> > > Http:  www.ebi.ac.uk/huber
> > >
> > >
> > >
> >
> >
> >--
> >Fangxin Hong, Ph.D.
> >Plant Biology Laboratory
> >The Salk Institute
> >10010 N. Torrey Pines Rd.
> >La Jolla, CA 92037
> >E-mail: fhong at salk.edu
> >
> 
> Naomi S. Altman                                814-865-3791 (voice)
> Associate Professor
> Bioinformatics Consulting Center
> Dept. of Statistics                              814-863-7114 (fax)
> Penn State University                         814-865-1348 (Statistics)
> University Park, PA 16802-2111
> 
>