[BioC] loged data or not loged previous to use normalize.quantile
Rhonda DeCook
rdecook at iastate.edu
Tue Apr 5 17:51:09 CEST 2005
With respect to permutation tests...
I'm under the impression that you only need independence, not the assumption of
constant variance.
The permutation test provides us with a distribution of the test statistic
under the null hypothesis (equal means in the 2-sample scenario, i.e. all data
were generated from one distribution, even though it may be an ugly-looking
single distribution). As long as all 'groupings' of the data into 2 groups are
equally likely (which is provided by the independence assumption), this
permutation distribution of the test statistic (e.g. a t-statistic here) gives
us an idea of the test statistic's distribution under the null without the
assumption of normality or constant variance. Computing a permutation p-value
from this null distribution provides a p-value that has the usual behavior
under the null, i.e. Uniform(0,1), though in a discrete manner. When the
alternative is true, the distribution of the p-value will have more mass near
zero than the Uniform(0,1).
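
To make the argument above concrete, here is a minimal sketch (in Python, standard library only) of an exact two-sample permutation test. It is only an illustration under my own assumptions: the function names t_statistic and permutation_p_value are made up for this example, the statistic is an unpooled (Welch-type) t-statistic, and every split of the pooled data into two groups is treated as equally likely under the null.

```python
from itertools import combinations
from math import copysign, inf, sqrt
from statistics import mean, variance


def t_statistic(x, y):
    """Unpooled (Welch-type) two-sample t-statistic; needs >= 2 values per group."""
    diff = mean(x) - mean(y)
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    if se == 0:
        # Degenerate split: no within-group spread at all.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def permutation_p_value(x, y):
    """Exact two-sided permutation p-value, enumerating every grouping.

    Returns (p_value, number_of_groupings).
    """
    pooled = list(x) + list(y)
    n, n_x = len(pooled), len(x)
    observed = abs(t_statistic(x, y))
    hits = total = 0
    # Each choice of n_x indices is one equally likely grouping under the null.
    for idx in combinations(range(n), n_x):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Small tolerance so that ties with the observed value are counted.
        if abs(t_statistic(group_x, group_y)) >= observed - 1e-12:
            hits += 1
    return hits / total, total


# Made-up numbers, purely for illustration.
p_value, n_groupings = permutation_p_value([1.0, 2.0, 3.0, 4.0],
                                           [6.0, 7.0, 8.0, 9.0])
print(p_value, n_groupings)
```

With four values per group there are only 70 groupings, so the p-value can only take multiples of 1/70, and with two equal-sized groups and a two-sided statistic it cannot go below 2/70. This is the "discrete manner" mentioned above. With 2 replicates per group there are only 6 groupings, so the smallest attainable two-sided p-value is 2/6.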
If this logic doesn't apply to the microarray setting, please let me know.
Rhonda
> I just want to remind people that permutation tests, rank tests, etc still
> require i.i.d. errors. So the variance needs to be stabilized even for
> nonparametric tests.
>
> --Naomi
>
> At 01:32 PM 4/4/2005, Fangxin Hong wrote:
> >Hi Marcelo;
> >As Wolfgang mentioned, a non-parametric permutation test is an option
> >when the t-distribution assumption is not valid. But if you have few
> >replications (2-3), most permutation tests don't have power either. I
> >would suggest you try the RankProd package, which would be powerful enough
> >to detect differentially expressed genes with 2 replications.
> >
> >Bests;
> >Fangxin
> >
> >
> >
> > > Hi Marcelo,
> > >
> > > the difference is that the power of the test you are doing can be
> > > different when you consider the data on the "raw" or on the
> > > log-transformed scale.
> > >
> > > Also, the p-value calculated by limma is based on the assumption that
> > > the null-distribution of the test statistic is given by a
> > > t-distribution; this assumption might be more or less true in both cases.
> > >
> > > You are really doing two different tests: test A, say, consists of
> > > applying the t-statistic to the untransformed intensities, test B, say,
> > > applying the t-statistic to the transformed intensities.
> > >
> > > Then, if you want to use the t-distribution for getting p-values, you
> > > need to make sure that the null distribution of your test statistic
> > > is indeed (to good enough approximation) t-distributed. You can do this
> > > e.g. by permutations. For that you need either a large number of
> > > replicates, or to pool variance estimators across genes.
> > >
> > > If you don't want to make a parametric assumption for getting p-values,
> > > you need a larger number of replicates; if you have these, you can for
> > > example calculate a permutation p-value.
> > >
> > > So, there is really no "right" or "wrong" about transforming, or which
> > > transformation -- as long as you don't violate the assumptions of the
> > > subsequent tests. If the assumptions are met, then the procedure with
> > > the highest power is preferable. And that depends very much on your data
> > > (about which you have not told us much.)
> > >
> > > Hope that helps.
> > >
> > > And here is another shameless plug: have a look at this paper:
> > > Differential Expression with the Bioconductor Project
> > > http://www.bepress.com/bioconductor/paper7
> > >
> > > Best wishes
> > > Wolfgang
> > >
> > > Marcelo Luiz de Laia wrote:
> > >> Dear Bioconductors Friends,
> > >>
> > >> I have a question for which I have not found an answer. If you know of a
> > >> paper/article that explains it, please tell me.
> > >>
> > >> I normalize our data using the normalize.quantile function.
> > >>
> > >> If I first transform our intensities (single channel) to log2, I don't
> > >> get any differentially expressed genes in limma.
> > >>
> > >> But if I don't transform our data, I get some genes with p.value around
> > >> 0.0001, which is great!
> > >>
> > >> Of course, when I transform the intensity data to log2, I get some NA.
> > >>
> > >> Why is there this difference? Am I wrong to do the analysis with data
> > >> that are not log-transformed?
> > >>
> > >> Thanks a lot
> > >>
> > >> Marcelo
> > >>
> > >> _______________________________________________
> > >> Bioconductor mailing list
> > >> Bioconductor at stat.math.ethz.ch
> > >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> > >
> > >
> > > --
> > > Best regards
> > > Wolfgang
> > >
> > > -------------------------------------
> > > Wolfgang Huber
> > > European Bioinformatics Institute
> > > European Molecular Biology Laboratory
> > > Cambridge CB10 1SD
> > > England
> > > Phone: +44 1223 494642
> > > Fax: +44 1223 494486
> > > Http: www.ebi.ac.uk/huber
> > >
> > >
> > >
> >
> >
> >--
> >Fangxin Hong, Ph.D.
> >Plant Biology Laboratory
> >The Salk Institute
> >10010 N. Torrey Pines Rd.
> >La Jolla, CA 92037
> >E-mail: fhong at salk.edu
> >
>
> Naomi S. Altman 814-865-3791 (voice)
> Associate Professor
> Bioinformatics Consulting Center
> Dept. of Statistics 814-863-7114 (fax)
> Penn State University 814-865-1348 (Statistics)
> University Park, PA 16802-2111
>
>