[BioC] loged data or not loged previous to use normalize.quantile
Wolfgang Huber
huber at ebi.ac.uk
Fri Apr 1 23:06:41 CEST 2005
Hi Marcelo,
the difference arises because the power of the test you are doing can
differ depending on whether you consider the data on the "raw" scale or
on the log-transformed scale.
Also, the p-value calculated by limma is based on the assumption that
the null distribution of the test statistic is a t-distribution; this
assumption may hold more or less well on either scale.
You are really doing two different tests: test A, say, applies the
t-statistic to the untransformed intensities; test B, say, applies it
to the log-transformed intensities.
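To make the "two different tests" point concrete, here is a minimal sketch in Python. It uses a plain pooled-variance two-sample t-statistic, not limma's moderated statistic, and the intensities are made-up values for a single hypothetical gene. The same measurements give a different statistic on each scale.

```python
import math
import statistics


def t_statistic(x, y):
    """Two-sample Student t-statistic with pooled variance."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1 / nx + 1 / ny)
    )


# Hypothetical single-channel intensities for one gene (illustration only).
control = [120.0, 150.0, 135.0, 160.0]
treated = [300.0, 900.0, 450.0, 2000.0]

# Test A: t-statistic on the untransformed intensities.
t_raw = t_statistic(treated, control)

# Test B: t-statistic on the log2-transformed intensities.
t_log = t_statistic(
    [math.log2(v) for v in treated],
    [math.log2(v) for v in control],
)

print(f"test A (raw scale):  t = {t_raw:.3f}")
print(f"test B (log2 scale): t = {t_log:.3f}")
```

Which scale gives the larger statistic depends on the data; this example only shows that the two statistics are not the same quantity.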
Then, if you want to use the t-distribution to get p-values, you
need to make sure that the null distribution of your test statistic
is indeed (to a good enough approximation) t-distributed. You can check
this e.g. by permutations. For that you need either a large number of
replicates, or to pool variance estimators across genes.
If you don't want to make a parametric assumption for getting p-values,
you need a larger number of replicates; if you have these, you can for
example calculate a permutation p-value.
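As an illustration of a permutation p-value, the following Python sketch enumerates every way of splitting the pooled samples into two groups and counts how often the absolute t-statistic is at least as large as the observed one. The data and function names are hypothetical, and the exhaustive enumeration is only practical for small sample sizes.

```python
import itertools
import math
import statistics


def t_statistic(x, y):
    """Two-sample Student t-statistic with pooled variance."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1 / nx + 1 / ny)
    )


def permutation_p_value(x, y):
    """Two-sided exact permutation p-value over all group relabelings."""
    pooled = list(x) + list(y)
    n, nx = len(pooled), len(x)
    observed = abs(t_statistic(x, y))
    extreme = 0
    total = 0
    for idx in itertools.combinations(range(n), nx):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Small tolerance so that ties are counted despite rounding.
        if abs(t_statistic(gx, gy)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical log2 intensities for one gene, 4 replicates per group.
control = [math.log2(v) for v in (120.0, 150.0, 135.0, 160.0)]
treated = [math.log2(v) for v in (300.0, 900.0, 450.0, 2000.0)]

p = permutation_p_value(treated, control)
print(f"permutation p-value: {p:.4f}")
```

With 4 replicates per group there are only 70 distinct splits, so the smallest attainable two-sided p-value is 2/70. This is why the permutation approach needs a larger number of replicates to reach small p-values.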
So, there is really no "right" or "wrong" about whether to transform, or
which transformation to use, as long as you don't violate the
assumptions of the subsequent tests. If the assumptions are met, then
the procedure with the highest power is preferable. And that depends
very much on your data (about which you have not told us much).
Hope that helps.
And here is another shameless plug: have a look at this paper:
Differential Expression with the Bioconductor Project
http://www.bepress.com/bioconductor/paper7
Best wishes
Wolfgang
Marcelo Luiz de Laia wrote:
> Dear Bioconductor friends,
>
> I have a question that I have not found an answer to. If you know of a
> paper/article that explains it, please tell me.
>
> I normalize our data using the normalize.quantile function.
>
> If I first transform our intensities (single channel) to log2, I don't
> get any differentially expressed genes in limma.
>
> But if I don't transform our data, I get some genes with p.value around
> 0.0001, which is great!
>
> Of course, when I transform the intensity data to log2, I get some NAs.
>
> Why is there this difference? Am I wrong to do an analysis with data
> that are not logged?
>
> Thanks a lot
>
> Marcelo
--
Best regards
Wolfgang
-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Fax: +44 1223 494486
Http: www.ebi.ac.uk/huber